Compiler bug or feature?

Vogel · May 8, 2019, 1:38am

Why can I do this?:

main.swift:

class A {
    init() { }
}

b() //crashes at runtime of course, because c is uninitialized

var c = A()

some other file:

func b() {
    print(c)
}

Why is this allowed?

Jens · May 8, 2019, 4:13am

I think this old thread is related.

It includes similar (single file) examples like:

func foo() -> Int {
    return x
}
let x = foo()
print(x)

which still compiles, and prints 0.
And if you change Int to [Int] it will crash at runtime.

Here's one of the replies in that old thread:

Some possibly related bugs:

github.com/apple/swift

[SR-2730] Failure to initialize global variable

opened 05:07PM - 22 Sep 16 UTC

closed 06:55PM - 22 Sep 16 UTC

e2652f27-fdca-469d-a3e9-bf38e5bd1773

bug compiler

| | |
|------------------|-----------------|
|Previous ID | SR-2730 |
|Radar | None |
|Original Reporter | @jadengeller |
|Type | Bug |
|Status | Resolved |
|Resolution | Duplicate |

<details>
<summary>Environment</summary>

Swift 3

</details>

<details>
<summary>Additional Detail from JIRA</summary>

| | |
|------------------|-----------------|
|Votes | 0 |
|Component/s | Compiler |
|Labels | Bug |
|Assignee | None |
|Priority | Medium |

md5: 61eabde96dfa157dcf6729f925cf4571

</details>

**duplicates**:

* [SR-2727](https://bugs.swift.org/browse/SR-2727) Script mode rules for top-level bindings are confusing and broken

**Issue Description:**

The following code snippet successfully compiles and runs without infinite looping.

```swift
func foo() -> Int {
    return x
}
let x = foo()
```

It seems that `x` is uninitialized when returned from `foo`, so `x` is assigned an uninitialized value. In this example, printing `x` will output `0`, but if we change `Int` to a more complicated type like `[Int]`, printing will crash the program for obvious reasons.

I would expect this code to not compile since `x` is declared **after** `foo`. If it is intended to compile, then I think it ought to infinite loop. I definitely think the former behavior is much more intuitive than the latter.

github.com/apple/swift

[SR-2727] Script mode rules for top-level bindings are confusing and broken

opened 01:56PM - 22 Sep 16 UTC

closed 06:55PM - 22 Sep 16 UTC

swift-ci

bug compiler

| | |
|------------------|-----------------|
|Previous ID | SR-2727 |
|Radar | None |
|Original Reporter | netfnet (JIRA User) |
|Type | Bug |
|Status | Resolved |
|Resolution | Duplicate |

<details>
<summary>Additional Detail from JIRA</summary>

| | |
|------------------|-----------------|
|Votes | 0 |
|Component/s | Compiler |
|Labels | Bug |
|Assignee | None |
|Priority | Medium |

md5: 051659726cbef6b8c0f2029b8342062b

</details>

**duplicates**:

* [SR-284](https://bugs.swift.org/browse/SR-284) Strange error caused by let/var at specific place in main file.

**is duplicated by**:

* [SR-2730](https://bugs.swift.org/browse/SR-2730) Failure to initialize global variable

**Issue Description:**

Dear Swift Team! I have a very simple project. It is a Command Line Tool written in Swift 3.0 using Xcode 8.0. The program is:

```swift
import Foundation

func aaa() {
    print(a)
}

let a = "a"

aaa()
```

This works perfectly well and prints "a" in the console, but let's make the program more complex:

```swift
import Foundation

func aaa() {
    print(a)
    print(b)
}

let a = "a"
let b = "b"

aaa()
```

And the line `print(b)` is marked with the error: `Use of unresolved identifier 'b'`

We can make it even simpler:

```swift
import Foundation

func aaa() {
    print(a)
}

aaa()

let a = "a"
```

And again, the line `print(a)` is marked with the error: `Use of unresolved identifier 'a'`

I am not a newbie and I understand that I can easily fix this error by putting all variables at the beginning of the program. The question is: why is it happening? I thought each file with the extension .swift is like a class, so I can declare variables and functions and call functions in any order (all variables and constants would be global)...

And one last thing: I don't have the ability to test this on Swift 2.2, but I don't remember facing this bug before, so could it be an error in the Swift 3.0 compiler? Thank you for any answer!

github.com/apple/swift

[SR-284] Strange error caused by let/var at specific place in main file.

opened 09:50PM - 17 Dec 15 UTC

closed 04:55PM - 05 Dec 19 UTC

jepers

bug compiler

| | |
|------------------|-----------------|
|Previous ID | SR-284 |
|Radar | rdar://23702526 |
|Original Reporter | @jepers |
|Type | Bug |
|Status | Resolved |
|Resolution | Done |

<details>
<summary>Additional Detail from JIRA</summary>

| | |
|------------------|-----------------|
|Votes | 0 |
|Component/s | Compiler |
|Labels | Bug |
|Assignee | None |
|Priority | Medium |

md5: 49c6b20edc7481c19a41e118ce88ca23

</details>

**is duplicated by**:

* [SR-2727](https://bugs.swift.org/browse/SR-2727) Script mode rules for top-level bindings are confusing and broken
* [SR-7658](https://bugs.swift.org/browse/SR-7658) "not declared" error for declarations after statements in binary mode is confusing

**relates to**:

* [SR-11534](https://bugs.swift.org/browse/SR-11534) Having a global variable in a wrong place causes a misleading error message

**Issue Description:**

The unrelated constant notEvenUsed somehow causes the error on struct S. The error will disappear if the constant is moved to another line or removed. (This happens only in the main file.)

``` none
protocol P {
    func +(lhs: Self, rhs: Self) -> Self
}
struct S : P { // Error: Type 'S' does not conform to protocol 'P'
    var v: Int
}
let notEvenUsed = 1 // Move or comment out this line to remove above error.
func +(lhs: S, rhs: S) -> S {
    return S(v: lhs.v + rhs.v)
}
```

John_McCall · May 8, 2019, 4:23am

Right. What we should probably do is recognize that script variables are special and either

1. do a complex cross-file interprocedural definitive-initialization analysis to statically prove that uses of script variables are only evaluable after they are initialized by the main script flow,
2. unconditionally ban using script variables (or anything which might reference one) from secondary files, or
3. make references from outside of the main script flow (i.e. from secondary files or non-top-level code in the script file) fail dynamically if evaluated before the script has initialized the variable according to normal definitive initialization rules.

The first feels like it'd be a very poor use of implementation and maintenance effort. The second is probably excessively onerous. The last sounds pretty reasonable.
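To make the last option concrete, here is a minimal hand-written sketch of what "fail dynamically if evaluated before the script has initialized the variable" could look like. This is only a user-level model, not how the compiler would implement it: `CheckedScriptVariable`, `initialize(_:)`, and `isInitialized` are hypothetical names, and the snippet assumes it is run as a single script file in the Swift 5 language mode.

```swift
// Hypothetical wrapper, not a standard-library type: it models a script
// variable whose reads trap until the main script flow has set it.
final class CheckedScriptVariable<Value> {
    private var storage: Value?
    private let name: String

    init(name: String) {
        self.name = name
    }

    /// True once `initialize(_:)` has run.
    var isInitialized: Bool { storage != nil }

    /// Called by the main script flow at the point of the declaration.
    func initialize(_ value: Value) {
        storage = value
    }

    /// Reading before initialization stops the program with a clear message
    /// instead of silently yielding zeroed memory.
    var value: Value {
        guard let value = storage else {
            fatalError("script variable '\(name)' read before initialization")
        }
        return value
    }
}

// Declared up front so other code can refer to it.
let c = CheckedScriptVariable<[Int]>(name: "c")

func b() -> Int {
    return c.value.count
}

// Calling b() at this point would trap with the message above.
c.initialize([1, 2, 3])   // the point where the script "reaches" the declaration
print(b())                // prints 3
```

The design point is that the failure becomes deterministic and descriptive, whereas today the same mistake yields either a zero value or a crash depending on the type.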

Vogel · May 8, 2019, 4:36am

But wouldn't the last option still let my example compile? I think I like your second option more.

John_McCall · May 8, 2019, 5:19am

The problem with the second one is that it means you can only break a script up into totally self-contained chunks because nothing can refer back into the script. In practice, I think that would discourage breaking up scripts at all.

For example, it's pretty common for script files to begin with a lot of configuration code, then define some types and functions that implement most of the core logic, then end with something that kicks off the basic behavior of the script. Something like this:

let useCrustyMagic = commandLine.has("--crusty")

...

func coreLogic() {
  if useCrustyMagic {
    // the crusty way
  } else {
    // the normal way
  }
}

...

coreLogic()

The most common reason that scripts start getting split across multiple files is that the core logic has gotten too big. But that core logic almost certainly depends on the configuration variables, so the second option means that all the global configuration has to be split into separate files first. And since that means the global configuration options have to be defined in a non-script file, they can't be initialized script-style, which means they all have to be declared separately from the parsing code and they all have to have default values. So while I don't love the idea of script variables being usable from secondary files, I'm not sure the alternative is better.
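A rough sketch of the layout the second option would force, collapsed into one file so it can be run as a script. The file names in the comments and the `Config` type are hypothetical, and `coreLogic` returns a string only so the result is easy to check; the snippet assumes the Swift 5 language mode, where a mutable static property is accepted without concurrency diagnostics.

```swift
// Config.swift (hypothetical secondary file): top-level statements are not
// allowed here, so every option is declared with a default value.
enum Config {
    static var useCrustyMagic = false
}

// CoreLogic.swift (hypothetical secondary file)
func coreLogic() -> String {
    return Config.useCrustyMagic ? "the crusty way" : "the normal way"
}

// main.swift: the parsing code, now separated from the declarations.
Config.useCrustyMagic = CommandLine.arguments.contains("--crusty")
print(coreLogic())
```

Every option ends up mentioned twice, once for its declaration and default and once for the parsing that overrides it.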

anandabits · May 8, 2019, 2:12pm

Is there a reason it wouldn’t be acceptable to explicitly pass the configuration into the core logic?
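For reference, explicit passing might look something like the sketch below. `Configuration` and its `arguments` initializer are hypothetical names introduced for illustration, and the argument check is a stand-in for real command-line parsing.

```swift
// Hypothetical configuration type; it would live in a secondary file.
struct Configuration {
    let useCrustyMagic: Bool

    init(arguments: [String]) {
        useCrustyMagic = arguments.contains("--crusty")
    }
}

// Also in a secondary file: the core logic sees only what it is handed,
// so it never reaches back into the script's top-level variables.
func coreLogic(_ configuration: Configuration) -> String {
    if configuration.useCrustyMagic {
        return "the crusty way"
    } else {
        return "the normal way"
    }
}

// main.swift: parse first, then hand the result to the core logic.
let configuration = Configuration(arguments: CommandLine.arguments)
print(coreLogic(configuration))
```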

jrose · May 8, 2019, 3:47pm

I'm still hopeful for doing #2, but I agree that it might be too onerous.

John_McCall · May 8, 2019, 4:27pm

Of course that’s the better-abstracted library design, but we’re talking about a script here. I’m saying there needs to be a middle-ground between “I have a single-file script” and “I have a set of well-composed libraries.”

Andrew_Trick · May 8, 2019, 4:48pm

This isn't just a problem with variable initialization. It also causes tremendous confusion whenever developers write small scripts or playgrounds to experiment with Swift semantics or performance without realizing that script globals differ from locals in many subtle ways.

The obvious expectation of casual scripters, who write the vast majority of Swift scripts, is that script variables are local variables.

In the much less common case that scripts become bona fide software projects, it makes perfect sense to explicitly declare any configuration variables as global variables using "internal", "public", or some other qualifier of choice.

This problem is only going to get worse over time. Globals are a huge thorn in the side of compiler diagnostics and optimization logic.

John_McCall · May 8, 2019, 7:53pm

If we can make script files totally independent without seriously regressing the scripting experience, that would be great. I'm just worried about talking as if we're going to spend a huge amount of time refining this area. A dynamic check for initialization is a comparatively self-contained way of eliminating a major soundness hole.

masters3d · May 9, 2019, 4:22am

This totally took me back to JavaScript land and variable hoisting. Would you say that the solution is to wrap all file scripts into do blocks?

//main.swift
do {

//.... my script scope

}

John_McCall · May 9, 2019, 5:27am

The current semantics of scripts are that top-level variables from the script are visible in secondary files. That's just how it is. We can consider changing those semantics to make top-level script variables local to the script file, and that very well might be the right thing to do, although I think it's a more complex question than it's getting credit for. But I don't think it's justifiable to do that as a bug fix, which is to say, without going through evolution and applying it unconditionally in all language modes. It's a potentially serious source-compatibility break and needs to be handled with the normal evolution process, which is to say, it needs a proposal, and that proposal will only take full effect in a future language mode.

But we don't need an evolution proposal to fix the implementation so that the current rules are at least dynamically sound.

porglezomp · May 17, 2019, 9:46pm

Could the semantics be specified such that all top-level variables are set up as globals which will be lazily initialized, and the main function simply forces their initialization in the order they're defined? The example above:

class A {
    init() { }
}

b() //crashes at runtime of course, because c is uninitialized

var c = A()

could be defined as something like:

class A {
    init() { }
}

var c = A()

// just to illustrate that c acts like global
// not the current way script variables work
main {
    b() // makes c initialize early
    let _ = c // would initialize c if it wasn't already initialized
}

This would leave all currently correct programs behaving the same way (since they never access variables out-of-order) but would make these variables behave much more consistently like other globals do.

The downside being that if you don't want to do this, you can still accidentally write code with confusing execution order.

(I like this solution because I think it would let me compile a script file as a bundle and just ignore its main function and have everything still work well.)
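The lazy behaviour described here already exists for static stored properties, so the proposed semantics can be approximated today by moving the variable into a namespace. A small sketch, with `Globals` as a hypothetical name and `b` changed to return the instance so the result can be inspected (assumes the Swift 5 language mode):

```swift
final class A {
    init() { }
}

// Stand-in for "c acts like a global": static stored properties are
// initialized lazily, on first access.
enum Globals {
    static let c = A()
}

func b() -> A {
    return Globals.c        // first access: the initializer runs here
}

let early = b()             // fine, even though c is "declared" conceptually later
let later = Globals.c       // already initialized; same instance
print(early === later)      // prints true
```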

John_McCall · May 18, 2019, 4:44am

Variables aren't necessarily initialized directly, and even if they are, the initializer might have side-effects, or its correct value might depend on values that would require running the script up to that point to compute. So actually using those semantics could very easily be extremely confusing.
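The side-effect concern can be seen with lazily initialized static properties, which behave the way the proposal describes. In this sketch `EventLog` and `Script` are hypothetical helpers used only to make the evaluation order visible (assumes the Swift 5 language mode):

```swift
// Hypothetical event log, used only to record evaluation order.
final class EventLog {
    var events: [String] = []
}

enum Script {
    static let log = EventLog()

    // Lazily initialized: the side effect runs at first access,
    // not at the position of the declaration.
    static let c: Int = {
        Script.log.events.append("initialize c")
        return 42
    }()
}

func b() -> Int {
    Script.log.events.append("b runs")
    return Script.c
}

Script.log.events.append("script starts")
_ = b()                                    // forces c early, from inside b
Script.log.events.append("script reaches c's declaration")
_ = Script.c                               // no effect: c is already initialized

print(Script.log.events)
// ["script starts", "b runs", "initialize c", "script reaches c's declaration"]
```

The initializer's side effect lands in the middle of `b`, before the script has reached the line where the variable is written, which is the kind of reordering that could surprise a script author.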

masters3d · June 28, 2020, 6:56am

@Paul_Hudson CC https://twitter.com/twostraws/status/1276590212844597248?s=20

tera · March 27, 2024, 8:13pm

Was bitten by this today:

func mainProc() { bar() }
func bar() { baz() }
func baz() { foo() }

mainProc()

var x: [Int] = []

func foo() {
    print(x.count) // 💣 crash: Thread 1: EXC_BAD_ACCESS (code=1, address=0x10)
}

I thought that so long as "var x = ..." is declared above its usage in "foo" (in this case this is the only usage of "x") it should be fine, but, alas, "var x" has to be declared before the "mainProc()" call. Not very obvious IMHO. FWIW this was tested in Debug.
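For comparison, the reordered version of the same script, with the variable's declaration moved ahead of the first call that can reach `foo`:

```swift
func mainProc() { bar() }
func bar() { baz() }
func baz() { foo() }

var x: [Int] = []   // initialized before the call below reaches foo()

mainProc()          // prints 0

func foo() {
    print(x.count)
}
```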

wadetregaskis · March 27, 2024, 8:57pm

Huh, that smells more like Python behaviour than Swift. I'm aware that global variables have some weird behaviours in Swift, but I thought these kinds of basic issues were fixed already.

Note that the order of declarations should never matter - Swift very deliberately chooses not to care about order, unlike its predecessors (C/C++ et al). Because it's so much nicer that way, and eliminates a whole class of source code structuring problems (A is referenced by B which is referenced by C which is referenced by A, so now I have to play games with type / function declarations instead of definitions - thereby complicating the language with the ability to even have that distinction, etc…).

tera · March 29, 2024, 12:24am

The same example with a simple type is even more dangerous: in this case there's no crash or other warning, and the app silently uses a wrong value for the variable:

func mainProc() { bar() }
func bar() { baz() }
func baz() { foo() }

mainProc()

var x: Int = 42

struct S {
    static var y: Int = 42
}

func foo() {
    print(x)    // 0
    print(S.y)  // 42
}

If I split the declaration and initialisation:

...
var x: Int
x = 42
func foo() {
    print(x)    // 0
}

It's the same result. And if I move x = 42 to after func foo:

...
var x: Int
func foo() {
    print(x)    // 🛑 Variable 'x' used by function definition before being initialized
}
x = 42

The compiler complains that x is used before being initialised... meaning that in the previous examples the compiler thinks the variable is initialised, yet the wrong value is used.

and here:

func foo() {
    print(x)    // 0
}
var x: Int = 42

there's no complaint about the variable being used before being initialised; the code just uses the wrong value.

I'd say if we can't fix it properly we should consider detecting such instances and issuing a "this is not currently supported" compilation error.