Hey folks, I'm back again with another proposal to further tighten up the rules around forward references to variables (previous one here). This one concerns variables at the top level of a script file (main.swift).
Today, top-level variables in script files are treated inconsistently in the language: the variable itself has global scope, but its initialization is local to the implicit main function. This inconsistency means you can currently use the variable before it is initialized and access uninitialized memory, e.g.:
print(c) // CRASH
let c: C
class C {}
This is a pretty serious hole in the memory safety guarantee of the language. Unfortunately it's a non-trivial problem to fix in the general case because you can also access these variables in other files in the same module, so there's no way to completely enforce that the variable is initialized before use without also changing how these "script variables" are modeled in the language. Ideally they would always be treated as local (as if they were implicitly wrapped in a do {} block), but that would be quite source breaking. Some previous discussion here.
However, I think we ought to at least be able to start diagnosing use-before-declaration for the single-file case, which should allow us to catch a good chunk of the unsound cases here. I should note that use-before-inits that occur after the declaration are already caught by definite initialization in many cases, e.g.:
class C {}
let c: C
print(c) // error: constant 'c' used before being initialized
So the single-file unsound cases most often occur with forward references. As such, I'm proposing that a reference to a script variable should be diagnosed as invalid if:
- The variable is stored
- The reference occurs before the variable's declaration
- The reference also occurs directly at the top level, i.e. not nested in a closure or function
The last rule is unfortunately necessary to mitigate the source compatibility impact. However, I would also like to propose that references that meet the first two rules but not the third (i.e. those nested in a closure or function) will still be diagnosed, but downgraded to warnings (ideally we would upgrade them to errors in a future language mode). An equivalent forward reference in a local function is already invalid.
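To make the nested case concrete, here is a minimal sketch of a main.swift that, as I read the rules above, would get the downgraded warning rather than an error. The names describe and value are purely illustrative and don't come from the PR:

```swift
// Illustrative main.swift; `describe` and `value` are hypothetical names.
func describe() -> String {
    // Forward reference to a stored script variable, but nested inside a
    // function: under the proposed rules this would be a warning, not an error.
    return "value is \(value)"
}

let value = 42

// Sound in this ordering, because `value` is initialized before the call.
print(describe())
```

The call happens to be sound here only because it comes after the initialization of value; moving the print above the let would read the variable before it is initialized, which is the kind of case the warning is meant to flag.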
The restriction to stored variables is also an effort to avoid source breakage. If we were to consistently treat computed script variables as local, then forward references to them would be invalid too, but since these are never unsound and are already quite widespread, I'm not proposing to diagnose them at all for now. We can revisit this when we evaluate the broader language change for script variables.
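As a small sketch of the computed case that stays undiagnosed (greeting is just an illustrative name):

```swift
// Forward reference to a computed script variable: not diagnosed by this pitch.
print(greeting)

// Computed, so there is no storage that could be read before initialization;
// the getter simply runs whenever the variable is referenced.
var greeting: String { "hello" }
```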
I have opened a PR that implements this set of rules: [Sema] Diagnose likely-unsound script var forward references (swiftlang/swift Pull Request #88082, by hamishknight)
Overloaded cases
One subtle impact of this change arises when a forward-referenced script variable is overloaded with a function decl, e.g.:
foo(0)
func foo(_ x: Int) {}
let foo: (Int) -> Void = {_ in}
Today foo refers to the variable, but with the proposed change it would switch to the function. This is unfortunate, but I expect it to be extremely rare (it hasn't yet appeared in the source compatibility testing I've done). The alternative would be to always error if an overload set contains a forward-referenced variable, but this would be even more restrictive than the rules in a local context such as a function.
Source compatibility
The source compatibility testing I've done so far across both the source compatibility suite and internal Swift projects at Apple has only encountered a single sound case broken by this change. Such sound cases can occur when you also initialize the variable before its declaration, e.g.:
x = 0
print(x)
let x: Int
However, this has proven to be extremely rare. I also did not encounter a single overloaded case broken by this change.
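For code that does hit this, the fix I'd expect is simply to move the declaration above the first use. A minimal sketch of the sound example rewritten that way:

```swift
// Declaration first, so neither the assignment nor the print is a forward reference.
let x: Int

// Delayed initialization of the constant.
x = 0

// `x` is definitely initialized at this point.
print(x)
```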
Why not diagnose in SIL?
As mentioned previously, definite initialization is already able to catch most use-before-initializations for script variables, provided that the use actually follows the declaration. In principle, we could adjust the SIL logic here such that we're also able to diagnose use-before-inits that occur before the declaration. This would enable us to avoid breaking source for sound cases such as the above, where the variable is also initialized before use.
However, the fact that we allow these forward references is also generally problematic in the type-checker, e.g.:
print(a)
guard let b = 3 as Int? else {}
let a = b
This currently crashes the compiler, since the forward reference to a causes the type-checker to lazily type-check the binding, which in turn type-checks the statement condition. This then results in attempting to type-check the condition multiple times. Such code would already be invalid in a local context such as a function. In principle, we could defensively guard against these cases throughout the type-checker, but it would be much more straightforward to eliminate them entirely.
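For comparison, here is a sketch of the same statements written in declaration order, which avoids the forward reference entirely. Note that I've swapped the empty else for a fatalError, since a guard body isn't allowed to fall through; that substitution is mine and not part of the original reproducer:

```swift
// The condition is type-checked once, in source order.
guard let b = 3 as Int? else {
    fatalError("unreachable: the optional always holds 3")
}

// `b` is already in scope and initialized here.
let a = b

// The use follows both declarations.
print(a)
```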
Given that these sound init-before-declaration cases are extremely rare, I would much rather favor consistently banning forward references in the type-checker.