On the behavior of variables in top-level code

The variables in top-level code behave weirdly. They are declared in the global scope, but are initialized serially, like local variables in a normal function. This allows some interesting pieces of code to compile, such as:

print(x)
var x = 32

This program will print 0. The basic value types are automatically zero-initialized, so they have a value. However, if x has a class type, this results in a crash. Clearly, this is not an ideal situation. You shouldn't be able to use variables before they are initialized, let alone before they are declared.
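For contrast, here is a minimal sketch of how the same pattern behaves for local variables inside an ordinary function. The name `implicitMain` is hypothetical and only stands in for a function body; it is not an existing API or a proposed spelling.

```swift
// Hypothetical function used only to illustrate local-variable rules.
func implicitMain() -> Int {
    // print(x)   // rejected at compile time: `x` is used before it is declared
    let x = 32
    print(x)      // fine: `x` is declared and initialized at this point
    return x
}

let localResult = implicitMain()
```

Inside the function there is no window in which `x` can be observed with a zero value, since the early use never compiles.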

My thoughts are to make the top-level variables behave like local variables inside of an implicit function.
Before prescribing a change in the form of a pitch, I have a few questions that I would like to ask the forum.

  • Are you relying on variables in top-level code behaving as global variables? (How and why?)

  • In wrapping the top-level code in an implicit main function, are any functions declared in that space nested inside of this main function, or are they global?

For example, consider the following top-level code, which could implicitly behave in one of two ways:

var x = 32
func foo() {
   print("Hello World \(x)")
}

In the nested form:

@main struct Main {
  static func main() {
    var x = 32
    func foo() {
       print("Hello World \(x)")
    }
  }
}

Un-nested:

func foo(_ x: Int) {
  print("Hello World \(x)")
}

@main struct Main {
  static func main() {
    var x = 32
    foo(x)
  }
}

Note that in the un-nested form, x is not visible from foo, so you would need to pass it in directly.
This is a departure from how top-level code has behaved previously, but is easier to reason about once concurrency is a factor.

Before prescribing a change, I would like to get your thoughts on the matter and hear if anyone is relying on this behaviour. If you are, how and why?

Given that this is source breaking, changes here would be a change for Swift 6 at the earliest.

I don't have specific thoughts on the global part of things, but whatever solution ends up being selected here needs to account for other kinds of declarations and their differences in behavior between being at file scope vs at function scope.

For example, consider types, where it gets more complicated.

var x: Int = 3
struct S { var y: Int = x } // OK

func f() {
  var w: Int = 3
  struct S {
    var v: Int = w // error
  }
}

There's also the long-standing issue where conformances for types declared at function scope do not get looked up correctly.

You've also got extensions, which are not allowed inside functions:

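var x: Int = 3
// Extensions cannot contain stored properties, so y is a computed property.
extension Int { var y: Int { x } } // OK

func f() {
  var w: Int = 3
  extension Int { // error: extensions are only allowed at file scope
    var v: Int { w }
  }
}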

Yikes! I’d consider this a bug — you should probably report it at bugs.swift.org (after checking for duplicates).

As for how variables in executable top-level code should behave, I’d expect them to be local variables. Functions in executable top-level code should be global as long as they don’t capture any local variables: there’s no reason why they can’t be global, and being global is more in line with how struct / enum / class / actor / protocol / typealias / extension declarations work in executable top-level code. We could add a warning for global code that tries to access a function that captures local variables. Swift already recognizes a distinction between functions that capture local variables and functions that don’t when it comes to conversions to C function pointers.
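The existing distinction mentioned above can be sketched as follows. This is only an illustration under assumed names (`captureDemo`, `addOne`, `addStep` are hypothetical): a local function that captures nothing can be converted to a `@convention(c)` function pointer, while one that captures a local variable cannot.

```swift
func captureDemo() -> Int32 {
    let step: Int32 = 10

    // Captures nothing, so it could equally well be a global function.
    func addOne(_ value: Int32) -> Int32 { value + 1 }

    // Captures the local `step`, so it must stay a local function.
    func addStep(_ value: Int32) -> Int32 { value + step }

    // Allowed: no captured context.
    let cPointer: @convention(c) (Int32) -> Int32 = addOne

    // Rejected by the compiler, because `addStep` captures context:
    // let bad: @convention(c) (Int32) -> Int32 = addStep

    return cPointer(1) + addStep(1)
}

let captureResult = captureDemo()
```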

Alternatively, we could consider making all variables in top-level executable code lazy, but I imagine that would make it too easy to accidentally create a circular reference. On the other hand, it would avoid the possibility of breaking existing code.

We could also instead consider making variables in executable top-level code global as long as their initialization expressions don’t capture any local variables, like I’m proposing with functions. However, I think this behavior would be unexpected for most programmers and could lead to unexpected problems with many variables being unnecessarily lazy and variables being initialized too early. Functions don’t have this problem since they don’t need to be initialized and the use of local functions is rare anyway.

If we do find that people are doing this, we could treat top-level static var and static let (which are currently illegal) as globals in main.swift, with the usual lazy-initialization behavior expected for globals. This would not allow you to initialize them based on sequential code, but that’s precisely the thing that causes safety problems, so that seems like a feature.
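Top-level `static let` is not legal today, so the following sketch uses static properties nested in a type to show the lazy-initialization behavior being referred to. The `Globals` enum and its members are hypothetical names chosen for illustration.

```swift
// Static stored properties are initialized lazily, on first access,
// so their textual order does not matter.
enum Globals {
    // `doubled` refers to `base`, which appears later in the source.
    static let doubled = base * 2
    static let base = 16
}

// The first access to `doubled` triggers its initializer,
// which in turn triggers the initializer of `base`.
let observedDoubled = Globals.doubled
print(observedDoubled)
```

Because each property is initialized on demand, there is no point at which `doubled` can be read before `base` has a value, unlike the serially initialized variables in main.swift.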

I like this (in reference to Becca's suggestion of using static). It would certainly save us from needing additional keywords.

Pragmatically, I think this will probably be the route we take. On a purely philosophical level, I kind of don't like static because the word doesn't really communicate what it's doing. Would we make the static variables behave like static variables in C and not export them, or would we just have static mean global? Like you say though, getting the free semantic checking is a definite feature.

Alternatively, what do you think about having access modifiers on globals and unmodified variables be local? A public let foo = 32 would mean that foo is an exported global, while private let bar = 42 would be the equivalent of a static variable in C. I'm not entirely sure how important this is because I don't know how often folks import top-level code. The only case I can think of is in the REPL (as an implementation detail) which will probably need a language mode anyway until it can be cleaned up.

Sorry, brain went to static global functions instead of static global variables. I’ve clearly pumpkinized for the night. The keyword makes more sense in the context of static global variables.


I think this is a great idea. Local variables are the semantics we expect here, and this also nicely addresses any concerns about the interaction with concurrency because that behavior is well-defined for local variables.

I'd prefer not to use static here, because static var and static let already have a meaning in local functions, and it's different from "this is visible outside of the function".

This is my preferred solution...

because access control doesn't exist for local declarations at all. By explicitly putting an access control modifier on the var or let you're saying it has non-local visibility and also what visibility it has. I guess what we lose relative to static is that static more strongly implies lazy initialization.

It occurs to me that we should decide whether we need to support declaring local types and local functions in top-level code. Local types in top-level code are probably not useful at all, because a fileprivate type can do everything that a local type of a non-generic function can do. Local functions could be useful, if you want to capture local variables in top-level code. However, we'd probably need to burn a keyword on this (local func f()...), and it really doesn't feel like it's that important.

Doug


You can get rational top-level code behavior today by always wrapping it in a do block, which will force all the declarations inside to be treated as purely local declarations. That seems sufficient to me as a way of explicitly forming local declarations in a future design too. Alternatively, since we have to do capture analysis anyway, we could potentially DWIM (do what I mean) and make functions behave as local functions when they refer to top-level local variables of the script.
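A minimal sketch of the do-block approach, reusing the x and foo example from earlier in the thread. The `greetings` variable is a hypothetical addition, declared outside the block only so the result can be observed afterwards.

```swift
// Ordinary top-level variable, used here only to observe the result.
var greetings: [String] = []

do {
    // Inside `do`, declarations are local and follow local-variable rules.
    // print(x)   // would be rejected here: `x` is used before its declaration
    let x = 32

    // A local function may capture the local `x`.
    func foo() -> String {
        "Hello World \(x)"
    }

    greetings.append(foo())
}

// Neither `x` nor `foo` is visible outside the block.
print(greetings)
```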


As others have said, this is definitely a bug. Report it. I just found an example with more bizarre behavior:

print(variable)
let variable = 5
print(variable)

Output:

0
5

A let constant with two different values!

This only applies to top-level code in the main.swift file. Otherwise, top-level and static variables are initialized lazily (on first access).


The do block is a good idea for getting local semantics for functions and such; much better than my local func suggestion.

I'd rather we not go down the capture-analysis route, because it means we would not be able to tell whether a given function is visible to another translation unit without performing type checking on its body.

Doug


Another consistent model might be to say that all declarations, not only properties, in a top-level code file are local by default, and use explicit private/internal/public visibility to mark the ones that should actually be treated as global declarations.
