Clarify scoping behavior with local functions

Background

Recall that Swift allows mutually-recursive local functions. This lets you write algorithms like this:

func foo() {
  func a(_ x: Int) -> Int {
    if (x % 2 == 0)
      b(x / 2)
    else
      a(x - 1)
  }

  func b(_ x: Int) -> Int {
    if (x % 2 == 0)
      a(x / 2)
    else
      b(x - 1)
  }

  a(100)
}

Note that the body of a references b. Normally, this kind of forward reference is not allowed in local scopes. For example this is an error:

func f() {
  print(x)
  let x = 10
}

Here is another example:

struct S {
  let x = 123

  func f() {
    print(x) // refers to 'S.x' as a member of 'self'

    let x = 321
    print(x * x) // refers to the local 'x' above
  }
}

So clearly forward references are not allowed most of the time, but the rules need to be relaxed somewhat to make local functions work properly. If we want to formalize the current name lookup rules, they look something like this:

  1. First, look for all visible local definitions with the given name that precede the current source location in the buffer. If one was found, we're done.

  2. If there are no local definitions with the given name that precede the current source location, and we're inside a local function or closure scope, look through all local definitions in the parent scope, even if their source location comes after the local function or closure.

  3. If this doesn't resolve the name, look for an outer type definition, and perform a member lookup on self to resolve it as an instance member.

Proposed change

The second rule is needed to make mutually-recursive local functions to work, but it is too general, and it can also find forward references to non-functions as well. As you will see below, this causes some issues.

I'd like to tweak the rules slightly, so that rule 2 now reads:

  1. If there are no local definitions with the given name that precede the current source location, look through all local function definitions in the current scope, even if their source location comes after the use site.

That is, we relax the rule to allow forward references to local functions from the top level of a function body, but we tighten the rule to completely ban forward references to let bindings.

Examples

Valid -> invalid example

For example, Swift 5.3 accepts this:

func f() {
  func local() {
    print(x)
  }

  let x = 123
  local()
}

We would reject this under the new rule, because the body of local cannot see x under the new rule 2.

Remains valid but has behavior change example

Here is an example where Swift 5.3 accepts the code, but the behavior would change under the new rule:

struct S {
  let x = 123

  func f() {
    func local() {
      print(x)
    }

    let x = 321
  }
}

In Swift 5.3, the above prints 321 because when we do the lookup into the parent scope of local(), we find the local let x (rule 2) before we find the let x as a type member (rule 3).

Under the new rule, this would print 123, because lookup would only find the first let x that precedes the use location.

Remains valid, no behavior change example

Here is an example where the behavior remains the same in Swift 5.3 and the new proposed rule:

func outer() {
  let x = 123

  func middle() {
    func local() {
      print(x)
    }

    let x = 321
  }
}

Here, the outermost let x is local, so rule 1 kicks in, so in print(x) we find the let x defined in outer(), and not the let x defined in middle(), so this code prints 123 and not 321.

Invalid -> valid example

Here is an example that Swift 5.3 rejects, and the new rule would accept:

func f() {
  first()

  func first() {}
}

Swift 5.3 rejects the forward reference because the old rule 2 only applies when you're already inside a local function or closure scope. The new rule 2 allows it to be found, however.

Invalid -> valid example

Swift 5.3 also rejects this example:

struct S {
  let x = 123
  func outer() {
    let closure = { print(x) }
    let x = 321
  }
}

In the above code, print(x) appears inside a closure body, so rule 2 resolves the x to the local binding let x inside outer(). This code type checks! However, SIL capture analysis rejects it, because the closure attempts to capture x before its definition.

Under the new rule, the above code will both type check and pass SIL diagnostics correctly, because the closure references self.x, not the local let x.

Aside about capture diagnostics

Even with the new rule though, we still need to do the SIL capture analysis. For example, you can forward reference a function, which references a capture not yet in scope:

func f() {
  func a() {
    b() // valid -- rule 2
  }

  let x = 123

  func b() {
    print(x) // valid -- rule 1
  }

  b() // valid -- rule 1
}

Motivation

Today, we have two name lookup implementations in the compiler. Parse-time lookup resolves names that have already been parsed (precede the use location in the input source). Parse-time lookup only knows about local bindings, not members of types or top-level declarations.

Then later, the expression type checker performs a "pre-checking" pass to resolve any remaining names. This resolves members of types, but also this is where rule 2 allowing forward references from inside function bodies is implemented.

I'm currently beefing up the name lookup used by pre-checking ("ASTScope") to the point where it can replace parse-time lookup. While it would be possible to faithfully model the current behavior, it would make a lot more sense both from the standpoint of language semantics and implementation simplicity to tweak the rules as I suggested at the beginning of this pitch.

While the list of examples that now behave differently above looks a bit scary, I suspect in practice nobody is really relying on these edge cases, and mutually recursive local functions themselves are rather rare (but probably not rare enough that we can consider removing them. I'd much rather tighten up the semantics to make them more reasonable).

I ran source compatibility testing and didn't find any code that relies on the old behavior. The validation test suite has some examples, all reduced from crashers (the SIL capture analysis' added pretty recently, and we used to crash in the cases it now rejects).

Does anyone have thoughts on this?

16 Likes

One more thing: @Joe_Groff pointed out offline that forward references to local types are supported too, and they can be mutually recursive:

func f() {
  print(LocalStruct())

  struct LocalStruct {
    var x: LocalClass? = nil
  }

  class LocalClass {
    var x = LocalStruct()
  }
} 

This would not change with my proposed update to the name lookup rules.

In fact, local types are already prohibited from capturing values from the outer scope, so the behavior of name lookup with local type declarations is not going to change under my proposal at all.

So in summary, rule 2 in my post should actually say "local function and type definitions" not "local function definitions".

6 Likes

Yeah, I think the more general distinction here is between non-immediate declarations (functions, types, and such), and immediate statements and local variable declarations. I think the change you suggest makes sense, source compatibility willing.

6 Likes

This makes sense to me too. My only worry is the source compatibility issue regarding examples that remain valid but have a behavior change. This may or may not be picked up on source compatibility testing.

Ideally, the compiler would take a similar strategy to what we're now using for staging in the "forward scan" rule for closure matching: try both the old and new rule, then warn when those give different results, preserving the old behavior for now.

This kind of blows up the "implementation simplicity" angle here for the rewrite, but it would still move us forward in terms of language semantics without silently altering the behavior of code that compiles.

5 Likes

There's a case with behaviour change? I could only think of valid/invalid toggling.

Yes. You need a very specific setup:

struct S {
  let x = 123

  func f() {
    func local() {
      print(x)
    }

    let x = 321
  }
}

The lookup of x from inside local() finds the local binding let x today, by rule 2. With my proposed change, we find self.x via rule 3. So the printed output changes from 321 to 123.

To my eye, the behavior change example demonstrates that we ought to disallow some cases we allow today. Consider the behavior change example, only with many many lines of code thrown in between the lets and the uses. Could lead to a very-hard-to-find bug. My preference would be for that case to be illegal.

On second thought, maybe shadowing warnings would take care of this problem.

1 Like

Interestingly, I've encountered some of these edge cases quite a few times, and the ability of using mutually recursive functions is very important to one of my projects. Some of my earliest discussions on the forums are about these forward references (without knowing the term).

I like the new rule 2 proposed here. It makes reasoning source code more rational and straightforward. Although the change is source breaking, I think the old rule 2 can be considered as active harm.


The only bit I'm not completely comfortable with is this:

In my opinion, a user would expect local() to be aware of x, because the local() is called after x is declared.

I wonder if either of the following solutions might keep it valid:

  1. Add a rule 1.5 or 2.5 in addition to modifying rule 2:

    Rule 1.5 or 2.5: If there are no local definitions with the given name that precede the current source location, and we're inside a local function or closure scope, look through all local definitions of the given name preceding the function use site in the current scope.

  2. If name lookup fails, have the compiler try to move the function's declaration to immediately preceding the its invocation, without changing either the declaration or invocation's scopes, then try the lookup again.

func f() {
  func local() {
    print(x)
  }

  let x = 123
  local()
}

How could this be accepted by the compiler? That seems like a very clear bug to me, how come it works?

Today, we only consider a let binding out of scope if it's part of the same function body. Let bindings from outer function bodies are always in scope. This is the rule I'm proposing we change; I definitely agree it's confusing, but it's not a bug, the behavior is intentionally implemented and tested this way.

Yeah it seems very unintuitive to close over something that's not even defined by the time you close over it. The reason it's confusing I guess is that function order is relevant, but I suppose it is only relevant as long as the functions are defined in order?

Btw, this will print 2, is it because y is sort of semantically captured by reference, or because f is not actually declared until right before it's being used?

    func f() {
        var y = 1
        func local() {
            print(y)
        }

        y = 2
        local()

PS This also works actually, so it's not just let:

    func f() {
        func local() {
            print(x)
        }

        var x = 1
        local()
    }

FWIW, implicit captures always happen by reference, so the following:

func f() {
    var y = 1
    func local() {
        print(y)
    }

    local()
    y = 2
    local()
}

prints

1
2

However, there is also analysis that happens based on the actual usage of local, which is what allows the compiler to correctly reject this:

func f() {
    func local() { // error: closure captures 'y' before it is declared
        print(y)
    }

    local()
    let y = 123
}

I really don't like this one, because it makes it harder to understand the behavior of something like

func foo() {
  first()
  let x = 1
  func first() {
    print(x)
  }
}

This will still be rejected due to the capture happening before x is declared, but the chain of events that led to that is no longer clear. first is declared after x, right? So the diagnostic has to point to the use site somehow as well: "this is why we're treating first as being declared before x even though you wrote it after".

There's a simpler version of rule 2 that solves this problem:

  1. If there are no local definitions with the given name that precede the current source location, look through subsequent elements of the parent scope (not current scope) until (a) you find a function or type with a matching name, or (b) you find a non-function, non-type block element.

However, we've seen people get really confused with the similar rule implemented for top-level code, so this might be a chance to improve it:

  1. If there are no local definitions with the given name that precede the current source location, look through subsequent elements of the parent scope until you find a function or type with a matching name. However, if the declaration you find follows a non-function, non-type block element, you get a warning (upgraded to an error if we ever do another language break.

This variation does diagnose the example included in "Aside about capture diagnostics", and that's reminding me how bad it is to break existing code or (worse) silently change the behavior of existing code. I tend to agree with @xwu that "do it both ways" is a much better thing to do for now, especially since it sounds like the old ad-hoc way can actually be modeled pretty well with ASTScopes.

1 Like

Another issue with that example is that it wouldn't compile if you replaced the direct call to first() with a func definition where you call first() inside:

func f() {
  func g() {
    first()
  }

  func first() {}
}

At least to me it's confusing that position in the code is only relevant sometimes, and not always. I suppose the tension is there because, unlike in global or class scope, we want to be able to capture things, so we can't make the location of a function definition arbitrary when in local scope, but I still feel we should make it as consistent as possible.

No, the behavior of this example would not change. You will still be able to reference first() from the body of g(), just as you can today in Swift 5.3.

Hmm. If forward references to local bindings were banned, and if forward references to functions with bindings "in between" were banned, we wouldn't need the SIL diagnostics pass for captures.

I actually came up with an idea for getting the old lookup behavior simulated in ASTScopes, so if that works out, then re-considering language semantics is not in the critical path anymore. But it's still interesting to think about.

One question: you seem to be implying that forward references to types should not be allowed inside of a function scope either. Why is that? Since local types can't capture anything, it doesn't really case any problems that I can think of.

Amusingly, Swift 5.3 accepts this:

func f() {
  _ = S()

  struct S {}
}

But as I wrote above, not this:

func f() {
  g()

  func g() {}
}

Ah, sorry about that. I presumed it didn't work, so I had never tried it!

I don't think forward references should be allowed in expression contexts, period, because it makes code harder to read when reading a function top to bottom. Mutually recursive functions and types provide a clear case where that's not sufficient, but I think they're the only cases, and therefore I'd like to ban all the others because it's more consistent for humans and compilers alike.

Clearly not everyone agrees with me on the ordering thing: Haskell where clauses encourage the "here's the top-level logic, and here's the helper" ordering:

initials :: String -> String -> String  
initials firstname lastname = [f] ++ ". " ++ [l] ++ "."  
    where (f:_) = firstname  
          (l:_) = lastname    

But it's one part of Haskell I've always had trouble with.

1 Like