Value wrapped in a thunk recursively(?)

Hello guys!

I'm trying to understand why the code below works not quite as I expect it to work.
What i've tried to achieve is to store weak references to objects with the help of closures. In real world these were delegates and i've filtered out any deallocated instances on every delegate call.
You can see what's happening by putting breakpoint on "print("test")" line.

@main
struct CLI {
    static func main() {
        var closures: [() -> MyClass?] = []
        var strongRefs: [MyClass] = []
        var done = false
        for _ in 0..<5 {
            let instance = MyClass()
            strongRefs.append(instance)
            closures.append({ [weak instance] in
                if done {
                    print("test")
                }
                return instance
            })
        }

        for _ in 0..<10_000 {
            var temp: [() -> MyClass?] = []
            for getter in closures {
                if getter() != nil {
                    temp.append(getter)
                }
            }
            closures = temp
        }
        done = true
        let _ = closures.first?()
    }

    private final class MyClass {}
}

Stack trace is 20k lines long. All like that:

#19998	0x0000000100002fe0 in thunk for @escaping @callee_guaranteed () -> (@owned CLI.MyClass?) ()
#19999	0x0000000100002f18 in thunk for @escaping @callee_guaranteed () -> (@out CLI.MyClass?) ()
#20000	0x0000000100002fe0 in thunk for @escaping @callee_guaranteed () -> (@owned CLI.MyClass?) ()
#20001	0x0000000100002f18 in thunk for @escaping @callee_guaranteed () -> (@out CLI.MyClass?) ()

Btw, compiling on xcode 15 in release mode stack trace was 30k lines. And since it happened a while ago, now i'm trying to compile in release mode on xcode 16 and not even getting correct result. E.g. "test" is never printed and closures.first?() is nil at the end.

Can someone please explain what's happening here? And maybe provide some resources to read if this is expected behaviour.
I guess it's something to do with a temp array, because closures = closures.filter ... worked as expected (at least with xcode 15).

This is expected behavior given the implementation. Consider this function:

func concreteFn(_ fn: (Int) -> Int, _ value: Int) -> Int {
  return fn(value)
}

The function takes a closure value with type (Int) -> Int and a value of type Int, and then calls the closure with the value. The closure receives the value and returns its result in a register.

Now if you have this:

func genericFn<T>(_ fn: (T) -> T, _ value: T) -> T {
  return fn(value)
}

We don't know the size of T or anything else about it, so here we must pass the value of type T to the closure indirectly in memory. The closure receives a pointer to the value, and a pointer to a location large enough to store the result.

So the basic idea here is that a concretely-typed closure (Int) -> Int has a different representation than a closure whose type becomes (Int) -> Int after substitution. This means that when you do something like this:

let closure: (Int) -> Int = { $0 + 1 }
genericFn(closure)

We have to wrap the closure in a reabstraction thunk. A thunk is another closure that just passes the arguments to the original closure and then returns the result, but it shuffles values between memory and registers to make the calling convention line up. This requires a heap allocation, and introduces an extra stack frame when you call the closure.

So in your example, the closures stored in the Array return MyClass? indirectly, but the value of getter in for getter in closures has the type () -> MyClass?, so it has to return MyClass? in a register instead. So we must wrap the getter in a thunk after reading it from the array. Similarly, when you store the getter back to the temp array, it has to be wrapped in a complementary thunk to convert the calling convention back the most general form again.

After 10,000 iterations of the for loop, every closure is actually a giant linked list of thunks that convert the calling convention back and forth until finally reaching the tail of the linked list, which is the original closure.

Can we do better? Yes, there are two possible improvements:

  1. In your example, we could cancel out the complementary thunks with static analysis, so that simply storing and loading getter gives you the original getter instead of thunk(thunk(getter)).

  2. Static analysis won't help in some cases, and we might be able to unwrap thunks at runtime too, but this would add additional overhead.

The filter version doesn't have this problem, because it does not change the existing elements in the array.

It might be that strongRefs is getting optimized away entirely, since it's a local variable that is never read from, only written to.

9 Likes

Slava, hello!

Thank you very much for your explanation!

I've managed to get the same behavior with a simple generic Box struct, but just to clarify - did i understand correctly that in my example we have to wrap closure because Array is generic over it's Elements?

1 Like

Yep, any generic container, like Box<T> where T is a function type will have the same behavior:

struct Box<T> {
  var t: T
}

The one exception is Optional. Since Optionals of function types are so common, we special case the representation to avoid reabstraction thunk overhead in this case. (This is the one “non-trivial” behavior of Optional that cannot be replicated outside of the standard library; the rest is just syntax sugar.)

3 Likes

One way to prevent thunking when passing through generic interfaces is to put the function inside of a struct, like:

struct Operation {
  var operation: () -> MyClass?
}

Since generic substitution can't affect the type of the operation field inside of an Operation struct, an Operation can pass in and out of generic contexts without a representation change.

6 Likes