Closures, Methods & Reference Cycles

abeldemoz · August 22, 2022, 9:20pm

I have a few questions about how methods and closures relate to each other and to memory leaks, as I'm not sure why closures lead to memory leaks when methods don't. My assumption was that if a method doesn't create a reference cycle, then assigning that method to an instance property wouldn't lead to a reference cycle either. Similarly, I assumed that any method would create a reference cycle if it accesses and returns a closure that creates a reference cycle. It turns out that neither of these is the case, as the code I wrote demonstrates, but I'm not sure why.

I’ve written three separate do blocks to create three different scopes.

In the first do block, I’ve created an instance of Car and accessed its someClosure property. The print statement in the init method gets executed but the print statement in the deinit method doesn’t, hence a memory leak occurs.

In the second do block, I’ve created an instance of Car and accessed its otherMethod method. Both of the print statements in the init and deinit methods get executed, thus indicating no memory leaks.

The final do block more or less exhibits the same thing as the first do block.

I have three questions:

If someMethod accesses someConstant via self, why doesn’t it cause a memory leak?
If someMethod doesn’t cause a memory leak, why does someClosure create a memory leak when it’s equal to someMethod?
If someClosure creates a memory leak and otherMethod accesses and returns someClosure, why doesn’t otherMethod create a memory leak?

class Car {
    init() { print("Car is being initialised") }
    
    let someConstant = 1
    
    func someMethod() -> Int {
        someConstant
    }

    lazy var someClosure = someMethod

    func otherMethod() -> () -> Int {
        someClosure
    }

    lazy var otherClosure = otherMethod
    
    deinit { print("Car is being deinitialised") }
}

print("---------- block 1 ----------")

do {
    let car = Car()
    print(type(of: car.someClosure))
}

print("---------- block 2 ----------")

do {
    let car = Car()
    print(type(of: car.otherMethod))
}

print("---------- block 3 ----------")

do {
    let car = Car()
    print(type(of: car.otherClosure))
}

// The following gets printed:

// ---------- block 1 ----------
// Car is being initialised
// () -> Int
// ---------- block 2 ----------
// Car is being initialised
// () -> () -> Int
// Car is being deinitialised
// ---------- block 3 ----------
// Car is being initialised
// () -> () -> Int

AlexanderM · August 22, 2022, 9:33pm

Please post your code as ... code. This lets interested parties run your code and play around with it, without you needing to ask them to waste their time transcribing it by hand.

abeldemoz · August 22, 2022, 9:35pm

Noted. My apologies, this is my first post, so I'm still learning the ropes.

AlexanderM · August 22, 2022, 9:36pm

No worries, you can just edit it right in. You can remove the image entirely. Just post a second code block with the print output.

I'd also suggest you remove the English explanation of every intricacy of your code. Your code is right in front of us, you don't need to repeat yourself in Pseudo-Swift English :) Focus readers' attention on your question, your assumptions for how/why it should work, and how the real behaviour differs from that.

abeldemoz · August 22, 2022, 9:51pm

Thanks, I've made the edits to my post.

Lantua · August 22, 2022, 10:37pm

The nuance here is that when you access someClosure, you're implicitly creating a closure { [self] in self.someMethod() } and assigning it to self.someClosure. Consider a smaller example:

class SmallCar {
    init() { print("Init") }
    deinit { print("Deinit") }
    var closure = {} 
}

do { // Case 1
    // Init
    let car = SmallCar(), x = { print(car) }
    // Deinit
}
do { // Case 2
    // Init
    let car = SmallCar(), x = { print(car) }
    car.closure = x
    // NO deinit
}

Here, in the first case, you simply create a closure { print(car) } which captures car. There's no cycle here. For the second case, you create { print(car) }, which captures car, and assign it to car.closure. Now you have car that contains a closure { print(car) }, and the closure itself captures car, causing the reference cycle.

If you look back at your original code:

The first case accesses car.someClosure, which causes the lazy property to create the closure { [self] in self.someMethod() } and assign it to someClosure (this is case 2 in our small example).
The second case only retrieves otherMethod without invoking it, so it never accesses self.someClosure.
The third case accesses car.otherClosure, which initializes it to { [self] in self.otherMethod() } and causes a reference cycle just like your first case.
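The difference between touching and not touching the lazy property can also be checked without relying on print output. Below is a minimal sketch, assuming a trimmed copy of the class (named LazyCar here) and a hypothetical helper survivesScope that watches the instance through a weak reference; neither name comes from the thread.

```swift
// Hypothetical, trimmed copy of the Car class from the question.
final class LazyCar {
    let someConstant = 1

    func someMethod() -> Int { someConstant }

    // On first access this stores a closure that captures self strongly.
    lazy var someClosure = someMethod
}

// Hypothetical helper: reports whether the instance is still alive
// after the scope that created it has ended.
func survivesScope(touchLazyProperty: Bool) -> Bool {
    weak var observer: LazyCar?
    do {
        let car = LazyCar()
        observer = car
        if touchLazyProperty {
            _ = car.someClosure   // triggers the lazy initialisation
        }
    }
    // A weak reference becomes nil once the object is deallocated.
    return observer != nil
}

let leakedAfterAccess = survivesScope(touchLazyProperty: true)
let leakedWithoutAccess = survivesScope(touchLazyProperty: false)
print(leakedAfterAccess, leakedWithoutAccess)   // true false
```

The instance only outlives its scope when the lazy property has been initialised, because that is the moment the closure capturing self gets stored inside the object it captures.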

abeldemoz · August 22, 2022, 11:44pm

If what you’re saying is true, wouldn’t that mean that the type of someClosure is () -> () -> Int instead of the () -> Int that was printed in my first case?

Thanks for this. I just tried retrieving and invoking otherMethod in two separate do blocks. The former didn't create a reference cycle, but the latter did. Correct me if I'm wrong, but the difference between retrieving and invoking a function is that the former gets its address in memory whilst the latter executes its body?

Lantua · August 23, 2022, 12:04am

lazy var is still a var, so accessing it doesn't add an extra function call. In effect, lazy var a = 3 is an Int, not a () -> Int, because whenever you access it you simply write x.a and not x.a().

By the same token, the closure you have is self.someMethod, i.e., { [self] in return self.someMethod() }, which is () -> Int. So someClosure is itself a () -> Int.

"Getting address in memory" isn't quite the right process, esp. for functions that have captured variables. That said, you're right that simply retrieving and passing around a closure doesn't execute its body. You wouldn't want a function to accidentally be invoked if you just want to pass it to another variable (let a = oldClosure).

Keep in mind that even though getting a closure doesn't itself invoke the closure, the act itself can have other effects, such as initializing a lazy variable (as it does in your original code), invoking other functions (if you are accessing via computed property), etc.
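To make the retrieve-versus-invoke distinction concrete, here is a small sketch. It assumes a trimmed copy of the Car class from the question; the Use enum and the carSurvives helper are hypothetical names introduced only for this illustration.

```swift
// Trimmed copy of the class from the question (prints removed).
final class Car {
    let someConstant = 1

    func someMethod() -> Int { someConstant }

    lazy var someClosure = someMethod

    func otherMethod() -> () -> Int { someClosure }
}

// Hypothetical: the two ways of using otherMethod being compared.
enum Use { case retrieve, invoke }

// Hypothetical helper: true if the Car outlives the scope it was created in.
func carSurvives(_ use: Use) -> Bool {
    weak var observer: Car?
    do {
        let car = Car()
        observer = car
        switch use {
        case .retrieve:
            // Only forms the bound method; otherMethod's body does not run,
            // so someClosure is never initialised.
            _ = car.otherMethod
        case .invoke:
            // Runs the body, which reads someClosure and triggers the
            // lazy initialisation that stores a self-capturing closure.
            _ = car.otherMethod()
        }
    }
    return observer != nil
}

let survivesRetrieve = carSurvives(.retrieve)
let survivesInvoke = carSurvives(.invoke)
print(survivesRetrieve, survivesInvoke)   // false true
```

Invoking the method is what creates the cycle here, and only because its body has the side effect of initialising the lazy property.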

abeldemoz · August 23, 2022, 8:43am

Thanks, you've helped clear up a lot of stuff for me. I just have two more questions:

If we're not getting a method's address in memory, what are we doing when we retrieve a method?
In my example, someMethod returns self.someConstant, but it doesn’t create a memory leak despite accessing self. Similar to someClosure, the type of someMethod is also () -> Int. Why doesn’t someMethod create a memory leak when it returns self.someConstant? In other words, what’s different about methods and closures such that methods don’t leak memory but closures do?

Lantua · August 23, 2022, 3:03pm

I think it'd be more accurate to say that getting an address in memory isn't the only thing you do when retrieving a method. Say we create a closure with a captured variable capturedVar:

for capturedVar in 0..<100 {
  let closure: () -> () = { [capturedVar] in ... }
}

The closure may behave differently for each value of capturedVar, so these are effectively 100 different functions. However, you'd hardly want to create 100 different functions, one for each possible value of capturedVar (and sometimes the number of possibilities isn't even known). So what the compiler does instead is create a single non-capturing function and store the captured variable alongside it in a single object. Essentially, it compiles the closure down to

// Not actual Swift Code
struct Closure1 {
  var capturedVar: Int
  var functionPointer: FunctionPointer<(capturedVar: Int) -> ()>
}
for capturedVar in 0..<100 {
  let closure: Closure1 = ...
}

This way, the compiler can reuse the same functionPointer over and over regardless of the value of capturedVar. Augmenting a (function) pointer with additional runtime information like this is usually referred to as a "fat pointer", and it is prevalent in other languages as well. This also explains why a closure retains its captured variables (and may cause reference cycles).
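The same idea can be written by hand in real Swift. This is only a sketch of the concept, not the layout the compiler actually produces: Closure1 mirrors the pseudo-struct above, while sharedBody, call() and the Int return value are assumptions added here so the result can be observed.

```swift
// Hand-written stand-in for a closure value: captured context plus
// a reference to one shared, non-capturing function.
struct Closure1 {
    var capturedVar: Int
    var functionPointer: (Int) -> Int

    // "Calling the closure" means passing the stored context to the function.
    func call() -> Int { functionPointer(capturedVar) }
}

// The single function body shared by every closure value (hypothetical).
func sharedBody(capturedVar: Int) -> Int {
    capturedVar * 2
}

// 100 closure values, but still only one function.
let closures = (0..<100).map {
    Closure1(capturedVar: $0, functionPointer: sharedBody)
}

let results = closures.map { $0.call() }
print(results[3], results[99])   // 6 198
```

Each Closure1 value behaves differently only because of the context it stores. If capturedVar were a class instance, that stored field would be the strong reference that keeps the instance alive.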

The trick here is to realize that a class method is a closure that captures self. You can replace every someMethod in your code with { [car] in car.someConstant } and otherMethod with { [car] in car.someClosure } or { [self] in self.someClosure }, and it'll behave much the same.

class Car {
    init() { print("Car is being initialised") }
    
    let someConstant = 1
    
    lazy var someClosure = { [self] in self.someConstant } // someMethod
    lazy var otherClosure = { [self] in self.someClosure } // otherMethod
    deinit { print("Car is being deinitialised") }
}

print("---------- block 1 ----------")

do {
    let car = Car()
    print(type(of: car.someClosure))
}

print("---------- block 2 ----------")

do {
    let car = Car()
    print(type(of: { [car] in car.someClosure })) // otherMethod
}

print("---------- block 3 ----------")

do {
    let car = Car()
    print(type(of: car.otherClosure))
}

// The following gets printed:

// ---------- block 1 ----------
// Car is being initialised
// () -> Int
// ---------- block 2 ----------
// Car is being initialised
// () -> () -> Int
// Car is being deinitialised
// ---------- block 3 ----------
// Car is being initialised
// () -> () -> Int

So when you're using car.someMethod, you're only creating a closure and invoking it. This is similar to case 1 in my small example, where you only create closure x. If you draw the reference graph, you'll see no cycle. Now when you use car.someClosure, you are creating the "same closure", but you also assign it to car.someClosure, which is what causes the reference cycle (similar to case 2 in my small example).

PS

Class and struct methods in Swift are much closer to global curried functions:

// Conceptually (not actual Swift): a function curried over the instance
Car.someMethod = { car in return { return car.someConstant } }
car.someMethod() // a.k.a. Car.someMethod(car)()

but class methods also have nuances around dynamic dispatch. So neither model accurately captures all the intricacies of class methods. However, for the purpose of analyzing the reference graph, both models work just fine, and I think "a method is a closure that captures self" is easier to work with.
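The curried view can be tried directly, since Swift lets you reference an instance method through its type. A minimal sketch, assuming a trimmed Car with just someConstant and someMethod; the variable names are illustrative only.

```swift
// Trimmed copy of the class from the question.
final class Car {
    let someConstant = 1

    func someMethod() -> Int { someConstant }
}

let car = Car()

// Referencing the method through the type gives the curried, unbound form.
let unbound: (Car) -> () -> Int = Car.someMethod

// Supplying the instance binds it; the result holds on to car.
let bound: () -> Int = unbound(car)

let viaCurry = bound()            // same as Car.someMethod(car)()
let viaMethod = car.someMethod()  // ordinary method call

print(viaCurry, viaMethod)        // 1 1
```

The explicit type annotations are there to show the two shapes: the unbound form takes the instance first, and the bound form is the () -> Int that ends up stored in someClosure.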

rayx · August 23, 2022, 3:03pm

(incorrect explanation)

[quote="abeldemoz, post:9, topic:59821"] In other words, what’s different about methods and closures such that methods don’t leak memory but closures do? [/quote]

Below is how I understand it.

First, note that Car.someMethod is an unbound method. It takes the instance as an argument.

print(type(of: Car.someMethod))
// (Car) -> () -> Int

car.someMethod is a bound method. It receives the instance as the implicit argument.

print(type(of: car.someMethod))
// () -> Int

Also note that in the following line of your code, someMethod is effectively self.someMethod (that is, a bound method).

lazy var someClosure = someMethod

If you call self.someMethod(), the method holds reference to self through its implicit param. Since it's a param, the reference is saved in stack. Once the call completes, the stack is deallocated and the reference is gone.

(EDIT: the above paragraph is incorrect, because when one references a bound method, the compiler automatically generates one, hence there is no "reference in stack" as I described. As noted by Jordan, bound method always captures self.)

In contrast, if you save self.someMethod somewhere (as you do in your code), the compiler has to make sure the bound method can access the object when it's invoked in the future, so the bound method captures[1] the object and hence keeps the reference. That causes a reference cycle.

~~So, closure always captures an instance if you access the instance's properties in its body, but bound method may or may not capture self, depending on how you use it. That explains the difference.~~

[1] See: Swift Regret: Bound Methods // -dealloc.

jrose · August 23, 2022, 5:07pm

A bound method always captures self; that’s the “bound” part (past participle of “bind”). Car.someMethod would be an “unbound method” (edit: as you note), not that it comes up too much.
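One way to observe that a bound method captures self, even with no cycle involved, is to store it outside the instance and watch the instance through a weak reference. This is a sketch under assumptions: Car is a trimmed copy of the class from the question, and observeBoundMethod is a hypothetical helper.

```swift
// Trimmed copy of the class from the question.
final class Car {
    let someConstant = 1

    func someMethod() -> Int { someConstant }
}

// Hypothetical helper: stores a bound method in a local variable
// (not on the instance) and reports the instance's lifetime.
func observeBoundMethod() -> (aliveWhileStored: Bool, value: Int?, aliveAfterRelease: Bool) {
    weak var observer: Car?
    var stored: (() -> Int)?

    do {
        let car = Car()
        observer = car
        stored = car.someMethod   // the bound method escapes this scope
    }

    // car is out of scope, yet the bound method still holds the instance.
    let aliveWhileStored = observer != nil
    let value = stored?()

    // Dropping the bound method drops the last strong reference.
    stored = nil
    let aliveAfterRelease = observer != nil

    return (aliveWhileStored, value, aliveAfterRelease)
}

let outcome = observeBoundMethod()
print(outcome.aliveWhileStored, outcome.aliveAfterRelease)   // true false
```

Because the bound method lives in a local variable rather than in a property of the instance, there is no cycle, and the instance is freed as soon as the bound method is released.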