Long-term solution for accidental retain cycles from strong references in closures

moreindirection · January 14, 2025, 6:49pm

In general, I'm a huge fan of Swift. I really appreciate its goal of being a safe language. Swift does its best to make sure programs that pass the compiler won't unsafely access memory, have data races or crash from null pointer exceptions.

There's one exception to this: capturing self in closures. It's very easy, when writing code quickly, to leave out [weak self] at the top of the closure. If you do make this mistake, your code will compile perfectly, and you'll never find out about your mistake - unless the Leaks profiler hits that bit of code, or your users are running the app long enough that it causes a memory leak.
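For readers who want the failure mode spelled out, here is a minimal sketch. `Ticker` and `Screen` are hypothetical types invented for illustration, and the `captureWeakly` flag exists only so both variants fit in one example.

```swift
// Hypothetical types for illustration; none of these names come from a real API.
final class Ticker {
    // A stored closure: whatever it captures lives as long as the Ticker does.
    var onTick: (() -> Void)?
}

final class Screen {
    let ticker = Ticker()
    var ticks = 0

    init(captureWeakly: Bool) {
        if captureWeakly {
            // Weak capture: the closure no longer keeps Screen alive.
            ticker.onTick = { [weak self] in self?.ticks += 1 }
        } else {
            // Strong capture: Screen -> ticker -> onTick -> Screen.
            // This compiles without any warning.
            ticker.onTick = { self.ticks += 1 }
        }
    }
}

var leaky: Screen? = Screen(captureWeakly: false)
weak var leakyRef = leaky
leaky = nil
let leakyStillAlive = (leakyRef != nil)   // true: the cycle keeps it alive

var fixed: Screen? = Screen(captureWeakly: true)
weak var fixedRef = fixed
fixed = nil
let fixedStillAlive = (fixedRef != nil)   // false: nothing else retained it
```

Both variants compile identically, which is the complaint: nothing at build time distinguishes the leaking version from the fixed one.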

Every time I find a memory leak and it's just because I forgot to weakly capture self, I shake my head. It's a silly mistake and there's no need for this to happen, especially in a language that's so safe in all other ways.

I'd like to see Swift improve on this in future versions. Are there any proposals being discussed that might improve this?

The reality for me (and probably for many other developers) is that strongly capturing self is wrong most of the time. I almost always want a weak reference to self in my closures. I know it's too late for this, but I almost wish weak captures were the default, and I had to specify only when I need a strong reference.

I'm not a language designer, so I don't have any concrete suggestions, but I would love to see more tools for dealing with this in Swift.

David_Smith · January 14, 2025, 6:54pm

Unfortunately this isn't viable, empirically. When Swift's optimizer improved a few years ago, we discovered that almost every existing codebase incorrectly over-used weak references and would fail if their actual lifetime rules were enforced. The entire language's lifetime system had to change to compensate. The failures typically looked like a closure that was expected to do X doing nothing instead because the reference it expected to use had disappeared.

Morten_Bek_Ditlevsen · January 14, 2025, 7:25pm

I’m not certain what the solution could entail, but for me too, retain cycles due to self-capturing closures (that are in turn being kept alive by self) are one of the hardest problems in Swift.

Our app receives realtime updates to values from a remote database.
This is currently modeled with subscriptions in RxSwift, but the issue is of course also present with Combine - or even with iteration over async streams using for await.

One thing that improves our situation is using the Observable macro together with view models for any ‘signal’ that only updates UI.

But not everything fits into that bucket and our app will continue to have a lot of code where we need to react to signals of changing values, which has precisely the shape where it’s way too easy to introduce retain cycles.

My best hope is some kind of static analysis that could spot the cycles in the code, but I don’t know if such a thing could be possible.

Joe_Groff · January 14, 2025, 9:46pm

Weak references are if anything overused as a quick fix for reference cycles, and each weak reference brings with it added complexity, since your program logic now needs to account for the nil possibility, and that also tends to be handled in "quick fix" ways like guard let x else { return } that are often but not always adequate. It is actively harmful to use weak references in places where they aren't needed, so it would be a bad idea to ever introduce them by default.

I think it would be good to consider practices that reduce the likelihood of cycles becoming permanent memory leaks that don't require the use of weak references. While they will still have their place, in many cases there are more robust alternatives:

Explicitly maintaining the lifetime of callback closures can prevent these closures from producing permanent reference cycles. For instance, the implementation of APIs that take single-shot callbacks can explicitly discard those callbacks after they've been used, by resetting the closure to nil or an empty closure. Even for APIs with multi-shot callbacks, there is usually in practice some explicit event that ends the need for the callback (operation being canceled, UI element being hidden or closed, etc.), and callback closures can be released explicitly in response to this event rather than rely on object death to eventually clean them up.
On the closure's side, it is often possible to capture only the parts of self that are needed rather than self in its entirety in order to avoid forming a cycle. If the closure needs access to the value of some immutable fields, it can capture those fields directly. If it needs to share some state with self, that state could be placed in a separate object that doesn't also own the closure.
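The first suggestion, discarding a single-shot callback once it has been used, might look roughly like the sketch below. `Download` and `Controller` are hypothetical names, and the code assumes everything runs on one thread.

```swift
// Hypothetical one-shot operation; the names are illustrative only.
final class Download {
    private var completion: ((String) -> Void)?

    func start(completion: @escaping (String) -> Void) {
        self.completion = completion
    }

    // Called once when the (simulated) work finishes.
    func finish(with result: String) {
        // Move the closure out of storage before calling it, so the stored
        // reference (and everything the closure captured) is dropped.
        let callback = completion
        completion = nil
        callback?(result)
    }
}

final class Controller {
    let download = Download()
    var received: [String] = []

    func load() {
        // Strong capture on purpose: Controller -> download -> completion -> Controller.
        // The cycle is temporary because finish(with:) clears the closure.
        download.start { result in self.received.append(result) }
    }
}

var controller: Controller? = Controller()
weak var controllerRef = controller
let download = controller!.download
controller?.load()
controller = nil
let aliveWhilePending = (controllerRef != nil)   // true: the pending callback keeps it alive
download.finish(with: "payload")
let aliveAfterFinish = (controllerRef != nil)    // false: the callback ran and was discarded
```

The object stays alive exactly as long as the work is pending, without a weak reference and without a nil check inside the callback.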

It isn't always possible to do these things instead, and they definitely require more effort than slapping [weak self] on a closure, but not relying on weak references can lead to overall more robust code.
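As a rough illustration of capturing only the parts of self that the closure needs, the sketch below uses hypothetical `Panel`, `Button`, and `TapLog` types. The shared state lives in a separate object that does not own the closure.

```swift
// Hypothetical types for illustration only.
final class TapLog {               // shared state; owns no closures
    var entries: [String] = []
}

final class Button {               // stand-in for anything that stores a handler
    var onTap: (() -> Void)?
}

final class Panel {
    let title: String
    let log = TapLog()
    let button = Button()

    init(title: String) {
        self.title = title
        // The capture list copies the immutable title and the shared log object.
        // self is never captured, so there is no Panel -> button -> closure -> Panel cycle.
        button.onTap = { [title, log] in
            log.entries.append("\(title) tapped")
        }
    }
}

var panel: Panel? = Panel(title: "Settings")
weak var panelRef = panel
let tapButton = panel!.button
let sharedLog = panel!.log
panel = nil
let panelFreed = (panelRef == nil)   // true: the installed handler does not retain Panel
tapButton.onTap?()                   // the handler still works on the state it captured
```

Note that the handler keeps running after the panel is gone, so this only fits cases where that is the intended behavior.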

ksluder · January 14, 2025, 10:09pm

There are practical issues with implementing these alternatives.

This requires atomically tracking the state of whether the closure has fired. This is difficult to implement correctly/efficiently in Swift, because simply nilling out the closure variable once it starts executing is not enough. First of all, assignment in Swift is not atomic. But even if it were, I don’t believe the language guarantees a closure lives while it executes, so simply nilling out the storage holding the closure might free its captures. You would need to atomically swap it into strong storage that lives until the end of the closure’s execution:

class Foo {
  var callback: ((Foo) -> Void)?
  private func notifyCallback() {
    var cb: ((Foo) -> Void)? = nil
    // atomicallySwap is a hypothetical helper, not an existing API
    atomicallySwap(&cb, &callback)
    guard let cb else { return }
    cb(self)
  }
}

@David_Smith brought up the case where the callback doesn’t fire as expected, but this solution presents the opposite problem where most of the callback’s work is useless if nobody else has a strong reference to self. If the closure strongly captures some of self’s ivars, these pieces of the object graph can malfunction when the thing that was holding them together goes away. For example, if I capture a Subject owned by self, I can still push values through it. Now listeners are getting signaled by a half-deconstructed object subgraph that may no longer be obeying its own invariants. It might be possible for the closure to capture a Publisher that indicates whether the upstream source has been invalidated, but at the language level this is only correctly modeled by atomic weak references.

Joe_Groff · January 14, 2025, 10:26pm

ksluder:

This requires atomically tracking the state of whether the closure has fired. This is difficult to implement correctly/efficiently in Swift, because simply nilling out the closure variable once it starts executing is not enough. First of all, assignment in Swift is not atomic. But even if it were, I don’t believe the language guarantees a closure lives while it executes, so simply nilling out the storage holding the closure might free its captures. You would need to atomically swap it into strong storage that lives until the end of the closure’s execution:
class Foo {
  var callback: ((Foo) -> Void)?
  private func notifyCallback() {
    var cb: ((Foo) -> Void)? = nil
    // atomicallySwap is a hypothetical helper, not an existing API
    atomicallySwap(&cb, &callback)
    guard let cb else { return }
    cb(self)
  }
}

An escaped Swift closure receives its context by borrow, so the context will be kept alive by the closure's execution. If the callback can be triggered from multiple locations, then yeah, you'd have to synchronize access while nilling it out (which Mutex should make a lot more straightforward now).
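A minimal sketch of that synchronization, with the caveat that it swaps in Foundation's `NSLock` rather than `Mutex` so it does not depend on the newest toolchain. The overall shape should carry over. `OneShot` is a hypothetical name.

```swift
import Foundation

// Hypothetical one-shot holder. NSLock stands in for Mutex in this sketch.
final class OneShot {
    private let lock = NSLock()
    private var callback: (() -> Void)?

    init(_ callback: @escaping () -> Void) {
        self.callback = callback
    }

    // Can be called from several places: only the first caller gets the closure.
    func fire() {
        lock.lock()
        let cb = callback      // take a local strong reference...
        callback = nil         // ...and clear the storage while holding the lock
        lock.unlock()
        cb?()                  // run outside the lock so user code never executes under it
    }
}

var fireCount = 0
let shot = OneShot { fireCount += 1 }
shot.fire()
shot.fire()   // the second call finds nil and does nothing
```

The local `cb` keeps the closure and its captures alive for the duration of the call, and they are released once `fire()` returns.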

ksluder:

@David_Smith brought up the case where the callback doesn’t fire as expected, but this solution presents the opposite problem where most of the callback’s work is useless if nobody else has a strong reference to self. If the closure strongly captures some of self’s ivars, these pieces of the object graph can malfunction when the thing that was holding them together goes away. For example, if I capture a Subject owned by self, I can still push values through it. Now listeners are getting signaled by a half-deconstructed object subgraph that may no longer be obeying its own invariants. It might be possible for the closure to capture a Publisher that indicates whether the upstream source has been invalidated, but at the language level this is only correctly modeled by atomic weak references.

Yeah, this sort of thing is also a common problem in reactive frameworks in GC languages, which is one reason they often encourage developers to overlay deterministic disposal mechanisms over their usage so that the observations are torn down systematically rather than being left to the whims of the GC to slowly decay away. Keying the teardown of a callback chain on an explicit event rather than relying on object lifetimes still seems like it could help avoid this sort of situation.
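One possible shape for keying teardown on an explicit event is sketched below. `Feed` and `LiveView` are hypothetical, the integer token is just one way to identify a subscription, and the code assumes single-threaded use.

```swift
// Hypothetical multi-shot event source; names are illustrative only.
final class Feed {
    private var handlers: [Int: (Int) -> Void] = [:]
    private var nextID = 0

    // Returns a token that identifies the subscription.
    func subscribe(_ handler: @escaping (Int) -> Void) -> Int {
        nextID += 1
        handlers[nextID] = handler
        return nextID
    }

    // The explicit teardown event: drops the closure and everything it captured.
    func unsubscribe(_ token: Int) {
        handlers[token] = nil
    }

    func publish(_ value: Int) {
        for handler in handlers.values { handler(value) }
    }
}

final class LiveView {
    private let feed: Feed
    private var token: Int?
    private(set) var seen: [Int] = []

    init(feed: Feed) { self.feed = feed }

    func appear() {
        // Strong capture on purpose: while visible, the view should keep receiving values.
        token = feed.subscribe { value in self.seen.append(value) }
    }

    func disappear() {
        if let token { feed.unsubscribe(token) }
        token = nil
    }
}

let feed = Feed()
var view: LiveView? = LiveView(feed: feed)
weak var viewRef = view
view?.appear()
feed.publish(1)
let seenWhileVisible = view?.seen ?? []
view?.disappear()      // the explicit event ends the subscription
view = nil
let freedAfterDisappear = (viewRef == nil)
```

If `disappear()` is never called, the feed keeps the view alive, so this trades the weak reference for a discipline of always pairing the two calls.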

ksluder · January 14, 2025, 10:28pm

It still seems strange to explicitly discourage relying on something implemented atomically in the language in favor of manually implementing an atomic signal yourself.

JuneBash · January 14, 2025, 11:02pm

I actually pretty strongly disagree with the notion that using weak self by default is bad. In my work, I have rarely come across problems with overusing weak self, but frequently come across problems where self should have been made weak. YMMV, but I feel pretty strongly on this one. Mostly it's something like this:

class TableVC: UITableViewController {
  lazy var dataSource = UITableViewDiffableDataSource<SectionID, ItemID>(
    tableView: tableView
  ) { tableView, indexPath, itemID in 
    let stuff = self.stuff(for: itemID) // <- retain cycle!
    return tableView.dequeue//blahblahblah
  }
}

Adding a [weak self]/guard let self here is a pretty easy fix to what could potentially become a pretty gnarly memory leak in a long-running application like I work on.

On the opposite side of things, at worst, I've seen something like this:

server.stuff.getThings { [weak self] result in 
  guard let self else { return }
  do {
    self.doThings(with: try result.get())
  } catch {
    self.showBanner(for: error)
  }
}

Usually the only way self would be nil here is if the user cancelled out of the screen where this is. So either they decided they didn't want to doThings and that doesn't happen (...good?), or at worst they don't see that an error occurred when they were trying to fetch stuff. No biggie either way.

moreindirection · January 14, 2025, 11:09pm

I totally agree. I can't really think of a time I've had a problem from using a weak reference in a closure, but I've definitely had many annoying retain cycles over the decade or so I've been using Swift.

Joe_Groff · January 14, 2025, 11:28pm

The only atomically safe thing the weak reference gives you "for free" is the transition to nil when the object gets destroyed, so if the callback can be triggered from multiple threads, then it seems like you would already need synchronization for the assignment operation that binds the callback, and if it's really supposed to be a one-shot callback, the one-shot state would need to be tracked atomically as well.

To be clear, I didn't say "never use weak references". If it is correct that the callback operation should not keep the object alive, and to abandon the work when the object isn't alive, then weak references will get the job done, and the Kits definitely have lots of situations where that is appropriate. I do want to push back against the idea that it's the only correct thing to do, and that developers should understand what it means to add a weak reference rather than reflexively do so every time they have a memory leak problem.

David_Smith · January 14, 2025, 11:56pm

It's worth noting that in the situation I mentioned, the misbehavior was latent until there was a compiler change that exposed it. So the developers of the apps in question never saw the problems and never realized their code was incorrect.

sveinhal · January 16, 2025, 1:25pm

What are your most common use cases? My experience is opposite of yours, and I'm glad that strong capture is the default. In my experience most closures are short-lived and/or fire-once closures. This applies to async operation completion handlers, animation blocks, transformers, alert button handlers, queue hopping, etc. In all of those cases the closure is stored briefly and then released when done.

I guess Combine has made long-lived multi-invocation closures more prominent, but my experience is still that those are the minority of closures.

And more importantly: In those cases, I want to clearly think about life cycle management anyways, and explicitly deal with nilled references.

moreindirection · January 17, 2025, 2:58pm

Well, I should constrain my statement a bit: I think weak capture should be the default only for @escaping closures. Obviously, for a non-escaping closure like strings.map { $0.count }, that should be a strong capture and won't cause any problems because the closure doesn't hang around.
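To make the distinction concrete, here is a small sketch with a hypothetical `Labels` class. The non-escaping closure passed to `map` is gone when `map` returns, while the stored closure keeps its capture.

```swift
// Hypothetical type for illustration only.
final class Labels {
    let strings = ["a", "bb", "ccc"]
    let padding = 1
    var stored: [() -> Int] = []

    // Non-escaping: map runs the closure and returns; nothing is retained afterwards.
    // The compiler does not even require an explicit `self.` here.
    func paddedLengths() -> [Int] {
        strings.map { $0.count + padding }
    }

    // Escaping: the closure is stored, so the capture matters and `self` must be spelled out.
    func storeClosure() {
        stored.append({ self.padding })   // self -> stored -> closure -> self: a cycle
    }
}

var a: Labels? = Labels()
weak var aRef = a
let lengths = a?.paddedLengths() ?? []
a = nil
let freedAfterMap = (aRef == nil)      // true: the map closure left nothing behind

var b: Labels? = Labels()
weak var bRef = b
b?.storeClosure()
b = nil
let freedAfterStore = (bRef == nil)    // false: the stored closure retains the object
```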

But there are lots of examples of closures that are stored and could cause retain cycles: many kinds of event handlers in UIKit or SwiftUI; NotificationCenter blocks; etc.

spacecrafter3d · January 18, 2025, 1:46am

The ease of accidentally creating retain cycles has similarly been my #1 frustration with Swift, as I also try to avoid unnecessary weak self. A specific case where it bit me was when trying to implement RAII-like resource management (my resources wouldn't get cleaned up because of an accidental retain cycle). I posted about it here and learned about noncopyable types for which the compiler makes you be explicit about memory management. This was really exciting to me. I'm sure it won't solve every case, but in some cases, you don't need to rely on ARC if you make your types noncopyable and are explicit about ownership. It led me to start learning Rust so I can better understand that memory management model.
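A minimal sketch of the noncopyable approach, assuming a toolchain that supports `~Copyable` (Swift 5.9 or later). `Connection` and `EventLog` are hypothetical names, and the log exists only to record the order of events.

```swift
// Hypothetical types for illustration only.
final class EventLog {
    var entries: [String] = []
}

struct Connection: ~Copyable {
    let name: String
    let log: EventLog

    init(name: String, log: EventLog) {
        self.name = name
        self.log = log
        log.entries.append("open \(name)")
    }

    func send(_ message: String) {
        log.entries.append("send \(message)")
    }

    // Runs once, when the single owner of the value is done with it.
    deinit {
        log.entries.append("close \(name)")
    }
}

func useConnection(log: EventLog) {
    let connection = Connection(name: "db", log: log)
    connection.send("ping")
    // No copies can exist, so cleanup has happened by the time this function returns.
}

let eventLog = EventLog()
useConnection(log: eventLog)
```

Because the value cannot be copied, there is no reference count to get wrong and no way for a stray closure capture to extend its lifetime silently.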

sveinhal · January 20, 2025, 11:22am

All of my listed examples are @escaping closures. E.g. a button handler attached to an alert. It needs to be escaping, because it needs to be stored until the user chooses "ok" or "cancel". But since the alert is app-modal, and both closures are released the second the user chooses one of them, the closure leaks nothing.

Same goes for animation blocks. They need to escape to be put into the next render cycle or whatever, but they are released momentarily. Moving stuff onto a background thread, same thing. Completion handlers must also be @escaping, because they must be stored until the operation completes. But they leak nothing, given that the operation will complete sometime.

The only cases where these may leak are when they are @escaping, stored off, and kept around indefinitely (or for the entire lifetime of the holder).

These cases certainly exist, but in my experience these cases are in the minority, not the majority.

ktraunmueller · January 20, 2025, 11:42am

I think closures should not affect the lifetime of the captured objects (at least when we are talking about button handler or animation completion closures).

Because of this, I write weak on any captured reference by default.

Whenever a strongly captured reference is needed to make the code work correctly (in cases like this), I think something is wrong with the ownership structure of the code.