Deinit not called in certain scenarios with for await loop

somu · April 5, 2023, 6:49am

Hi,

Problem:

There are 2 examples of code, I expected deinit to be called in both the examples:
Example 1 deinit is not being called (unexpected)
Example 2 deinit is being called as expected

Questions:

Why is model not being deallocated in Example 1?

Note: Used a Mac Command line project to test the following

Example 1

import Foundation

@MainActor
class Model {
    
    init() {
        print("init")
        startObservingNotifications()
    }
    
    deinit {
        print("deinit")
    }
    
    private func startObservingNotifications() {
        Task { [weak self] in
            await self?.observeNotifications()
        }
    }
    
    private func observeNotifications() async {
        for await _ in NotificationCenter.default.notifications(named: Notification.Name("hello")) {
        }
    }
}

//Invoking
Task {
    let model = Model()
    try? await Task.sleep(nanoseconds: 2_000_000_000)
    print("ideally should deallocate after this")
}
RunLoop.main.run()

Output for Example 1

init
ideally should deallocate after this

Example 2

import Foundation

@MainActor
class Model {
    
    init() {
        print("init")
        startObservingNotifications()
    }
    
    deinit {
        print("deinit")
    }
    
    private func startObservingNotifications() {
        Task { [weak self] in
            for await _ in NotificationCenter.default.notifications(named: Notification.Name("hello")) {
                self?.doSomething()
            }
        }
    }
    
    private func doSomething() {
        
    }
}

//Invoking
Task {
    let model = Model()
    try? await Task.sleep(nanoseconds: 2_000_000_000)
    print("ideally should deallocate after this")
}
RunLoop.main.run()

Output for Example 2

init
ideally should deallocate after this
deinit

Orup70 · April 5, 2023, 7:55am

My interpretation is that you have a strong reference to self in Example 1:

private func startObservingNotifications() {
        Task { [weak self] in
            await self?.observeNotifications()
        }
    }

Yes, you run a task with a weak reference to self, but when you await on the self?.observeNotifications() function the compiler needs to bind a strong reference to self before calling the function. And since the function is looping forever waiting on notifications, it doesn't return and the strong reference is kept alive.

In Example 2 you await in the loop itself and the weak reference to self is not bound to a strong reference until a notification is received. And because of that, the deinit will be called when the (now) last reference to the model is released.

private func startObservingNotifications() {
        Task { [weak self] in
            for await _ in NotificationCenter.default.notifications(named: Notification.Name("hello")) {
                self?.doSomething()
            }
        }
    }

somu · April 5, 2023, 8:22am

Thanks @Orup70

I am a bit confused, I feel it defeats the purpose of [weak self].
I am not sure if I am missing something.

In Example 2 if I modify to to receive the a notification, it still deallocates the model.

Example 3 (modification of example 2):

    private func startObservingNotifications() {
        Task { [weak self] in
            for await _ in NotificationCenter.default.notifications(named: Notification.Name("hello")) {
                self?.doSomething()
            }
        }
        Task {
            let notification = Notification(name: .init("hello"))
            NotificationCenter.default.post(notification)
        }
    }
    
    private func doSomething() {
        print("do something")
    }

Output of Example 3:

init
do something
ideally should deallocate after this
deinit

Orup70 · April 5, 2023, 10:14am

I know, it’s tricky to reason about weak and strong references.

But in your Example 3, you now send and receive a notification:

private func startObservingNotifications() {
        Task { [weak self] in
/* 1 */     for await _ in NotificationCenter.default.notifications(named: Notification.Name("hello")) {
/* 2 */         self?.doSomething()
            }
        }
        Task {
            let notification = Notification(name: .init("hello"))
/* 3 */     NotificationCenter.default.post(notification)
        }
    }
    
    private func doSomething() {
        print("do something")
    }

At line 1, you await the next notification and the reference to self is weak as you declared in the capture list above.

When you send the notification from the other task at line 3, the notification is received in the for loop and you reach line 2:

/* 2 */         self?.doSomething()

Here the code needs to bind a strong reference to self to be able to call doSomething(). But this strong reference is released directly after the call and you only have a weak reference left to self again.

This means, only for a short moment while a received notification is handled, you have a strong reference to self. The rest of the time, self is weak.

In your code you create a model object with a strong reference in the last task and then sleeps for a while and then release the strong reference to the model object when the task is finished. This is the only strong reference to the model object except for the temporary strong reference in line 2.

If you don’t hold a temporary strong reference in line 2, the reference count will be decreased to 0 and deinit will be called. And even in that case, the reference counter will reach 0 after line 2 and deinit will called anyway.

The difference in Example 1 is that you don’t temporarily bind a strong reference to self, since you call a function on self that never returns and the implicit binding to self is kept alive forever and the reference count will never reach 0:

private func startObservingNotifications() {
        Task { [weak self] in
/* 4 */     await self?.observeNotifications()
        }
    }

In line 4, this is similar to writing:

if let strong_self = self {
    await strong_self.observeNotifications()
}

The strong reference will only be released after returning from the if statement, but the function never returns and the reference to self (or your model object in this case) will never be relased and reach a count of 0.

somu · April 5, 2023, 10:33am

@Orup70 thanks

This could often trip me, bear with me as I try to understand.

So what is the point of Task accepting a capture list when it always will capture strongly?

Orup70 · April 5, 2023, 11:05am

When you use [weak self] in the capture list for Task, you actually do capture self weakly. That's why your own examples 2 & 3 works:

private func startObservingNotifications() {
    Task { [weak self] in
        for await _ in NotificationCenter.default.notifications(named: Notification.Name("hello")) {
            self?.doSomething()
        }
    }
}

The problem is in your first example:

private func startObservingNotifications() {
    Task { [weak self] in
/* 1 */ await self?.observeNotifications()
    }
}

Here you also capture self weakly as specified in the capture list. But on line 1:

/* 1 */ await self?.observeNotifications()

You tell the compiler to implicitly bind a strong reference to self and you call a function that never returns, thus keeping the strong binding alive.

But, the problem here is that your code waits on the function observeNotifications(), instead of just starting up the notification observation loop. Like you do in the other examples.

When you use implicit binding to weak self with self?.function() it's quite easy to miss that you actually do a "weak-strong-weak"-dance with the reference.

I usually, try to avoid this pattern and make an explicit strong reference to self. Makes it easier to understand what's going on when reading the code a few weeks later. At least for me...

somu · April 5, 2023, 11:27am

@Orup70 thanks a lot!!!

Wow there is so much to unpack here, thanks for patiently clarifying the doubts I had.

What you said makes sense, I tried the following and it seems to confirm your explanation.

Example 4 (modification of example 1):

This works as expected.
I completely forgot about async let comes in handy here, hopefully it is ok to use it like this

    private func startObservingNotifications() {
        Task {
            async let _ = observeNotifications()
        }
    }

Output for Example 4

init
ideally should deallocate after this
deinit

somu · April 5, 2023, 11:55am

@Orup70

That last example 4 doesn't hold a strong reference to self however it doesn't observe any notifications.

So actually example 4 is not a viable solution

Following shows that notification is not observed:

    private func startObservingNotifications() {
        Task {
            async let _ = observeNotifications()
        }
        
        Task {
            let notification = Notification(name: .init("hello"))
            NotificationCenter.default.post(notification)
        }
    }

So I suppose example 2 is the only way I can think of

Not sure:

May be compiler feels no one awaits the result of the so it is not worth executing observeNotifications

Orup70 · April 5, 2023, 12:26pm

I guess this is because when you don't use the result of an async let variable/task at the end of the current scope, the task will get cancelled and then be awaited on to be completed before leaving the scope.

So probably the for-await loop is detecting the cancelling of the task and ends the loop. But I'm not completely sure.

Try inserting a print statement after the for-await loop to see if the loop is ended and maybe print Task.isCancelled to verify?

somu · April 5, 2023, 2:20pm

@Orup70 Thanks a lot!

Yeah print definitely helped, the Task.isCancelled is false, the last line of the task is executed.

I suppose it is not awaited by anything so just ends.

Gero · April 5, 2023, 2:41pm

I think example 1 is clear, but for those coming late to the party, let me rephrase that

await self?.observeNotifications()

a bit: Even though the await is in front of the self?, what actually happens is this: The (asynchronous) function observeNotifications() is only called if self is not nil. If it is called, it is awaited, as the keyword implies. That's the moment when self is captured "again". It has to be, otherwise how can the function run?
Basically the line translates to:

if let unwrappedSelf = self {
    await unwrappedSelf.doSomething()
}

In the end, the structured concurrency runtime itself "captures" the object, in a way. The order of the keyword and the optional chaining syntax might look weird, but there's not a realistic different way to do this.

self?. await doSomething()

looks really weird, not to say terrible, doesn't it?

async let would principally work the same, alas, as the function does not return anything, the above mentioned optimization comes into play (i.e. the task is cancelled as nothing is returned and thus nothing further is done). I guess that bites you in the behind, but after all trying to rely on (assumed) side effects is never a good idea.

somu · April 5, 2023, 3:00pm

@Gero Thanks a lot!

I kept thinking [weak self] was a free ride all the way without holding a strong reference ever.

This is more clear if self was nil it would never start and once the member function's (async or non-async) execution starts the runtime (not sure what to call it ;) ) holds a strong reference to self till it completes.

This is in line with what @Orup70 had stated.

Thanks a lot guys really helpful, I am slowly beginning to understand.

Gero · April 5, 2023, 7:41pm

That is totally understandable and as a matter of fact, your example raises an interesting aspect about the syntax (combining await with optional chaining) that I hadn't thought about yet.

Exactly. If self had become nil before the task actually started (well, at least before it got to that line) it would not start (I believe, haven't actually tried it). This is basically impossible in your example, as it is the very first thing the task does, but imagine there was another asynchronous operation happening before that line (perhaps something with completion handlers or an explicit lock, i.e. something that does not use structured concurrency, but "classic" concurrency). Then, I assume, the function would not be started and thus not be awaited and it would all be good.

The order of keyword and optional chaining is unfortunate, but I want to point out that this example is a bit contrived: You have a never returning function there, but tasks are usually supposed to eventually be done in most cases. I don't know if it is a good idea to basically use one (especially one that is not detached) to basically implement some sort of run-loop. And even if you do, having some sort of self reference included somewhere then probably makes little sense.

Still, this is a great illustration of how the runtime, structured concurrency and the dangers of retain cycles play out (and a rare case in which weak self doesn't protect you).

Joe_Groff · April 5, 2023, 10:21pm

I think it's important to point out there's no actual retain cycle here; nothing has a strong reference to a task's context other than the task's own execution. The issue is really that there's a paradox in this kind of code: you're trying to say "run this code on the object until the object goes away", but the code itself is keeping the object alive in order to continue operating on it, so the object never goes away, and the code runs forever. Even if Swift had a cycle-breaking GC, it wouldn't solve the problem (except maybe indirectly, since in a GC language finalizers generally run late if they run at all, so you might be less likely to expect anything from them).

You could take asynchrony out of the equation and imagine similar synchronous code:

weak var foo = someObject
// doSomething to foo until it dies
while let currentFoo = foo {
  currentFoo.doSomething()
}

This similarly keeps a strong reference currentFoo around for almost the entire execution of the loop, except for the brief time between iterations. If it's another thread's responsibility to end the lifetime of someObject, this loop may still observe that end of life later than you'd expect, or miss it entirely, similar to the issue with the async task-based code. For example, if there were a similar while let currentFoo = foo loop running in one or more other threads, you may never reach a moment where all those threads simultaneously let go of their strong references between trips. There's no reference cycle, and arguably no true memory leak since the object is still live, but nonetheless there's a logical "leak" in the way the code is structured.

It's a tricky problem, and definitely one that could serve with some help from the language and/or libraries, since "run a task for an object's lifetime" is a natural thing to want to do. For the time being, though, it's probably more robust to have an explicit task-ending event, rather than try to use the object lifetime to implicitly end observation. One upside is that, if you do have an operation that explicitly ends the task, you don't ever need weak self to try to avoid cycles, since a completed task holds no permanent references to any objects it used during its execution, so everything should clean up nicely without manual intervention.

somu · April 6, 2023, 5:01am

@Joe_Groff Thanks a lot for that detailed explanation.

I have some doubts.

How I got into this situation:

In my real project I have a scenario where the async function f2 needs to be executed in sequence (can't run in parallel) when a notification is received.
So I thought for await would be a good idea (not sure) to solve the problem.
Later I realised for await was never going to end and would prevent the subsequent lines executing.
So I put it inside a task.

Questions:

Would cancelling the task in the deinit be a reasonable solution (as shown below)?
Or there is better way to execute an async function in sequence observing a notification?
In the below code does it make sense to use [weak self] inside the Task? (removing [weak self] causes deinit not to be called). I am assuming task doesn't get completed (because indefinitely waits for the next notification) and therefore doesn't give up the strong reference of self.

Example 5

import Foundation

@MainActor
class Model {
    
    private var task: Task<(), Never>?
    
    init() {
        print("init")
        startObservingNotifications()
    }
    
    deinit {
        print("deinit")
        task?.cancel()
    }
    
    private func startObservingNotifications() {
        task = Task { [weak self] in
            for await _ in NotificationCenter.default.notifications(named: Notification.Name("hello")) {
                await self?.doSomething()
            }
            print("task ended")
        }
    }
    
    private func doSomething() async {
        
    }
}

//Invoking
Task {
    let model = Model()
    try? await Task.sleep(nanoseconds: 2_000_000_000)
    print("ideally should deallocate after this")
}
RunLoop.main.run()

Gero · April 6, 2023, 8:53am

Oh, yes, that was not well phrased on my part. I guess "retain cycles" have kind of become a term often used to describe general unexpected ownerships (which are perceived as memory leaks because you don't realize why the memory is not freed), so people (or at least me) suspect a "cycle" as the first suspect when we see that.

I was aware that's not the case here, as it is not "object owns task and task also owns object".
Rather it is "object starts task, task does not capture object, but awaits the object being done with its doSomething method which never returns and then object is never done". I guess you could say that the execution of the doSomething method itself is what "captures" the object.

In my experience people tend to forget that if a method is defined on a (reference) type, the instance of that type is passed to the method, even if said method does not use it at all. So in this case here while doSomething is running, self (i.e. the model) cannot be released. That's what it means if a function is defined "on" a type.
So the task is not being done (and thus deallocated, I guess, but that's unimportant here) because doSomething runs forever. Then since self is caught "in" doSomething, it is also never released.

This funnily means that the following works:

Example 6 (modified from 1)

@MainActor class Model {
    init() {
        print("init")
        startObservingNotifications()
    }
    deinit {
        print("deinit")
    }

    private func startObservingNotifications() {
        Task {
            print("starting to listen")
            for await _ in NotificationCenter.default.notifications(named: Notification.Name("hello")) {
                print("got noti")
            }
            print("stopping to listen") // we never get here!
        }
    }
}

//Invoking
Task { @MainActor in
    var model: Model? = Model()
    try? await Task.sleep(nanoseconds: 2_000_000_000)
    NotificationCenter.default.post(Notification(name: Notification.Name("hello")))
    try? await Task.sleep(nanoseconds: 2_000_000_000)
    model = nil
    print("ideally should have been dealloced before this, check: \(model == nil)") // check just to silence a warning...
    try? await Task.sleep(nanoseconds: 2_000_000_000)
    NotificationCenter.default.post(Notification(name: Notification.Name("hello")))
    try? await Task.sleep(nanoseconds: 2_000_000_000)
    print("leaving outer task")
}
RunLoop.main.run()

Output for Example 6

init
starting to listen
got noti
deinit
ideally should deallocate after this, check: true
got noti
leaving outer task

Note that the body in the task is now a completely anonymous closure, not tied to Model at all. There is no method that captures the instance as there is no method that is defined on the type used here. This basically lets the task "escape" (colloquially speaking) from the lifetime of the instance of Model.

I wholeheartedly agree. I don't think tasks are a good fit (atm) to basically create something like simple run-loop like, never ending structure. The name kind of implies that, I might add, but this thread shows people can easily be misled.

In my experience structured concurrency is, at the moment, not a good fit if what you want is basically a serial execution queue, especially if the execution items are calls to async functions themselves. I think once we can define our own (serial) custom executors for actors that might be easier, but I am not super sure about that for async functions.

Regarding your questions in general: As long as doSomething is no longer running forever I think that solution is actually quite nice. You properly tie the task's lifetime to the instance, i.e. it gets cancelled when the instance is deallocated. I don't see why that would be a problem, other than perhaps writing a unit test to ensure nobody else eventually messes with that. If that task?cancel() ever gets removed you in effect create a forever listening task that does nothing with the notifications it receives once the creating instance is deallocated. But those things are a dime a dozen in coding...

I am perhaps not a pro about designing things with structured concurrency, so I am curious what others say.

danhalliday · April 6, 2023, 11:21am

FWIW I’ve also just been through this same process of coming to understand why my use of Tasks was completely broken and causing objects to not get released.

My use case is view models in an iOS app which use long-running tasks to track the state of dependencies, and my solution is to take advantage of SwiftUI’s .task lifecycle helper. This is a straightforward way to get all long-running tasks cancelled at a definite point in time, after which all references to the model are gone and it can be released after its view disappears. I gave some more thoughts here:

somu · April 6, 2023, 12:21pm

@danhalliday Thanks a lot! That is a really good idea to use SwiftUI's .task { } modifier, keeps it clean.

I didn't have to explicitly cancel the task, it gets cancelled when the view is dismissed.

JaanusK · May 21, 2023, 8:43am

I just ran into this, and wanted to share a recipe that works for me. I have a SwiftUI View with the associated ViewModel that’s owned by the view as a StateObject. I want the viewModel to be able to react to a bunch of notifications with the modern async API and long-running AsyncSequences, and for everything to be deallocated correctly when the view ends. I don’t want to code anything by hand and want things to happen automatically with the highest-level code possible.

Here’s what I came up with, that seems to work.

class MyViewModel: ObservableObject {
  deinit {
    // just for observability that it’s released correctly
    print("MyViewModel deinit")
  }
}

struct MyView: View {
  @StateObject private var myViewModel: MyViewModel()

  var body: some View {
    WhateverView()
    .task {
      for await _ in NotificationCenter.default.notifications(named: .notification1) {
        myViewModel.doSomething1()
      }
    }
    .task {
      for await note in NotificationCenter.default.notifications(named: .notification2) {
        if let receivedObject = note.object as? SomeObject {
          await myViewModel.ingestObject(receivedObject)
        }
      }
    }
  }
}

This seems to work correctly. I can start as many long-running listener tasks as I want, and they are all torn down and the view model object is deallocated when the view is closed which automatically cancels the tasks.