For await loop wrapped in a Task { } block

Hi,

Overview

  • When for await loop is used, it runs (waits) indefinitely.
  • Tasks taskRed, taskGreen, taskBlue need to execute sequentially

Questions:

  1. Is it reasonable to wrap for await loop in a Task { } block so that other functions / code can execute?
  2. Or is there a better way to handle this (sample code below)?

Sample Code:

@MainActor
class Model: ObservableObject {
    let redNotificationName = Notification.Name("red")
    let greenNotificationName = Notification.Name("green")
    let blueNotificationName = Notification.Name("blue")
    
    var red = 0
    var green = 0
    var blue = 0
    
    init() {
        Task {
            for await _ in NotificationCenter.default.notifications(named: redNotificationName) {
                await taskRed()
            }
        }
        
        Task {
            for await _ in NotificationCenter.default.notifications(named: greenNotificationName) {
                await taskGreen()
            }
        }
        
        Task {
            for await _ in NotificationCenter.default.notifications(named: blueNotificationName) {
                await taskBlue()
            }
        }
    }
    
    func postRed() {
        NotificationCenter.default.post(name: redNotificationName, object: nil)
    }
    
    func postGreen() {
        NotificationCenter.default.post(name: greenNotificationName, object: nil)
    }
    
    func postBlue() {
        NotificationCenter.default.post(name: blueNotificationName, object: nil)
    }
    
    private func taskRed() async {
        print("Red started")
        try? await Task.sleep(nanoseconds: 3_000_000_000)
        red += 1
        print("Red ended = \(red)")
    }
    
    private func taskGreen() async {
        print("Green started")
        try? await Task.sleep(nanoseconds: 3_000_000_000)
        green += 1
        print("Green ended = \(green)")
    }
    
    private func taskBlue() async {
        print("Blue started")
        try? await Task.sleep(nanoseconds: 3_000_000_000)
        blue += 1
        print("Blue ended = \(blue)")
    }
}

Invoking

struct ContentView: View {
    @StateObject private var model = Model()
    var body: some View {
        VStack {
            Button("Red") {
                model.postRed()
            }
            Button("Green") {
                model.postGreen()
            }
            Button("Blue") {
                model.postBlue()
            }
        }
    }
}

In principle, it is fine. However, I have a couple of comments about the given implementation.

The first is that once you create a Task, it keeps running until it has produced a result or is cancelled. From the documentation:

Tasks can start running immediately after creation; you don’t explicitly start or schedule them. After creating a task, you use the instance to interact with it — for example, to wait for it to complete or to cancel it. It’s not a programming error to discard a reference to a task without waiting for that task to finish or canceling it. A task runs regardless of whether you keep a reference to it. However, if you discard the reference to a task, you give up the ability to wait for that task’s result or cancel the task.

Since these tasks are executing an infinite loop, every time a Model object is instantiated, you create and leak three tasks.

Now, SwiftUI has a particular way of managing the lifetimes of observable objects, so it tries to not repeatedly instantiate @StateObject values if your view's identity does not change, but:

  • If your view's identity does change (for example, because it is used in a ForEach), even @StateObject values will be recreated, and
  • Not all code uses SwiftUI's particular way of managing object lifetimes. It is good engineering practice to clean up any resources you allocate, rather than leak them.

We can simulate this. Here, I've set things up really simply - I've renamed your ContentView to ChildView, and created two of them with explicit IDs. There's a button which changes their IDs.

struct ContentView: View {
  @State var firstID = 0
  @State var secondID = 1

  var body: some View {
    VStack {
      GroupBox {
        ChildView()
          .id(firstID)
          .border(.black)
        ChildView()
          .id(secondID)
          .border(.black)
      }

      Button {
        firstID += 2
        secondID += 2
      } label: {
        Text("Change IDs")
      }
    }.padding()
  }
}

struct ChildView: View {
    @StateObject private var model = Model()
    var body: some View {
        VStack {
            Button("Red") {
                model.postRed()
            }
            Button("Green") {
                model.postGreen()
            }
            Button("Blue") {
                model.postBlue()
            }
        }.padding()
    }
}

image

In Model, we print some logging information in init before we create the tasks, and again inside each worker task when it is about to complete.

@MainActor
class Model: ObservableObject {
    // ...

    init() {
      print("Creating tasks")   // <----

      Task {
        defer { print("Worker task ending") }   // <----
        for await _ in NotificationCenter.default.notifications(named: redNotificationName) {
          await taskRed()
        }
      }

      // ...
    }
}

When I launch the application, I find that it prints "Creating tasks" twice - one for each Model object. That's expected. And when I click the button to change IDs, it prints "Creating tasks" twice again - because that's how @StateObject works. But - here's the critical thing - we never see "Worker task ending". Even worse, if we change IDs a couple of times to create a bunch of tasks, then click "Red" one time, all of those tasks (which are still running) get triggered and all print to the console.

image

So how do we fix this? The last sentence of the documentation snippet tells us - we need Model to keep references to the tasks it creates in init, so it can cancel them in its deinit and stop them living forever:

@MainActor
class Model: ObservableObject {
    let redNotificationName = Notification.Name("red")
    let greenNotificationName = Notification.Name("green")
    let blueNotificationName = Notification.Name("blue")

    var red = 0
    var green = 0
    var blue = 0

    var tasks = [Task<Void, Never>]()    // <----

    init() {
      print("Creating tasks")

      tasks.append(    // <-----
        Task {
          defer { print("Worker task ending") }
          for await _ in NotificationCenter.default.notifications(named: redNotificationName) {
            await taskRed()
          }
        }
      )

      // ... same for the other 2 tasks
  }

  deinit {
    tasks.forEach { $0.cancel() }    // <----
  }
}

What is interesting is that even if you run this, our worker tasks still won't get cleaned up. What gives?

Basically, what is happening is that in order for our tasks to access properties such as redNotificationName, or invoke callbacks such as taskRed(), they need to keep the instance of Model alive which they belong to. But then, that instance of Model also has a reference to the worker task. In other words, we have a retain cycle.

Typically, Swift guards against retain cycles by having you explicitly write self in order to access this data which prolongs the object's lifetime, but Task { ... } has a special exception which allows it to omit that. The reason is that Task { ... } is intended to be used with actors; when you call it from within an actor, it inherits the actor's context, and in that particular situation no lifetime extension actually occurs.

But in this case, we're not calling it from within an actor, and so it does extend the lifetime of our Model instance and create a retain cycle. It is discussed a bit more in this thread: Task.detached doesn’t allow implicit self

The solution is to use Task.detached { ... } instead of plain Task { ... }. If we make that one tiny change, the compiler suddenly and requires us to be explicit when we extend the lifetime of the Model object:

tasks.append(
  Task.detached {
    defer { print("Worker task ending") }
    for await _ in NotificationCenter.default.notifications(named: redNotificationName) {
    //                                                             ^
    // ERROR: Reference to property 'redNotificationName' in closure requires explicit use of 'self' to make capture semantics explicit
    
      await taskRed()
      //    ^
      // ERROR: Implicit use of 'self' in closure; use 'self.' to make capture semantics explicit
    }
  }
)

In general, explicit self is a really nice feature of Swift, and it's a bit of a shame that the compiler lets us make this mistake so easily.

In order to break the cycle, we can use a weak or unowned reference to self. Since the task will be cancelled in the object's deinit, this code should never even try to access properties self after the object's lifetime has ended, so I'm going to use an unowned reference:

tasks.append(
  Task.detached { [unowned self] in
    defer { print("Worker task ending") }
    for await _ in NotificationCenter.default.notifications(named: redNotificationName) {
      await taskRed()
    }
  }
)

And that's it! We're done. Now if we click the "Change IDs" button, we can see that 2 new Model objects are created, which causes the 2 previous objects to be destroyed (there's no retain cycle keeping them alive any more), which causes them to each cancel their 3 worker tasks. We're not leaking any tasks any more.

image

And as a final test - if we click "Change IDs" a bunch of times to remake the Model objects, then click "Red", we can see that there are only 2 worker tasks alive listening to notifications (one per Model object):

image

To summarise, 2 changes were needed:

  1. Hold on to the Task references in an array, use them to cancel the tasks in deinit

  2. Create the tasks using Task.detached { ... } rather than Task { ... }, and use weak/unowned captures of self to avoid the retain cycle

11 Likes

@Karl Thanks a lot for that detailed example.

Edited:

Given below are my thoughts (I could be wrong):

  • Yes as you rightly pointed out I noticed in my real project the @StateObject was getting initialised multiple times and therefore multiple calls to observation.

  • I feel we can still use Task { } just that I definitely didn't realise the self being captured strongly because Task { } doesn't syntactically require us to mention self.

  • I suppose with Task.detached we are forced to think about the type of ownership

  • deinit does get called even when Task { [weak self] in } is being used

Summary of my understanding (could be wrong):

  • As long as self is captured weakly in Task { [weak self] in } it is fine.
  • We don't need to explicitly cancel tasks, when Model gets deallocated the tasks it created would also be cancelled (this is what I think, based on testing) as long as there are no strong references in for await loop

thanks a lot!

Even though the closure passed to Task { ... } can create implicit strong references to self, it doesn't prevent you using a weak/unowned reference instead. All closures allow you to define a capture list.

It's just that other closures don't allow those implicit strong references to self. Using Task { ... } makes it easier to write accidental retain cycles. If you want to take that risk, that is up to you.

However, there is another difference between the two ways of creating a Task. SE-0304 Structured Concurrency explains it in a little more detail:

  • Task { ... } creates an unstructured task, which inherits priority, actor context, and task-local values.
  • Task.detached { ... } creates a detached task, which is a type of unstructured task that does not inherit any context.

Let's say the Model initialiser was called on a high priority task (which it may be, because it is constructed by SwiftUI while invoking the body property) - if our worker tasks were plain unstructured tasks, they would inherit that high priority, while detached tasks would not. Indeed, if you add:

print(Task.currentPriority)

to the worker tasks, you will see that the plain unstructured version prints TaskPriority(rawValue: 25), which is .userInitiated - the highest level. The detached version prints TaskPriority(rawValue: 21), which is .default/.medium.

I think a detached task is more appropriate - there is no reason for these workers to inherit context from the Model initialiser. They are not semantically part of the same task which creates the Model object, so they shouldn't inherit its priority or inherit all of its task-local values.

That is not correct. See the documentation snippet I linked to above. Here is the same information expressed in the structured concurrency proposal:

Tasks run to completion even if there are no remaining uses of their task handle, so it is not necessary to retain the task handle or observe its value for the task to complete. However, the task handle can be used to explicitly cancel the operation

In the case of a notification listener, the task will never complete on its own. It does not calculate a result; it just listens for notifications indefinitely, and the only way to stop the task is to cancel it (the notification AsyncSequence will react to this cancellation event by finishing iteration, letting the for await loop complete only after cancellation). Therefore, you need to keep its task handle around, so you can send that cancellation signal.

I don't know what you're seeing, but allow me to take a guess - have you captured a weak self rather than an unowned self? And so inside of you for await loop, you call self?.taskRed()?

If so, after the model objects are destroyed, self will be nil, the calls to taskRed will not succeed - but those tasks are actually still leaked; they're still alive, still processing notifications. If that is what you're doing, add some logging inside the for await loop, before the call to taskRed, and you should see that the tasks are still alive even after the captured self has become nil.

Hope that helps.

3 Likes

@Karl thank you so much for making it clear

I definitely have some misconceptions about concurrency, so it is nice to get it slowly get them clarified.

I am not sure what has happening in my real project, I will try to isolate code and test again

I did a small POC, you were right the task doesn't get cancelled when the model gets deallocated

class Model {
    
    let task: Task<Void, Never>
    init() {
        task = Task {
            for await _ in NotificationCenter.default.notifications(named: .init("hello")) {
            }
            print("task has completed")
        }
    }
    
    deinit {
        print("deinit model")
//        task.cancel() //This is mandatory for the task to be cancelled
    }
}

do {
    let model = Model()
    
    for _ in 1...100000 {}
}

//Just give some more time for the task to be cancelled
for _ in 1...100000 {}

print("end")

Thanks a lot!!!

1 Like

After doing this a few times I noticed a pattern so came up with this (not sure if it is a good idea or not):

import Combine

public extension AsyncSequence {
    func sink(receiveValue: @escaping ((Element) async -> Void)) -> AnyCancellable {
        let task = Task {
            for try await element in self {
                await receiveValue(element)
            }
        }
        
        return AnyCancellable {
            task.cancel()
        }
    }
}
cancellable = NotificationCenter.default.notifications(named: .init("hello")).sink {
    // Handle notification
}

As an alternative to starting 3 unstructured tasks, which I prefer to keep to a minimum, I find nicer to start only 1 (needed to get an async context) and then use a task group. Of course that means you need to be a bit careful with cancellation but my brain understand it better ^^

Another point of discussion is the fact that this is very tricky to test since Task scheduling is not deterministic so what often happens is that if you instantiate your object on tests, and then send a notification, tests will fail because the tasks didn't start yet and thus nothing was listening for the notifications. I hope Swift has testing tools soon cause it's a bit of a nightmare right now.

3 Likes

To add to @Alejandro_Martinez comment, be sure to check out Apple's swift-async-algorithms package. It is still under development but provides some features that might come in handy.

For the usecase in this thread, merge(_:...) might be what you are looking for. If you explore its implementation, i.e., MergeStorage.swift, you'll see that they implemented it by wrapping a Void.self throwing task group in an unstructured Task block to manage the lifespans of 3 asynchronous sequences.

1 Like