[Pitch] Capturing values in async for loops

BrentM · July 22, 2023, 7:21pm

A common issue I’ve run into when working with long running and indefinite async sequences is that it’s very easy to cause a retain cycle.

let’s say your service is watching a notification

let stream = NotificationCenter.default.notifications(named: myNotificationName, object: nil)

today, my solution to this issue has been to do the following:

for try await event in stream {
    weak var weakSelf = self

    guard let weakSelf,  !Task.isCancelled else {
        return
    }

    weakSelf?.handleEvent(event)
}

This is kind of ugly.

It would be nice if I could weakly capture self like you can in a closure:

for try await event in stream { [weak self] in

    guard let self,  !Task.isCancelled else {
        return
    }
    
    self.handleEvent(event)
}

What are everyone’s thoughts on this? Maybe a capture group isn’t the best syntax. As we’re dealing with a loop instead of a closure.

An alternative I though of is allowing the use of the where keyword to escape the loop if the value is nil:

for await event in stream where weak self != nil {
    !Task.isCancelled else {
        return
    }
   handleEvent(event)
}

Or a new capturing keyword (which would have to be placed after any where clauses you might have):

for await event in stream capturing weak self {
    guard let self,  !Task.isCancelled else {
        return
    }
    self.handleEvent(event)
}

tera · July 23, 2023, 2:41am

Shouldn't self be captured weakly outside? E.g. like so:

    Task { [weak self] in
        for try await event in stream {
            guard let self,  !Task.isCancelled else {
                return
            }
            self.handleEvent(event)
        }
    }

BTW, don't you think this is easier?

    NotificationCenter.default.addObserver(forName: myNotificationName, object: nil, queue: nil) { [weak self] event in
        guard let self else { return }
        self.handleEvent(event)
    }

ibex10 · July 23, 2023, 1:34pm

tera:

BTW, don't you think this is easier?

    NotificationCenter.default.addObserver(forName: myNotificationName, object: nil, queue: nil) { [weak self] event in
        guard let self else { return }
        self.handleEvent(event)
    }

I like the simplicity.

By the way, can self ever become nil after the check?

guard let self else { return }
self.handleEvent (event) // can self ever be nil here?

tera · July 23, 2023, 2:54pm

No, it physically can't as you've made a strong copy (of a reference) here bounding self to a non optional value, similar to:

guard let selfCopy: TheType /* not optional! */ = self else { return }
selfCopy.handleEvent(event)

Philippe_Hausler · July 23, 2023, 5:44pm

Couple of notes:

The !Task.isCancelled is not really needed there since the stream will return nil (in well behaved AsyncSequence types) when it sees the task iterating is cancelled.

for await in syntax is just sugar (similarly to Sequence's for in syntax) for

var $iterator = stream.makeAsyncIterator()
while let event = await $iterator.next() {
}

so the weak self would need to be present for the while loop. This means that you are advocating:

while let event = await $iterator.next() capturing weak self {
}

as valid syntax; which I am not sure that is easily doable - especially when you then start to ask: what happens to self AFTER the loop?

Just as was pointed out

That is both valid syntax and idomatically correct to what is being expressed; self is captured weakly for the task that is iterating. There is however a problem that does not cover - functions, specifically member functions.

Consider this:

class Thing {
  func doAThing() { }

  func run<Source: AsyncSequence>(_ source: Source) async rethrows {
    for try await item in source {
       doAThing()
    }
  }

  func start() { Task { await run(someSource) }
}

What the problem scope would be is the capture of self. Instead the function run could specify how self is "captured" (forgive my presumption of spelling for the possible solution, obviously the spelling could be different).

  func run<Source: AsyncSequence>(_ source: Source) async rethrows { [weak self]
    for try await item in source {
       guard let self else { break }
       doAThing()
    }
  }

Ideally I think the expectation then would be that self would be borrowed for the prolog of the async function (e.g. before it executes any line inside the method) and then weakly stored such that it would be able to bind strong references to self like with guard and subsequently release them to ensure the calling convention is appropriately respected.

Because AsyncSequence is really just sugar for calling an async function repeatedly the same problem exists if you have any async function being called as such. So it would be better to find a spelling that is approachable enough such that this is a usable thing for both async iteration AND async function calls (and maybe even non-async things too...).

tera · July 23, 2023, 6:15pm

Philippe_Hausler:

  func run<Source: AsyncSequence>(_ source: Source) async rethrows {
    for try await item in source {
       doAThing()
    }
  }

  func start() { Task { await run(someSource) }

This would be an identical mistake I did earlier:

    Task { [weak self] in
        guard let self else { return } // HERE **************
        for try await event in stream {
            // guard let self else { return } // instead of HERE *********
            self.handleEvent(event)
        }
    }

Generally, could you take a fragment like this and refactor it into a function and expect everything will work as before?

    for try await event in stream {
        ...
    }

No. For example there could be a return statement inside:

    for try await event in stream {
        if ... { return }
    }
    print("This is not called upon returning above")

and that would not return to a correct spot:

func foo() {
    for try await event in stream {
        if ... { return }
    }
}

...
    foo()
    print("This is not called upon returning above")

As you mentioned another point of difference would be a difference in capturing "self" and in this case it would be important to choose a closure instead of a method to factor code in, as closure supports closure lists, which makes them different to functions or methods.

BTW, why there's no error about self capturing here?

    func bar() {}
    
    func foo() {
        DispatchQueue.main.async {
            bar() // 🛑 Call to method 'bar' in closure requires explicit use of 'self' to make capture semantics explicit
        }
        Task {
            while true {
                bar() // 🤔 OK?!
                try await Task.sleep(for: .seconds(1))
            }
        }
    }

Philippe_Hausler · July 23, 2023, 7:39pm

The signature of Task.init is this:

public init(
    priority: TaskPriority? = nil,
    @_inheritActorContext @_implicitSelfCapture operation: __owned @Sendable @escaping () async -> Success
  )

The @_implicitSelfCapture is what causes the self being captured to not error/warn; from my understanding is that attribute says "yup, things that use self must extend the lifetime of self so don't bother requiring it". Which in non-indefinite await cases that makes perfect sense per the calling convention of swift. However it breaks down when you have things that either a) await indefinitely as a single call, or b) await indefinitely because of multiple calls. Hence why I think that some sort of attribution to the method scope would be interesting to consider.

tera · July 23, 2023, 7:52pm

Thanks. I thought the punch line would be "Hence why I think @_implicitSelfCapture was a bad idea"

malhal · December 16, 2024, 11:47pm

Avoid the issue by making it static, eg

Self.handleEvent(event)

Or move the async func with the for await outside of self. Ie out of the class into its own struct.