Swift Concurrency Roadmap

I'd like to understand why we should be confident that this step results in a usable programming model.

While reentrant actors still technically eliminate data races, they create a very racy programming experience, akin to sharing a bunch of atomic properties across multiple threads. True, it's slightly different, because your thread of execution can only be interrupted where it issues an await instead of at arbitrary points, but if the actor's properties have any important relationships, before any await you have to be sure invariants are intact, and you have to be prepared for the actor's mutable properties to have “shifted underneath you” when the await is over. We use locks to avoid both of these effects in multithreaded code, and it seems the same basic kind of care is needed here.

IIUC, part of the reason async/await has been so successful in other languages is that it lets us reason about async code using the same tools that apply to synchronous code, but reentrancy seems to substantially break that illusion. Common idioms, like using the properties of an instance to hold the state of a single larger computation, will, if deployed in the usual ways in reentrant actors, break spectacularly, and with exactly the same kinds of unpredictability as an actual data race.
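
To make the failure mode concrete, here is a minimal sketch using a hypothetical image cache (all names illustrative): a "check, await, then act" sequence whose initial check can be stale by the time the actor acts on it.

import Foundation

struct Image { let bytes: [UInt8] }

// Stand-in for a real network fetch.
func downloadImage(from url: URL) async -> Image {
    Image(bytes: [])
}

actor ImageCache {
    var cache: [URL: Image] = [:]

    func image(for url: URL) async -> Image {
        if let cached = cache[url] { return cached }
        let image = await downloadImage(from: url)
        // While this call was suspended, a reentrant call may already have
        // cached a different image for this URL; this unconditional write
        // clobbers it, and two callers can end up with different images.
        cache[url] = image
        return image
    }
}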

I have yet to read through all the documents (a quick scan finds no mentions of reentrancy), so apologies if I've missed it, but it seems to me we at least need a set of “reentrant actor programming guidelines” if we're going to adopt this model. Does that exist somewhere?

10 Likes

I think this misstates the motivation for weak self, at least in many cases. In my experience, the usual reason weak self is used is to avoid extending the lifetime of an object. For example, when fetching data from the network it is usually not desirable to keep an object alive while waiting for the response. Instead, if the object is deinitialized there is no longer a need to load the data, and the network request should be cancelled.

I've given quite a bit of thought to this aspect of the design. It definitely makes it a lot easier to unintentionally write code that will keep an object alive waiting for the result of a task which could take an arbitrarily long amount of time. On balance, I think the tradeoffs are probably right. But I think there need to be clear guidelines for best practices around when to use a detached task to separate task lifetime from object lifetime.

I'm having a hard time understanding exactly what this means. Some example code would be very helpful.

I agree that re-entrancy will result in code that doesn't behave the way it looks. It will be necessary to drop all assumptions about the state of an actor at every await, including the assumption that locals derived from the state of an actor still reflect its state. That said, it looks to me like the advantages of this approach are significant. It looks like this is a tough tradeoff we have to make where both choices have serious downsides.

4 Likes

We definitely see lots of weak self in SwiftNIO programs that simply do not require it.

One thing I think we definitely need is a way to tie cancellation of a task to the lifetime of an object. There are a couple of ways I could see this being done:

  • Have a cancellation token that carries a weak reference to the object. Checking isCancelled would then also check whether the reference is still non-nil (a rough sketch follows this list).
  • Have a form of task nursery that can be used as a stored property in a class, instead of being scoped, so that all child tasks spawned in that nursery are cancelled when the object is destroyed and its nursery is cleaned up.
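
As a rough sketch of the first option (all names hypothetical), the token could look something like this, with tasks polling it cooperatively:

final class LifetimeToken {
    private weak var owner: AnyObject?
    init(owner: AnyObject) { self.owner = owner }
    // Cancelled as soon as the weakly-referenced owner is deallocated.
    var isCancelled: Bool { owner == nil }
}

// Usage inside a long-running task:
func runWork(token: LifetimeToken) async {
    while !token.isCancelled {
        // ... one unit of work on the owner's behalf ...
        await Task.yield()
    }
}
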
6 Likes

I agree with the goal behind your statement, but I don’t know that reference-based solutions would work, since any tasks would often be keeping the object alive. You need a higher-level “I’m done with this” step.

cc @Joe_Groff, @lukasa - From my own experience, canceling a task from deinit is easy: we have plenty of tools for that. What is dearly missing is the absolute and total confidence that there is no race condition, unfortunate oversight, or any other bug that could trigger the completion block to run despite task cancellation. Even seasoned developers who know an unowned reference is the best candidate prefer using a weak one, just because of this nagging doubt. The general description of cancellation as a cooperative and "best effort" feature does nothing to lift this doubt; on the contrary.

// The ideal world
self.cancellable = publisher.sink { [unowned self] in 
    ...
}

// The pragmatic world
self.cancellable = publisher.sink { [weak self] in 
    guard let self = self else { return }
    ...
}

If Swift concurrency can address this doubt with a clear commitment, written somewhere in the documentation, that acknowledges that people are full of doubts and addresses their questions, then that would be great for unowned. But at the first blog post that tells a story of a crash due to a deallocated object dereference, developers will turn back to weak without any regret.

Below is an example of a conversation that is totally reasonable, but ruins confidence anyway:

Dev: My app crashes (see sample code that involves an unowned ref)!
Support: Everything is behaving as expected because cancellation happens after the continuation was already scheduled on the actor's inner queue.
Dev: 😐

10 Likes

Yeah, this is what keeps me using weak self as well. In most contexts I've worked in, an unexpected app crash is a downright unacceptable UX outcome, even if the alternative is a generic "Something went wrong" message (or no UI response at all). Even if there were a clear commitment to an acceptable API contract, abandoning weak self requires confidence on the part of the developer that any bugs which violate that contract will a) be extremely rare, and b) be considered critical bugs which will be fixed in beta (or near-immediately if discovered in production releases).

Given the diverse priorities of the Swift community at large, (b) seems unreasonable to expect for any arbitrary language feature, and so for the time being coding defensively with weak self feels like the 'safest' alternative*.

* There's an infinite regress problem, of course, since mitigating via weak self requires confidence that bugs with weak references are also rare/considered critical, but it's easier to have that confidence in a mature and extensively battle-tested feature than in a relatively young library/language feature.

6 Likes

This sounds like it would introduce something like the dispose pattern, which I would find very disappointing. I like Joe's idea of a nursery that can be used as a stored property. If we had move-only types, I think it would make sense for this nursery to be move-only (I can't think of use cases where I would want to share references to it).

The way I can imagine an object interacting with the nursery is that it would submit tasks to the nursery and then receive their results when they complete. When the object is deinit'd, all pending tasks are immediately cancelled and guaranteed not to return a result (not even a thrown cancellation error).

Ideally there would be a way to design this so that programmers don’t need to capture self explicitly, but if that wasn’t possible, a firm guarantee that no tasks will call back after the object is no longer around would be sufficient to allow confident use of unowned when working with these nurseries.
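
To make the shape of this concrete, here is a hypothetical sketch of an instance-owned nursery, using the Task API as a stand-in for the proposed primitive (all names illustrative, and not thread-safe). Note that this version can only cancel pending tasks on deinit; the stronger guarantee described above, that no callback ever runs afterwards, would need language support:

final class Nursery {
    private var tasks: [Task<Void, Never>] = []

    func spawn(_ body: @escaping @Sendable () async -> Void) {
        tasks.append(Task { await body() })
    }

    deinit {
        // Cancellation is still cooperative: tasks must observe it.
        for task in tasks { task.cancel() }
    }
}

final class Model {
    let nursery = Nursery()

    func refresh() {
        nursery.spawn { [weak self] in
            // Work on behalf of the object without retaining it.
            _ = self
        }
    }
}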

1 Like

I think there's a disconnect here. There is really just no way that a task calling an async method on an object is not going to keep that object alive while the task is active. There's no way to express what you want with async functions. You need to be cancelling tasks independently of object lifetime. We can try to figure out ways to make that easier, maybe even implicit somehow, but I don't think fighting the natural lifetime rules of functions is the right approach.

2 Likes

Unless I'm mistaken, @anandabits says "object" for the client of the task, not the provider of the task or the task itself. The topic is the following: the client wants 1. the task to be cancelled when the client is deinited, and 2. the task to trigger some method on the client when it completes. The task can't hold a strong capture of the client, or the task would retain the client and prevent it from being deinited before the task completes. The question is: can the task hold an unowned reference to the client with a guarantee that the program won't crash, or must it hold a weak reference?
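
In code, the situation looks something like this (hypothetical names, using the Task API as a stand-in): the client cancels its task on deinit, and the task calls back into the client on completion.

func loadValue() async -> Int { 42 }  // stand-in for real async work

final class Client {
    private var task: Task<Void, Never>?

    func start() {
        // With [unowned self], a late-running continuation would crash if
        // it raced with deinit; [weak self] trades that crash for a silent
        // no-op, which is why it remains the pragmatic choice.
        task = Task { [weak self] in
            let value = await loadValue()
            self?.didLoad(value)
        }
    }

    func didLoad(_ value: Int) { /* ... */ }

    deinit { task?.cancel() }
}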

3 Likes

Maybe there is a disconnect, I'm still trying to get my mind around everything. Would you mind taking a look at my detached tasks example? This represents my understanding of the proposal as it is right now. If Joe's instance-owned nursery existed I would have tried to use that instead and believe there would be a lot less boilerplate.

Regardless, I think it is important for an object to be able to create tasks that are working on its behalf without keeping it alive (and are indeed cancelled when the object goes away). If that isn't going to be possible then it isn't clear to me how efficient resource management is supposed to happen in this system. To be honest, I rarely encounter use cases where I want a task to keep an object alive. Usually when an object goes away that means there is no longer a demand for the work it represents and any tasks running on its behalf should be cancelled immediately.

Not mistaken at all, this is exactly what I'm describing.

To be frank, I do not expect any answer to this question, except the trivial "it depends on how the task is implemented".

However, since this thread describes a "roadmap", maybe the authors could consider what would be possible, at the language level, to support tasks that welcome unowned references to their clients without breaking memory safety, as long as cancellation happens in some specified concurrency context.

On a side note, whenever I read a regret, expressed by generally intelligent people, that too many developers use a weak reference instead of a strong or unowned reference, I can't help but sigh.

It's a concrete start. I really would like to see a native coroutine implementation to help me avoid having to reason too deeply about threading.

Hm, you're right that the task needs to independently own the object if it's manipulating it. However, maybe we could still detect when the task holds the only remaining strong reference to the object, using isKnownUniquelyReferenced as a cancellation signal?
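
A minimal sketch of that idea (hypothetical names): the task keeps its own strong reference, but polls isKnownUniquelyReferenced to detect that every other owner has released the object, and treats that as a cancellation signal.

final class Results {
    var values: [Int] = []
}

func process(_ items: [Int], for results: Results) async {
    var results = results  // the task's own strong reference
    for item in items {
        // If the task is the only remaining owner, nobody can observe
        // the results any more; treat that as cancellation.
        if isKnownUniquelyReferenced(&results) { return }
        results.values.append(item * 2)
        await Task.yield()  // suspension point; other owners may release here
    }
}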

That seems like a really suspect thing to do as a general rule. What you’re trying to suggest, I think, is that tasks should be automatically cancelled if they’ll have no visible effect; but that’s not really something we can decide in general.

The idea of having a nursery-like manager for creating tasks that are bound to an object and cancelled when it’s disposed seems like a good one. I do think you need to hook into an actual disposal step, though.

1 Like

To be clear, I'm thinking cancellation would still be cooperatively polled on the task side, not something "automatic" in the sense that the task instantaneously ceases work once the object goes away, or anything like that. We probably don't want to expose the exact mechanism to users, I agree, but I'm hoping we can still make the association of object lifetime with task cancellation relatively boilerplate-free.

Why is it problematic to have tasks automatically get cancelled at suspension points? As I mentioned in the structured concurrency thread, this is what Scala's Zio library does, and they consider it a feature. It isn't clear to me why you would want to allow a task to continue across a suspension point once it has already been cancelled.

Sometimes it is necessary to prevent cancellation but I think that is the wrong default. Relying exclusively on cooperative cancellation seems to me like a good way to waste resources.

I’m sure you’ve thought this through in great detail but the tradeoffs and rationale are not presented in the proposal so it’s hard for me to know whether I’m missing important details or whether we just have different opinions about the right design decision. It would be helpful if you can spell this out in more detail.

3 Likes

The biggest practical effect I could see is that every potential await becomes a potential unwind point, which makes any sort of transactional logic across async calls potentially problematic.
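
A small sketch of the concern (hypothetical types): transactional logic that spans two awaits. If every suspension point could also unwind on cancellation, the intermediate state would need explicit protection; with cooperative cancellation, the region runs to completion simply by not polling for cancellation between the two calls.

actor Account {
    var balance = 0
    func withdraw(_ amount: Int) { balance -= amount }
    func deposit(_ amount: Int) { balance += amount }
}

func transfer(_ amount: Int, from a: Account, to b: Account) async {
    await a.withdraw(amount)
    // An automatic cancellation-unwind here would leave the withdrawal
    // without its matching deposit.
    await b.deposit(amount)
}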

That is what uninterruptible is for in Zio. We would need something similar so that, in the occasional cases where this is important, programmers can specify that a specific scope must run to completion once it begins. I think this model would be much easier to work with correctly. Cooperative cancellation doesn't express the semantics as clearly or declaratively in code. It also requires additional effort on the part of programmers, which is something that will often be overlooked.

The Zio developers put a high priority on preventing resource leaks by design. This is just one example of how that philosophy is manifested. IMO, it’s a state of the art concurrency library filled with good ideas.

1 Like