Swift Concurrency Roadmap

cc @Joe_Groff, @lukasa - From my own experience, cancelling a task from deinit is easy: we have plenty of tools for that. What is dearly missing is absolute and total confidence that no race condition, unfortunate oversight, or other bug could trigger the completion block to run despite task cancellation. Even seasoned developers who know that an unowned reference is the best candidate prefer a weak one, just because of this nagging doubt. The general description of cancellation as a cooperative, "best effort" feature does nothing to lift this doubt; on the contrary.

// The ideal world
self.cancellable = publisher.sink { [unowned self] in 
    ...
}

// The pragmatic world
self.cancellable = publisher.sink { [weak self] in 
    guard let self = self else { return }
    ...
}

If Swift concurrency can address this doubt with a clear commitment, written somewhere in the documentation, one that acknowledges that people are full of doubts and answers their questions, then that would be great for unowned. But after the first blog post telling the story of a crash caused by dereferencing a deallocated object, developers will turn back to weak without any regret.

Below is an example of a conversation that is totally reasonable, but ruins confidence anyway:

Dev: My app crashes (see sample code that involves an unowned ref)!
Support: Everything is behaving as expected because cancellation happens after the continuation was already scheduled on the actor's inner queue.
Dev: :neutral_face:

10 Likes

Yeah, this is what keeps me using weak self as well. In most contexts I've worked in, an unexpected app crash is a downright unacceptable UX outcome, even if the alternative is a generic "Something went wrong" message (or no UI response at all). Even if there were a clear commitment to an acceptable API contract, abandoning weak self requires confidence on the part of the developer that any bugs which violate that contract will a) be extremely rare, and b) be considered critical bugs which will be fixed in beta (or near-immediately if discovered in production releases).

Given the diverse priorities of the Swift community at large, (b) seems unreasonable to expect for any arbitrary language feature, and so for the time being coding defensively with weak self feels like the 'safest' alternative*.

* There's an infinite regress problem, of course, since mitigating via weak self requires confidence that bugs with weak references are also rare/considered critical, but it's easier to have that confidence in a mature and extensively battle-tested feature than in a relatively young library/language feature.

6 Likes

This sounds like it would introduce something like the dispose pattern, something I would find very disappointing. I like Joe’s idea of a nursery that can be used as a stored property. If we had move only types I think it would make sense for this nursery to be move only (I can’t think of use cases where I would want to share references to it).

The way I can imagine an object interacting with the nursery is that it would submit tasks to the nursery and then receive their results when they complete. When the object is deinit’d all pending tasks are immediately cancelled and guaranteed not to return a result (including any cancellation error that is thrown).

Ideally there would be a way to design this so that programmers don’t need to capture self explicitly, but if that wasn’t possible, a firm guarantee that no tasks will call back after the object is no longer around would be sufficient to allow confident use of unowned when working with these nurseries.
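To make the shape of this concrete, here is a minimal stand-in for the instance-owned nursery idea. `Nursery` and `submit` are invented names, not part of any proposal, and note that this sketch can only offer cooperative cancellation, not the firm no-callback-after-deinit guarantee described above.

```swift
// Invented stand-in for an instance-owned nursery: it owns the tasks
// submitted to it and cancels them all when it is deinited. Cancellation
// here is cooperative, so this does NOT provide the strong guarantee
// that no task ever calls back after deinit.
final class Nursery {
    private var tasks: [Task<Void, Never>] = []

    func submit(_ work: @escaping () async -> Void) {
        tasks.append(Task { await work() })
    }

    deinit {
        for task in tasks { task.cancel() }
    }
}
```

An object would store such a nursery as a property; when the object is deinited, the nursery goes with it and cancels its tasks. With move-only types, the nursery could additionally guarantee that no references to it escape.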

1 Like

I think there's a disconnect here. There is really just no way that a task calling an async method on an object is not going to keep that object alive while the task is active. There's no way to express what you want with async functions. You need to be cancelling tasks independently of object lifetime. We can try to figure out ways to make that easier, maybe even implicit somehow, but I don't think fighting the natural lifetime rules of functions is the right approach.

2 Likes

Unless I'm mistaken, @anandabits says "object" for the client of the task, not the provider of the task or the task itself. The topic is the following: the client wants 1. the task to be cancelled when the client is deinited, and 2. the task to trigger some client's method when it completes. The task can't have a strong capture of the client, or the task would retain the client and prevent the client from being deinited before the task completes. The question is: can the task hold an unowned reference to the client and guarantee that the program won't crash, or should the task hold a weak reference to the client?
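Concretely, the pattern under discussion looks like the sketch below (`Client` and `onComplete` are invented names, using the eventual task API for illustration):

```swift
// Sketch of the client/task relationship described above.
final class Client {
    private var task: Task<Void, Never>?
    var onComplete: (() -> Void)?

    func start() {
        // weak, not unowned: if a continuation ever runs after deinit
        // despite cancellation, weak yields nil where unowned would crash.
        task = Task { [weak self] in
            try? await Task.sleep(nanoseconds: 20_000_000)  // stand-in work
            guard let self = self, !Task.isCancelled else { return }
            self.onComplete?()  // 2. notify the client on completion
        }
    }

    deinit {
        task?.cancel()  // 1. cancel the task when the client is deinited
    }
}
```

The whole question is whether the language could ever guarantee enough about cancellation ordering to make `[unowned self]` safe in that capture list.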

3 Likes

Maybe there is a disconnect, I'm still trying to get my mind around everything. Would you mind taking a look at my detached tasks example? This represents my understanding of the proposal as it is right now. If Joe's instance-owned nursery existed I would have tried to use that instead and believe there would be a lot less boilerplate.

Regardless, I think it is important for an object to be able to create tasks that are working on its behalf without keeping it alive (and are indeed cancelled when the object goes away). If that isn't going to be possible then it isn't clear to me how efficient resource management is supposed to happen in this system. To be honest, I rarely encounter use cases where I want a task to keep an object alive. Usually when an object goes away that means there is no longer a demand for the work it represents and any tasks running on its behalf should be cancelled immediately.

Not mistaken at all, this is exactly what I'm describing.

To be frank, I do not expect any answer to this question, except the trivial "it depends on how the task is implemented".

However, since this thread describes a "roadmap", then maybe the authors could consider what could be possible, at the language level, that would support tasks that welcome unowned references to their clients, without breaking memory safety, as long as cancellation happens in some specified concurrency context.

On a side note, whenever I read a regret, expressed by generally intelligent people, that too many developers use a weak reference instead of a strong or unowned reference, I can't help but sigh.

It's a concrete start. I really would like to see a native coroutine implementation to help me avoid having to reason too deeply about threading.

Hm, you're right that the task needs to independently own the object if it's manipulating it. However, maybe we could still detect when the task has the only remaining strong reference to the object, using isKnownUniquelyReferenced as a cancellation signal?
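For reference, `isKnownUniquelyReferenced` is an existing standard-library function; a polling check along the suggested lines might look like this sketch (the surrounding `Worker` setup is invented, and this is synchronous for illustration only):

```swift
final class Worker {
    func step() { /* one unit of work */ }
}

// Sketch of the idea: if the caller holds the only remaining strong
// reference to the object it is working on behalf of, treat that as
// a signal that nobody cares about the result anymore.
func runUntilAbandoned(_ worker: inout Worker, maxSteps: Int) -> Int {
    var steps = 0
    while steps < maxSteps {
        if isKnownUniquelyReferenced(&worker) {
            break  // sole owner left: stop cooperatively
        }
        worker.step()
        steps += 1
    }
    return steps
}
```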

That seems like a really suspect thing to do as a general rule. What you’re trying to suggest, I think, is that tasks should be automatically cancelled if they’ll have no visible effect; but that’s not really something we can decide in general.

The idea of having a nursery-like manager for creating tasks that are bound to an object and cancelled when it’s disposed seems like a good one. I do think you need to hook into an actual disposal step, though.

1 Like

To be clear, I'm thinking cancellation would still be cooperatively polled on the task side, not "automatic" in the sense that the task instantaneously ceases work once the object goes away or anything like that. We probably don't want to expose the exact mechanism to users, I agree, but I'm hoping we can still make the association of object lifetime with task cancellation relatively boilerplate-free.

Why is it problematic to have tasks automatically get cancelled at suspension points? As I mentioned in the structured concurrency thread, this is what Scala's Zio library does, and they consider it a feature. It isn't clear to me why you would want to allow a task to continue across a suspension point once it has already been cancelled.

Sometimes it is necessary to prevent cancellation but I think that is the wrong default. Relying exclusively on cooperative cancellation seems to me like a good way to waste resources.

I’m sure you’ve thought this through in great detail but the tradeoffs and rationale are not presented in the proposal so it’s hard for me to know whether I’m missing important details or whether we just have different opinions about the right design decision. It would be helpful if you can spell this out in more detail.

3 Likes

The biggest practical effect I could see is that every potential await becomes a potential unwind point, which makes any sort of transactional logic across async calls potentially problematic.

That is what uninterruptible is for in Zio. We would need to have something similar so in the occasional cases where this is important programmers can specify that a specific scope must run to completion once it begins. I think this model would be way easier to work with correctly. Cooperative cancellation doesn’t express the semantics as clearly or declaratively in code. It also requires additional effort on the part of programmers which is something that will often be overlooked.

The Zio developers put a high priority on preventing resource leaks by design. This is just one example of how that philosophy is manifested. IMO, it’s a state of the art concurrency library filled with good ideas.

1 Like

Interleaving partial tasks in a single actor is going to have a pretty similar effect of breaking transactional logic, no? If the default is going to be problematic anyway, I don’t see why we shouldn’t unwind at await.

3 Likes

Honestly, I'm leaning the opposite direction — we may want to remove some of the implicit cancellation checks that are already there, e.g. so that Handle.get() always reports the task's true result. Making await always check and throw on cancellation would make it extremely frustrating to write a task that can limp through and finish its work despite cancellation, which is sometimes quite useful.

EDIT: although maybe those use cases aren't a good fit for cancellation.

@Ben_Cohen Ping :slight_smile:

Also, regarding my question about XCTest: would it maybe be interesting to mark test cases making use of await to be executed in parallel? Imagine I have 5 test cases, each testing some REST endpoint, and I don't really care about their execution order. Instead of executing test0 and waiting 3 seconds (slow server) for it to finish awaiting the response before test1, which also takes 3 seconds, and so on, I could start all 5 tests in parallel and they would all finish after 4 seconds in total, instead of an execution time of 15 seconds. Is that something that could be done?

I know there’s already an option to run tests in parallel today, but figured it might not work as I described above out of the box? :man_shrugging:

I might have misunderstood something of course...
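For what it's worth, the concurrency model itself makes this easy to express inside a single async function with a task group. Here is a sketch, with `ping` as an invented stand-in for a real endpoint call; whether XCTest can interleave separate await-based test methods this way is a separate question about the test runner.

```swift
import Foundation

// Invented stand-in for hitting a REST endpoint.
func ping(_ endpoint: String) async -> Bool {
    try? await Task.sleep(nanoseconds: 10_000_000)  // simulate latency
    return true
}

// All endpoints are checked concurrently, so total time is roughly the
// slowest single call rather than the sum of all calls.
func checkAll(_ endpoints: [String]) async -> [String: Bool] {
    await withTaskGroup(of: (String, Bool).self) { group in
        for endpoint in endpoints {
            group.addTask { (endpoint, await ping(endpoint)) }
        }
        var results: [String: Bool] = [:]
        for await (endpoint, ok) in group {
            results[endpoint] = ok
        }
        return results
    }
}
```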

Best regards
Alex

As I mentioned, the way this is addressed in Zio is with uninterruptible. In the language-native concurrency model pitched here, that might look something like this:

extension MyActor {
    func doSomethingImportant() async {
        // opt-out of implicit cancellation
        Task.uninterruptible {
            // do important work
        }
    }
}

The task increments the uninterruptible count when entering an uninterruptible scope and decrements it when leaving that scope. Implicit cancellation only happens when the count is zero.

(Note: I think there is probably a better name than uninterruptible but don't have time to bikeshed this right now)
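The bookkeeping described above can be sketched as a small piece of per-task state; `UninterruptibleState` is an invented name for illustration, not part of any proposal:

```swift
// Sketch of the uninterruptible-count bookkeeping: each task would carry
// something like this, and implicit cancellation at a suspension point
// would only unwind while no uninterruptible scope is active.
final class UninterruptibleState {
    private var count = 0
    private var cancelRequested = false

    func enter() { count += 1 }   // entering an uninterruptible scope
    func leave() { count -= 1 }   // leaving the scope

    func requestCancel() { cancelRequested = true }

    // Checked at each suspension point.
    var shouldUnwind: Bool { cancelRequested && count == 0 }
}
```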

What if we have code be uninterruptible by default, and have a portion of code be marked interruptible instead?

PS

Maybe we should move onto Structured Concurrency?

1 Like