[Pitch #2] Structured Concurrency

While the nuance is interesting, I'm not sure it's ultimately useful to most users. Instead, I think overloading async, while technically inaccurate, describes the behavior as most users would expect: it's an asynchronous value. Introducing new vocabulary that doesn't enable more correct thinking for most users isn't really beneficial.

3 Likes

They're handles to the task and can be used to cancel it or wait for it to complete. They're not intended for observing a changing value; that would be more like the AsyncSequence idea.

The overhead is somewhere around 100-200 bytes, plus whatever local state the task is currently tracking. We think that's very low. The back-pressure ideas are somewhat hand-wavey at the moment. The goal is that you generally shouldn't have to think about minimizing the number of tasks you're creating, at least for memory concerns; the ability of the system to actually run your tasks is a separate issue.

The current idea of a cancellation handler is internal to the task: handlers allow the task to handle cancellation in special ways, but they don't allow other things to subscribe to whether an existing task is cancelled.

1 Like

I probably haven't made my point clear enough. So I will elaborate:

I can very well imagine this function implementation in the wild using the current state of the proposal (a similar function was posted by @anandabits in the original pitch thread):

/// Runs two functions concurrently and returns the result of the first to complete.
/// Both functions need to check for cancellation regularly.
func race<R>(method1: () async -> R, method2: () async -> R) async throws -> R {
    try await Task.withGroup(resultType: R.self) { group in
        await group.add { await method1() }
        await group.add { await method2() }
        
        let firstResult = try await group.next()! // 1
        return firstResult // 2
    }
}

With the current state of the proposal, this implementation is broken. It will always wait for both methods to finish before returning, and this bug is very hard to find. A correct implementation needs to call group.cancelAll() between the lines marked // 1 and // 2.
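For comparison, here is a minimal sketch of the corrected version. It does not use the pitch's `Task.withGroup` spelling; it assumes the `withTaskGroup` API as it later shipped in Swift 5.5, where the group likewise waits for remaining child tasks when the body returns. The `race` signature and the two sample closures are illustrative only.

```swift
/// Runs two functions concurrently and returns the result of the first to complete.
/// Both functions need to check for cancellation regularly.
func race<R: Sendable>(
    _ method1: @escaping @Sendable () async -> R,
    _ method2: @escaping @Sendable () async -> R
) async -> R {
    await withTaskGroup(of: R.self, returning: R.self) { group in
        group.addTask { await method1() }
        group.addTask { await method2() }

        // Two tasks were added, so the first call to next() cannot return nil.
        let firstResult = await group.next()!
        // Without this call the group would wait for the slower task to finish.
        group.cancelAll()
        return firstResult
    }
}

// Task.sleep throws when the task is cancelled, so both closures are cancellable.
let winner = await race(
    { try? await Task.sleep(nanoseconds: 10_000_000); return "fast" },
    { try? await Task.sleep(nanoseconds: 2_000_000_000); return "slow" }
)
print(winner)
```

The group still waits for the cancelled task before returning; it just no longer has to wait out the full two-second sleep, because the sleep ends early once cancellation is observed.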

An expert on structured concurrency will maybe spot this. However, I argue that it is easy to make this mistake for a normal programmer. (This exact mistake has been made or overlooked by at least two people in the original pitch thread.) My philosophy is to design APIs in a way to make mistakes hard to make. Humans will always make mistakes.

One could tackle this problem by changing the proposal so that remaining tasks are always cancelled when the body of withGroup() returns. However, some people will probably assume that all tasks started in a task group are completed unless explicitly cancelled. This would lead to race conditions.

@ktoso suggested the following solution:

My suggestion is that the body of withGroup() needs to return an enum:

extension Task.Group {
    enum ReturnBehavior<R> {
        case afterAwaitingRemainingTasks(R)
        case afterCancellingRemainingTasks(R)
    }
}

With this, the example would read:

/// Runs two functions concurrently and returns the result of the first to complete.
/// Both functions need to check for cancellation regularly.
func race<R>(method1: () async -> R, method2: () async -> R) async throws -> R {
    try await Task.withGroup(resultType: R.self) { group in
        await group.add { await method1() }
        await group.add { await method2() }
        
        let firstResult = try await group.next()!
        return .afterCancellingRemainingTasks(firstResult)
    }
}

Everybody would know exactly how the remaining tasks are handled. In addition, the fact that the return call is the last line of the withGroup() call helps the mental model that the cancelling or awaiting takes place after the body has run and before withGroup() returns.
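To make the idea concrete, here is a rough sketch of how such a wrapper might be layered on top of a task group. It assumes the `withTaskGroup` API as it later shipped in Swift 5.5 rather than the pitch's `Task.withGroup`, and the name `withExplicitGroup` is hypothetical, not part of any proposal.

```swift
enum ReturnBehavior<R> {
    case afterAwaitingRemainingTasks(R)
    case afterCancellingRemainingTasks(R)
}

// Hypothetical wrapper: the body must state how remaining child tasks are handled.
func withExplicitGroup<Child: Sendable, R>(
    of childType: Child.Type,
    body: (inout TaskGroup<Child>) async -> ReturnBehavior<R>
) async -> R {
    await withTaskGroup(of: Child.self, returning: R.self) { group in
        switch await body(&group) {
        case .afterAwaitingRemainingTasks(let result):
            // Nothing to do: the group waits for remaining tasks on its own.
            return result
        case .afterCancellingRemainingTasks(let result):
            group.cancelAll()
            return result
        }
    }
}

let first: String = await withExplicitGroup(of: String.self) { group -> ReturnBehavior<String> in
    group.addTask { try? await Task.sleep(nanoseconds: 10_000_000); return "fast" }
    group.addTask { try? await Task.sleep(nanoseconds: 2_000_000_000); return "slow" }
    let winner = await group.next()!
    return .afterCancellingRemainingTasks(winner)
}
print(first)
```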

A disadvantage of this solution is that it is pretty verbose (maybe a shorter spelling can be found). However, Swift has a history of preferring explicitness over terseness. In addition, withGroup() will probably not be used directly in many code bases. Instead, I predict that most people will use convenience functions that use withGroup() internally.

3 Likes

I understand the attraction of using syntax with async to indicate that these functions can only be used in an async context, but I think it’s too clever. The concept of functions that don’t contain suspension points and therefore aren’t actually asynchronous already exists: regular synchronous functions. I’d strongly prefer something like @michelf’s suggestion of @taskLocal (or @asyncContextOnly or whatever).

8 Likes

I think you may be misunderstanding what cancellation is. It’s cooperative, so cancelled tasks have to support cancellation themselves, or nothing will happen. Even when they do support cancellation, they may still run to completion depending on how often they check for cancellation.

So, even after cancelling, task groups would still wait for completion.
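A small sketch of what "cooperative" means in practice, assuming the `Task` API as it later shipped in Swift 5.5. The function names are made up for illustration.

```swift
// Cooperative: checks Task.isCancelled on every iteration and stops early.
func cooperativeCount(limit: Int) async -> Int {
    var count = 0
    while count < limit && !Task.isCancelled {
        count += 1
        await Task.yield()
    }
    return count
}

// Non-cooperative: never checks for cancellation, so cancel() changes nothing.
func stubbornCount(limit: Int) async -> Int {
    var count = 0
    while count < limit {
        count += 1
        await Task.yield()
    }
    return count
}

let stubborn = Task { await stubbornCount(limit: 1_000) }
stubborn.cancel()
let stubbornResult = await stubborn.value       // still counts all the way to the limit

let cooperative = Task { await cooperativeCount(limit: 1_000_000) }
cooperative.cancel()
let cooperativeResult = await cooperative.value // stops once it observes the cancellation
print(stubbornResult, cooperativeResult)
```

In both cases the caller still has to wait for the task to return; cancellation only changes how soon a cooperating task chooses to do so.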

3 Likes

Thanks for your comment. I am well aware of that. Therefore the code comment of the function I have posted says 'Both functions need to check for cancellation regularly.' If the tasks in a task group do not support cancellation, then it makes no difference if they are cancelled or not. However, if they support it, the things I have written are valid.

1 Like

Using async in this way isn't a future attraction: it's the status quo as soon as async is added to the language, and the compiler cannot even stop users from using it in this way if we wanted to. (This is because the compiler cannot distinguish between a function whose implementation currently doesn't have a suspension point but may in the future and is therefore genuinely declared as such, and one that's semantically constrained never to have a suspension point.)

Users will naturally reach for the simplest spelling they know to express the outcome they want, and any refinements to that should be spelled as such and not a repudiation of it.

The design of this feature should also dovetail with the related concept of reasync.

Do you anticipate a widespread need for user code to express “this code doesn’t suspend but may only be used in an async context”? Outside of the standard library, it seems pretty esoteric to me.

2 Likes

I'm not qualified to prognosticate, but we'd want the feature to be as ergonomic as possible even if--or actually, particularly if--it's going to be rarely used, since discoverability will be paramount given that, in part because of what you point out, doing the right thing will have to compete against just writing async as an attractive but non-ideal approach.

As I alluded to above, I have thoughts about the overall ergonomics of the structured concurrency feature which I will write up hopefully soon. The short form of this is that I hope we go through some iterations to make the overall experience absolutely as straightforward as possible; the underlying concepts are hard enough that we would want ideally not to be adding additional syntax burden to make it harder.

It also needs to be annotated on any function that uses those libraries, not just on the ones that provide the functionality. Just like async, it could quickly propagate up the call chain.

Huh?

You can't call an @instantaneous function from a normal non-async function. You can only call it from another @instantaneous or async function, so now our functions have three colours.

I'm not sure how often @instantaneous will be subsumed by async, but I could definitely see widespread use of pure @instantaneous functions to utilize TaskLocal APIs.

1 Like

That's why I'm saying that it should be spelled as a kind of async, since it's not a distinct "color" but rather exists only so that the function in question is called from an async context but itself does not suspend.

It'd be nice if we could put all non-async functions under the same main Task, though it gets tricky around traditional concurrency APIs, like DispatchQueue.async.

I apologize for replying twice to the same post, but on thinking again, I do have to disagree with your characterization of what's async and what's synchronous:

As designed, the only guarantee provided by async, really, is that it can be called only from an async context. There's no guarantee at all that an async function has a suspension point, only that it potentially has one—in other words, async means that a function has zero or more suspension points, not one or more.

That much is fundamentally a part of the already-approved async/await proposal. Within the rules of this design, a function that can only be called from an async context but has no suspension points is properly async, not synchronous.
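A tiny sketch of the "zero or more suspension points" point, with hypothetical function names:

```swift
// Declared async, but the body contains no await, so it never actually suspends.
func cachedAnswer() async -> Int {
    42
}

// A synchronous caller is rejected by the compiler:
// func syncCaller() -> Int { cachedAnswer() }   // does not compile

// An async caller must still mark the call with await.
func asyncCaller() async -> Int {
    await cachedAnswer()
}

let value = await asyncCaller()
print(value)
```

The compiler accepts `cachedAnswer` as written; nothing forces an async function to contain an `await`.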

1 Like

Perhaps I’m missing something fundamental here:

Given a function which is guaranteed to return without suspension, what is the intended purpose of restricting it to only be called from an async context?

As far as I can tell, if a function is guaranteed to have exactly zero suspension points, always and forever, then there is no reason to restrict it to async contexts. Such a function could be made available in synchronous contexts as well, without any issues.

Creating a special syntax for async-but-never-suspending strikes me as a solution in search of a problem.

2 Likes

See the post above:

That does not answer my question. The post states that its author wants to make such a function.

I am asking why they (or anyone else) wants to restrict such a function to async contexts only.

As far as I can see, isCancelled should be readable from anywhere, including regular synchronous code. There is nothing intrinsically asynchronous about asking whether a particular task is currently, at-this-very-moment, cancelled.

Quite the opposite. Checking the current cancellation state seems, to me, inherently synchronous. I do not see any benefit to prohibiting synchronous code from doing so.

2 Likes

As I understand it, an arbitrary synchronous function may not even be executing as part of a task, so Task.isCancelled may not even make sense. (Or asking for a task-local value, or any other synchronous task-only API.)
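For what it's worth, here is a sketch against the API as it later shipped in Swift 5.5 (which may differ from what the pitch describes). There, the static `Task.isCancelled` can be read from an ordinary synchronous function, and my understanding is that it simply reports `false` when no task is running the code. The names `shouldStop` and `worker` are made up.

```swift
// A plain synchronous helper; it reports on whichever task happens to be running it.
func shouldStop() -> Bool {
    Task.isCancelled
}

let worker = Task { () -> Bool in
    // Spin until this task has been cancelled, then ask the synchronous helper.
    while !Task.isCancelled {
        await Task.yield()
    }
    return shouldStop()
}
worker.cancel()
let sawCancellation = await worker.value
print(sawCancellation)
```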

Oh I see now, that’s a static function so that async code can check the status of the task it is part of.

I missed that before. Thanks.