[Pitch #2] Structured Concurrency

s-k · January 3, 2021, 8:37am

I probably haven't made my point clearly enough, so let me elaborate:

I can easily imagine an implementation like the following showing up in the wild under the current state of the proposal (a similar function was posted by @anandabits in the original pitch thread):

/// Runs two functions concurrently and returns the result of the first to complete.
/// Both functions need to check for cancellation regularly.
func race<R>(method1: @escaping () async -> R, method2: @escaping () async -> R) async throws -> R {
    try await Task.withGroup(resultType: R.self) { group in
        await group.add { await method1() }
        await group.add { await method2() }
        
        let firstResult = try await group.next()! // 1
        return firstResult // 2
    }
}

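With the current state of the proposal, this implementation is broken. withGroup() waits for all remaining child tasks before it returns, so race() always waits for both methods to finish, even though the result of the first one is already known. This bug is very hard to find, because the returned value is still correct and only the timing is wrong. A correct implementation needs to call group.cancelAll() between the lines marked // 1 and // 2.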
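For concreteness, here is roughly what the corrected version could look like. This is only a sketch, and it is not written in the pitch's spelling: it uses `withTaskGroup`, `addTask`, and `cancelAll()` as they exist in Swift 5.5 and later, and the `Sendable` annotations are additions that API requires. Because that API does not throw, the function is not marked `throws` here.

```swift
/// Runs two functions concurrently and returns the result of the first to complete.
/// Both functions need to check for cancellation regularly.
func race<R: Sendable>(
    method1: @escaping @Sendable () async -> R,
    method2: @escaping @Sendable () async -> R
) async -> R {
    await withTaskGroup(of: R.self, returning: R.self) { group in
        group.addTask { await method1() }
        group.addTask { await method2() }

        // Two child tasks were added, so next() cannot return nil here.
        let firstResult = await group.next()! // 1
        // Without this call the group still awaits the slower task before returning.
        group.cancelAll()
        return firstResult // 2
    }
}
```

The cancellation is cooperative: the slower method only stops early if it actually checks for cancellation, which is why the doc comment asks for that.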

An expert on structured concurrency might spot this. However, I argue that it is an easy mistake for the average programmer to make. (This exact mistake has been made or overlooked by at least two people in the original pitch thread.) My philosophy is to design APIs so that mistakes are hard to make. Humans will always make mistakes.

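One could tackle this problem by changing the proposal so that remaining tasks are always cancelled when the body of withGroup() returns. However, some people will probably assume that all tasks started in a task group run to completion unless they are explicitly cancelled. For code written under that assumption, implicit cancellation would lead to race conditions, because work the code relies on might be cut short without any visible sign in the source.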
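To make that assumption concrete, here is a small sketch of code that relies on every child task finishing even though the body never awaits them. It uses the `withTaskGroup` spelling from Swift 5.5 and later rather than the pitch's API, and `Counter` and `recordThreeEvents` are hypothetical names made up for this illustration.

```swift
/// Hypothetical shared state that the child tasks update.
actor Counter {
    var value = 0
    func increment() { value += 1 }
}

/// Starts three child tasks and returns from the body without awaiting them.
func recordThreeEvents(in counter: Counter) async {
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<3 {
            group.addTask {
                // Simulated work; the sleep ends early if the task is cancelled.
                try? await Task.sleep(nanoseconds: 10_000_000)
                // A well-behaved task stops when it has been cancelled.
                if Task.isCancelled { return }
                await counter.increment()
            }
        }
        // The body ends here. With await-on-return semantics, the group
        // still waits for all three child tasks before withTaskGroup returns.
    }
}
```

With await-on-return semantics the counter ends up at 3. If remaining tasks were instead cancelled when the body returns, some or all of the increments would presumably be skipped, and nothing in the code above would hint at it.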

@ktoso suggested the following solution:

My suggestion is that the body of withGroup() needs to return an enum:

extension Task.Group {
    enum ReturnBehavior<R> {
        case afterAwaitingRemainingTasks(R)
        case afterCancellingRemainingTasks(R)
    }
}

With this, the example would read:

/// Runs two functions concurrently and returns the result of the first to complete.
/// Both functions need to check for cancellation regularly.
func race<R>(method1: @escaping () async -> R, method2: @escaping () async -> R) async throws -> R {
    try await Task.withGroup(resultType: R.self) { group in
        await group.add { await method1() }
        await group.add { await method2() }
        
        let firstResult = try await group.next()!
        return .afterCancellingRemainingTasks(firstResult)
    }
}

Every reader would see exactly how the remaining tasks are handled. In addition, the fact that the return statement is the last line of the withGroup() body supports the mental model that the cancelling or awaiting takes place after the body has run and before withGroup() returns.
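How withGroup() might act on that return value can be sketched as a thin wrapper. This is an assumption about one possible implementation, not part of the proposal: `withExplicitGroup` is a hypothetical name, `ReturnBehavior` is declared at the top level instead of inside `Task.Group`, and the wrapper is built on the `withTaskGroup` API from Swift 5.5 and later.

```swift
/// Hypothetical stand-in for the suggested Task.Group.ReturnBehavior.
enum ReturnBehavior<R> {
    case afterAwaitingRemainingTasks(R)
    case afterCancellingRemainingTasks(R)
}

/// Hypothetical wrapper: runs `body`, then handles the remaining child tasks
/// in the way the body asked for.
func withExplicitGroup<ChildResult: Sendable, R>(
    of childType: ChildResult.Type,
    body: (inout TaskGroup<ChildResult>) async -> ReturnBehavior<R>
) async -> R {
    await withTaskGroup(of: ChildResult.self, returning: R.self) { group in
        switch await body(&group) {
        case .afterAwaitingRemainingTasks(let result):
            // Nothing to do: the group awaits outstanding child tasks
            // when this closure returns.
            return result
        case .afterCancellingRemainingTasks(let result):
            // Cancel first; the group then awaits the cancelled tasks.
            group.cancelAll()
            return result
        }
    }
}

/// Runs two functions concurrently and returns the result of the first to complete.
/// Both functions need to check for cancellation regularly.
func race<R: Sendable>(
    method1: @escaping @Sendable () async -> R,
    method2: @escaping @Sendable () async -> R
) async -> R {
    await withExplicitGroup(of: R.self) { group in
        group.addTask { await method1() }
        group.addTask { await method2() }

        // Two child tasks were added, so next() cannot return nil here.
        let firstResult = await group.next()!
        return .afterCancellingRemainingTasks(firstResult)
    }
}
```

The body cannot return a bare value any more, so forgetting to decide between cancelling and awaiting becomes a compile-time error instead of a timing bug.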

A disadvantage of this solution is that it is pretty verbose (maybe a shorter spelling can be found). However, Swift has a history of preferring explicitness over terseness. In addition, withGroup() will probably not be used directly in many code bases. Instead, I predict that most people will use convenience functions that use withGroup() internally.