Structured Concurrency (async let) - Cancellation

somu · June 30, 2022, 6:14am

Hi,

Overview:

I have a question about how cancellation propagates with async let.

In the WWDC21 session "Explore structured concurrency in Swift" (at around 8:32), they talk about the task tree and cancellation propagation in structured tasks.

In the following example, I have 3 async functions:

compute calls:
- computeA
- computeB

computeB throws an error

Let's name the tasks as follows:

parentTask runs compute
taskA runs computeA
taskB runs computeB

Scenarios:

1. If computeA is awaited first and computeB is awaited next, then computeA is not cancelled even though computeB has already thrown the error. (Why?)
2. If computeB is awaited first and computeA is awaited next, then computeA is marked as cancelled. (This works as I expected.)

Questions:

My question / doubt is based on scenario 1.

1. Why is computeA not cancelled when computeA is awaited first?
2. When taskB throws an error, is parentTask the one that informs taskA that it should be cancelled? Is that the reason computeB has to be awaited first in order to inform computeA about the cancellation?

Code:

import Foundation  // needed for RunLoop

enum SumError: Error {
    case dummy
}

func computeA() -> Int {
    var sum = 0
    for index in 1...10_000_000 {
        sum += index
        if Task.isCancelled {
            print("computeA was cancelled at index \(index)")
            sum = 0
            break
        }
    }
    print("computeA completed")
    return sum
}

func computeB() throws -> Int {
    var sum = 0
    for index in 1...10_000_000 {
        sum += index
        if sum > 500000000 {
            print("computeB going to throw")
            throw SumError.dummy
        }
    }
    return sum
}

func compute() async throws {
    
    async let a = computeA()
    async let b = computeB()

    print("parent thread is not blocked")
    
    //Scenario 1 (doesn't cancel computeA)
    print("a = \(await a)")
    print("b = \(try await b)")

    //Scenario 2 (works as expected)
    //print("b = \(try await b)")
    //print("a = \(await a)")
}

Task {
    print("-----------")
    do {
        try await compute()
    } catch {
        print("error: \(error)")
    }
    print("-----------")
}

RunLoop.main.run()

Output

Scenario 1:

-----------
parent thread is not blocked
computeB going to throw
computeA completed
a = 50000005000000
error: dummy
-----------

Scenario 2:

-----------
parent thread is not blocked
computeB going to throw
computeA was cancelled at index 30970
computeA completed
error: dummy
-----------

Environment:

Xcode command line project

ibex10 · June 30, 2022, 6:53am

The code spawns off only one task. Where is the other task?

Orup70 · June 30, 2022, 7:05am

If I understand correctly, a child task initiated with async let returns its result when you await the variable later in the function. This applies even if the result of the child task was an Error being thrown.

In your case the compute function can only be informed about the Error at the try await b suspension point. The Error thrown from computeB is “saved” by the async let b child task until you request the result.

It’s only when you call try await that you can get an Error, and at that point the other child tasks will be cancelled. This is why, in your first scenario, the computeA task runs to completion without being cancelled: computeB has not been asked for its result yet. On the next line, try await b throws the error and cancels any remaining child tasks, but the computeA child task has already finished by then.
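A minimal sketch of the point above, assuming hypothetical names (DemoError, failFast, demo) that are not part of the original code: the child task fails almost immediately, yet the parent keeps running and only sees the error at its own try await.

```swift
// Hypothetical error type, for illustration only.
struct DemoError: Error {}

// Hypothetical child work that fails right away.
func failFast() async throws -> Int {
    throw DemoError()
}

func demo() async -> [String] {
    var log: [String] = []

    // The child task starts here. No `try` is needed on this line.
    async let value = failFast()

    // Give the child time to fail. Nothing is thrown into the parent yet.
    try? await Task.sleep(nanoseconds: 100_000_000)
    log.append("parent still running")

    do {
        // The stored error is delivered only at this suspension point.
        _ = try await value
        log.append("got value")
    } catch {
        log.append("error surfaced at await")
    }
    return log
}
```

Calling demo() should return the "parent still running" entry followed by "error surfaced at await", since the error has nowhere to go until the result is requested.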

Orup70 · June 30, 2022, 7:07am

Each async let spawns off a new child task.

ibex10 · June 30, 2022, 7:19am

Thank you!

somu · June 30, 2022, 7:20am

@Orup70 thanks a lot for the detailed explanation.

That explains why computeA wasn’t informed of the cancellation in scenario 1.

ibex10 · June 30, 2022, 7:44am

This is going to sound dumb, but what is cancelling the computeA task?

The cancellation seems to occur randomly.

-----------
parent thread is not blocked
--> computeA() <NSThread: 0x107204f70>{number = 2, name = (null)}
--> computeB() <NSThread: 0x1007123c0>{number = 3, name = (null)}
computeB going to throw
computeA was cancelled at index 31370
computeA completed
error: dummy
-----------
...
-----------
parent thread is not blocked
--> computeA() <NSThread: 0x107104f70>{number = 2, name = (null)}
--> computeB() <NSThread: 0x1072040f0>{number = 3, name = (null)}
computeB going to throw
computeA was cancelled at index 33794
computeA completed
error: dummy
-----------

ibex10 · June 30, 2022, 8:07am

Answering my own question here for the benefit of learners

The computeA task is automatically cancelled by the runtime when the computeB task throws the error. I am quite surprised by this behaviour.

Dante-Broggi · June 30, 2022, 8:21am

When the parent task prepares to return (due to rethrowing the error from task B), it cancels task A, and then awaits it, then finishes returning (rethrowing the error).
This ensures that there are no child tasks (async lets) created in the current scope which remain active after the function returns.
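Here is a small sketch of that sequence, under the assumption that an actor is an acceptable way to record the order of events. Recorder, slowWorker, failingWorker, parent and run are hypothetical names, not from the original post.

```swift
// Hypothetical helper that collects events in the order they happen.
actor Recorder {
    private(set) var events: [String] = []
    func add(_ event: String) { events.append(event) }
}

struct DemoError: Error {}

// Cooperative child: checks for cancellation between short sleeps.
// The loop is bounded so it terminates even if it is never cancelled.
func slowWorker(_ recorder: Recorder) async -> Int {
    for _ in 0..<1_000 {
        if Task.isCancelled {
            await recorder.add("worker saw cancellation")
            return 0
        }
        try? await Task.sleep(nanoseconds: 10_000_000)
    }
    await recorder.add("worker finished normally")
    return 1
}

func failingWorker() async throws -> Int {
    throw DemoError()
}

func parent(_ recorder: Recorder) async throws {
    async let a = slowWorker(recorder)
    async let b = failingWorker()
    _ = try await b   // rethrows; leaving the scope cancels `a`, then awaits it
    _ = await a       // not reached when `b` throws
}

func run() async -> [String] {
    let recorder = Recorder()
    do {
        try await parent(recorder)
    } catch {
        await recorder.add("parent threw")
    }
    return await recorder.events
}
```

The worker's entry is recorded before "parent threw", because the parent cannot finish rethrowing until the cancelled child has actually ended.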

Orup70 · June 30, 2022, 9:00am

Just to clarify, the runtime is not cancelling the computeA task exactly when the computeB task throws the error, but when the parent task returns from the try await b expression and needs to return from the compute function by rethrowing the error.

It’s not completely obvious at first, but it’s the foundation of the structured concurrency model. The task model is cooperative, and it’s guaranteed that a task will not end before all of its child tasks have completed. Even if a child task is cancelled, it will not end early unless it checks for cancellation.

In your code the computeA and computeB tasks are running concurrently and the runtime can only interfere at the suspension points of await a and try await b plus the implicit suspension point when the parent task is returning from the compute function. When returning from the function all child tasks must be awaited before the task can return.
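To illustrate the cooperative part, here is a sketch with a child that never checks Task.isCancelled, similar to computeA with the check removed. The names (Recorder, stubbornWorker, and so on) are hypothetical. Whether the cancellation flag is set before the loop ends depends on timing, so the sketch only relies on the ordering guarantee.

```swift
// Hypothetical helper that collects events in the order they happen.
actor Recorder {
    private(set) var events: [String] = []
    func add(_ event: String) { events.append(event) }
}

struct DemoError: Error {}

// Child that ignores cancellation: it has no Task.isCancelled check,
// so being cancelled does not shorten its work.
func stubbornWorker(_ recorder: Recorder) async -> Int {
    var sum = 0
    for index in 1...1_000_000 {
        sum &+= index
    }
    await recorder.add("worker ran to completion")
    return sum
}

func failingWorker() async throws -> Int {
    throw DemoError()
}

func parent(_ recorder: Recorder) async throws {
    async let a = stubbornWorker(recorder)
    async let b = failingWorker()
    _ = try await b   // throws; the scope still waits for `a` to end
    _ = await a       // not reached when `b` throws
}

func run() async -> [String] {
    let recorder = Recorder()
    do {
        try await parent(recorder)
    } catch {
        await recorder.add("parent threw")
    }
    return await recorder.events
}
```

Even though the parent wants to leave as soon as b throws, the worker's entry comes first: the parent has to wait for the full loop because the child never cooperates.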

Orup70 · June 30, 2022, 9:36am

As Dante-Broggi explained better above.

When I try to understand code like your example, I imagine the async let a and async let b as spawning off two independent child tasks with the variables a and b acting as task handles.

If taskB is throwing an error this will not affect any code until I try to get the result from the task handle in form of the variable b.

“You can only get an error if you try”

It’s also a very good argument for having the try marker in normal (synchronous) code, even if it makes the code harder to read. Only lines with a try marker can throw. And in asynchronous code, an error can only reach you at specific suspension points marked with try await. Asynchronously thrown errors would be a nightmare to handle or understand.

It’s also an argument for why the async let b requires the try marker when the result of the task is retrieved with try await b and not when the task is spawned at the async let b = … statement. Because the error is received at the try await b suspension point and not when the task is spawned, nor asynchronously when the error is thrown.

somu · June 30, 2022, 10:00am

@Orup70 @Dante-Broggi

Thanks a lot for the explanations, I am beginning to understand better.

I also decided to test a scenario 3, which is scenario 2 plus catching the error inside compute, as follows:

Scenario 3

func compute() async throws {
    
    async let a = computeA()
    async let b = computeB()
    
    print("parent thread is not blocked")
    
//  Scenario 1 (doesn't cancel computeA)
//    print("a = \(await a)")
//    print("b = \(try await b)")

//  Scenario 3 (works as expected)
    do {
        print("b = \(try await b)")
    } catch {
        print("compute caught error: \(error)")
    }
    print("a = \(await a)")
}

Output:

-----------
parent thread is not blocked
computeB going to throw
compute caught error: dummy
computeA completed
a = 50000005000000
-----------

So cancellation of a child task happens only when the parent task needs to end (by returning or by throwing an error) while it still has active child tasks.

Even a return statement doesn't return till all child tasks have ended.
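The return case can be sketched the same way. This assumes hypothetical names (Recorder, worker, parent, run) and a child that is started with async let but never awaited explicitly.

```swift
// Hypothetical helper that collects events in the order they happen.
actor Recorder {
    private(set) var events: [String] = []
    func add(_ event: String) { events.append(event) }
}

// Cooperative child: checks for cancellation between short sleeps.
// The loop is bounded so it terminates even if it is never cancelled.
func worker(_ recorder: Recorder) async -> Int {
    for _ in 0..<1_000 {
        if Task.isCancelled {
            await recorder.add("worker saw cancellation")
            return 0
        }
        try? await Task.sleep(nanoseconds: 10_000_000)
    }
    await recorder.add("worker finished normally")
    return 1
}

func parent(_ recorder: Recorder) async -> String {
    // Started but never awaited explicitly in this function.
    async let pending = worker(recorder)
    // A plain return: the child is cancelled and awaited before it completes.
    return "done"
}

func run() async -> [String] {
    let recorder = Recorder()
    let result = await parent(recorder)
    await recorder.add("parent returned \(result)")
    return await recorder.events
}
```

The worker's cancellation entry appears before "parent returned done", which matches the observation that return does not complete until the child has ended. The compiler may warn that pending is never used; that is intentional here.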

Thanks a lot guys for the wonderful explanation

ibex10 · June 30, 2022, 11:47am

Thank you.