What's the expected behavior of `[Throwing]TaskGroup.next()` on cancellation?

I searched through the docs, the WWDC sessions, and the original proposals, but I couldn't find an explicit statement of the expected behavior of the next() method when the task group's parent task gets cancelled.

Some of the related implementation looks like this:

public mutating func next() async -> ChildTaskResult? {
  return try! await _taskGroupWaitNext(group: _group)
}

/// Await all of the remaining tasks on this group.
@usableFromInline
internal mutating func awaitAllRemainingTasks() async {
  while let _ = await next() {}
}

  • How does _taskGroupWaitNext(group:) actually behave?
  • A) Will it return nil immediately when the group is cancelled?
  • B) Will the cancellation be forwarded to all sub-tasks, with awaitAllRemainingTasks still awaiting every task until it finishes, hence draining all sub-tasks before with[Throwing]TaskGroup returns?

I'm personally hoping that the expected behavior is (B) and not (A). I want the group to wait until everything is cancelled AND finished before with[Throwing]TaskGroup returns.

The only clue I currently have is a postcondition: once with[Throwing]TaskGroup returns, the group is guaranteed to be empty. However, that does not fully confirm that each child is handled internally in roughly this way:

task.cancel()
await task.value

Alternatively, the sub-task could simply be cancelled and its result discarded without being awaited.


To further demonstrate what I'm looking for here's an example of a pseudo Void task group which never fails:

let task = Task<Void, Never> {
  let subTask1 = Task<Void, Never> { ... }
  let subTask2 = Task<Void, Never> { ... }

  await withTaskCancellationHandler {
    // the order is not important 
    await subTask1.value
    await subTask2.value
  } onCancel: {
    subTask1.cancel()
    subTask2.cancel()
  }
}

// cancel in the future
task.cancel()

Hmm, it looks like the answer is still (A), which is somewhat unfortunate. The documentation for cancelAll() says:

/// Cancel all of the remaining tasks in the group.
///
/// After cancellation,
/// any new results from the tasks in this group
/// are silently discarded.
public func cancelAll()

A quick experiment suggests that B is what happens. Consider this code:

func child() async {
    await withThrowingTaskGroup(of: Void.self) { group in
        group.addTask {
            print("Running first")

            // This task will block forever.
            let _: Void = await withUnsafeContinuation() { _  in }
        }

        group.addTask {
            print("Running second")

            // This should throw.
            do {
                try await Task.sleep(nanoseconds: 1_000_000_000_000)
            } catch {
                print("Task 2 threw: \(error)")
                throw error
            }
        }

        group.addTask {
            // This returns immediately.
            print("Running third")
            return
        }

        errorHandlingLoop: while true {
            do {
                while let _ = try await group.next() {
                    print("Got result in parent")
                }
                break errorHandlingLoop
            } catch {
                print("Threw in parent \(error)")
                continue errorHandlingLoop
            }
        }

        print("Parent loop stopped")
    }
    print("Exited task group")
}

func main() {
    Task {
        let t = Task { await child() }  // child() does not throw, so no `try` is needed
        print("Started child task, sleeping")
        try await Task.sleep(nanoseconds: 10_000_000_000)
        print("Cancelling child task")
        t.cancel()
        try await Task.sleep(nanoseconds: 10_000_000_000)
        print("Sleep finished in parent")
    }
}

This prints the following output on my machine:

Started child task, sleeping
Running first
Running second
Running third
Got result in parent
Cancelling child task
Task 2 threw: CancellationError()
Sleep finished in parent

Note that we get two results from the loop, not three. The child task we wedged causes next() to hang indefinitely.

More importantly, it is a vital invariant of TaskGroups that when their scope exits all the child tasks must have completed, either by erroring or returning a value. Whether the value is handled in the parent is less important, but the child tasks must not outlive the parent scope. Any failure to achieve that is IMO a bug in Swift.


Oh, that's an interesting example. I need to play around with this a bit more. The official documentation lacks a proper explanation here, especially because AsyncSequences usually terminate by returning nil from next(), but we kinda don't want that here, and the cancelAll() documentation also seems to suggest something misleading.

TaskGroup.cancelAll() is important when you write a task group that only needs some of its child tasks to finish in order to deliver a result. You should call cancelAll() before returning the result to tell the outstanding child tasks to cancel.

Examples of such task groups:

  • Running an async function with a timeout
  • A race function that races two async functions and returns the result of the one that finishes first (see example below)
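
As a rough illustration of the timeout case, here is a minimal sketch. The names withTimeout and TimedOut are hypothetical and not part of the standard library; the sketch only assumes the group APIs addTask, next(), and cancelAll(). Whichever child finishes first decides the outcome, and cancelAll() tells the other child to stop.

```swift
// Hypothetical error type used only in this sketch.
struct TimedOut: Error {}

// Hypothetical helper: runs `operation`, but gives up after `nanoseconds`.
func withTimeout<T: Sendable>(
    nanoseconds: UInt64,
    operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        // Child 1: the actual work.
        group.addTask { try await operation() }
        // Child 2: the timer. It never produces a value, it only throws.
        group.addTask {
            try await Task.sleep(nanoseconds: nanoseconds)
            throw TimedOut()
        }
        // Cancel whichever child is still outstanding when we leave the scope.
        defer { group.cancelAll() }
        // next() returns nil only for an empty group, which cannot happen here;
        // the fallback throw just avoids a force unwrap.
        guard let first = try await group.next() else { throw TimedOut() }
        return first
    }
}
```

The group still waits for the cancelled child before returning, so the timeout is only as prompt as the operation's own reaction to cancellation.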

If you forget to call TaskGroup.cancelAll(), the outstanding child tasks will run to completion even though their result will be ignored. It’s very easy to forget because forgetting the call will not change the behavior of your program, only its performance, making the bug hard to find.

@s-k brought this up during the Structured Concurrency review: SE-0304 (3rd review): Structured Concurrency - #5 by s-k

Note that TaskGroup does cancel outstanding child tasks if it exits abnormally, i.e. by throwing.

Another example where the cancelAll() call is missing is actually in SE-0317 async let:

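func race(
  left: @escaping @Sendable () async -> Int,
  right: @escaping @Sendable () async -> Int
) async -> Int {
  await withTaskGroup(of: Int.self) { group in
    // group.async was later renamed to addTask
    group.addTask { await left() }
    group.addTask { await right() }

    // Note: cancelAll() is never called here.
    return await group.next()! // !-safe, there is at least one result to collect
  }
}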

This function will always wait for both child tasks to run to completion, even though it only needs the result of the first one to the finish line.
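
For comparison, a sketch of the same race with the cancelAll() call added might look like this. It assumes the current addTask spelling and @Sendable closure parameters, which differ from the proposal text.

```swift
func race(
    left: @escaping @Sendable () async -> Int,
    right: @escaping @Sendable () async -> Int
) async -> Int {
    await withTaskGroup(of: Int.self) { group in
        group.addTask { await left() }
        group.addTask { await right() }

        // Safe to force-unwrap: two children were added, so there is a result.
        let winner = await group.next()!
        // Tell the loser to stop; it is still awaited before the group returns.
        group.cancelAll()
        return winner
    }
}
```

The group still waits for the losing child, but a child that checks for cancellation (for example via Task.sleep or Task.checkCancellation()) can now finish early instead of running to completion.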

That's the thing. The documentation on the cancellation behavior is somewhat misleading or incomplete. Even you just said that the result of the child tasks will be ignored. Why though? If you cancel all the tasks, they don't have to throw; they can still complete normally, and I'd expect their results to be passed through the async sequence iterator rather than next() returning nil.

If we look at other AsyncSequence types, it seems they track whether the parent task gets cancelled; when it does, they stop emitting values and complete by returning nil. In other words, the sequence also cancels itself.

TaskGroup seems to function differently. If the parent task is cancelled, all sub-tasks are notified to cancel, but they can still return a value, and those values are still consumed through the async sequence the TaskGroup provides.
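
A small sketch of that observation, using an explicit cancelAll() instead of a cancelled parent task to keep it deterministic. The function name collectAfterCancel is made up for illustration.

```swift
// Hypothetical helper: cancels the group first, then iterates it.
func collectAfterCancel() async -> [Int] {
    await withTaskGroup(of: Int.self) { group in
        for id in 1...3 {
            group.addTask {
                // Task.sleep throws once the child is cancelled; the error is
                // swallowed and the child still returns a value.
                try? await Task.sleep(nanoseconds: 60_000_000_000)
                return id
            }
        }

        group.cancelAll()

        // The cancelled children's values still arrive through the iterator.
        var results: [Int] = []
        for await value in group {
            results.append(value)
        }
        return results.sorted()
    }
}
```

If next() returned nil as soon as the group was cancelled, this would return an empty array; in the experiments described in this thread, the values of cancelled children are still delivered.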

This is the exact behavior that I was personally looking for. I want to spawn some child tasks (with Void as their return type) and on cancellation of the parent task those child tasks should cancel and complete in any order before the withTaskGroup function itself resumes.
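
Here is a minimal sketch of that setup, assuming the children cooperate with cancellation. Counter and runWorkers are hypothetical names introduced only for this example.

```swift
// Hypothetical actor that records how many children ran to the end.
actor Counter {
    private(set) var finished = 0
    func markFinished() { finished += 1 }
}

// Hypothetical helper: spawns `count` Void children and waits for all of them.
func runWorkers(count: Int, counter: Counter) async {
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<count {
            group.addTask {
                // Stands in for long-running work; throws when cancelled.
                try? await Task.sleep(nanoseconds: 60_000_000_000)
                // Cleanup that still runs after cancellation.
                await counter.markFinished()
            }
        }
        // No explicit loop: the group awaits every remaining child on exit.
    }
}

// Cancels the parent shortly after start and reports how many children
// had finished by the time the parent task itself completed.
func cancelAndCount() async -> Int {
    let counter = Counter()
    let task = Task { await runWorkers(count: 3, counter: counter) }
    try? await Task.sleep(nanoseconds: 100_000_000)
    task.cancel()
    await task.value
    return await counter.finished
}
```

By the time task.value returns, all three children have run their cleanup, which matches behavior (B) from the original question.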