Task Scheduling and Cancellation

StarLard · November 5, 2022, 1:02am

The Structured Concurrency proposal outlines that tasks go through the following lifecycle: suspended -> running -> completed, where a task may alternate any number of times between running and suspended at suspension points within the task. For example, the following task's lifecycle may look like:

Task { // (1)
     try Task.checkCancellation() // (2)
     try await asyncBar() // (3)
     // (4)
     foo()
}
// (5)

(1) Task is suspended and schedulable; it is ready to run and is waiting for the system to execute it
(2) Task is running; it is currently running on a thread
(3) Task may be suspended or may stay running, depending on the implementation of asyncBar(). From the proposal: "when an asynchronous function calls another asynchronous function, we say that the calling function is suspended, but that doesn’t mean the entire task is suspended. From the perspective of the function, it is suspended, waiting for the call to return. From the perspective of the task, it may have continued running in the callee, or it may have been suspended in order to, say, change to a different execution context." For the sake of example, let's say the task is suspended.
(4) Task is running again
(5) Task is completed

To prevent asyncBar() and foo() from being called after our task is cancelled, you must check for cancellation on line 2, because our task may have been cancelled before ever beginning to execute. If asyncBar() in the example cooperatively supports cancellation, then cancellations that occur while the task is running that method will be handled there and surfaced to the task body, typically as a thrown error.
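To make "cooperatively supports cancellation" concrete, here is a minimal sketch. The body of asyncBar() is an assumption (the thread never shows it); I've used Task.sleep(nanoseconds:) because it throws CancellationError when the current task is cancelled. The helper asyncBarThrowsWhenCancelled() is hypothetical and only exists to run asyncBar() inside an already-cancelled child task.

```swift
// Hypothetical cooperative implementation of asyncBar(): Task.sleep throws
// CancellationError if the current task is (or becomes) cancelled.
func asyncBar() async throws {
    try await Task.sleep(nanoseconds: 50_000_000)
}

// Hypothetical helper: runs asyncBar() in a child task that is cancelled from
// the start and reports whether that surfaced as a CancellationError.
func asyncBarThrowsWhenCancelled() async -> Bool {
    await withTaskGroup(of: Bool.self) { group in
        group.cancelAll() // children added after this point start out cancelled
        group.addTask {
            do {
                try await asyncBar()
                return false
            } catch is CancellationError {
                return true
            } catch {
                return false
            }
        }
        return await group.next() ?? false
    }
}
```

Because the error is thrown out of the `try await asyncBar()` call, nothing after that call in the task body runs.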

However, what happens to cancellations that occur between asyncBar() returning and the task moving back from the suspended state to the running state? To fully prevent any chance of foo() running, is it necessary to check for cancellation again after returning from asyncBar() to account for cancellations that may occur while the task is schedulable? For example, in order to fully cover our bases, must we do:

Task {
     try Task.checkCancellation()
     try await asyncBar()
     try Task.checkCancellation()
     foo()
}

While this hypothetical window is likely only fractions of a second, does such a window exist and is the second cancellation check necessary?
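One way to probe this window is to simulate it. In the sketch below, every name is hypothetical: asyncBarIgnoringCancellation() stands in for an asyncBar() that never checks cancellation, and it marks the current task as cancelled just before returning (via withUnsafeCurrentTask), as if cancel() had been called from elsewhere at that exact moment. Returning true stands in for foo() having run.

```swift
// Marks the task that is currently running as cancelled. Kept synchronous so
// it only uses the synchronous form of withUnsafeCurrentTask.
func cancelCurrentTask() {
    withUnsafeCurrentTask { current in
        guard let current = current else { return }
        current.cancel()
    }
}

// Hypothetical stand-in for asyncBar() that ignores cancellation. The cancel
// right before returning simulates a cancellation that arrives late.
func asyncBarIgnoringCancellation() async {
    await Task.yield()
    cancelCurrentTask()
}

// Returns true if the code standing in for foo() ran.
func fooRan(recheckAfterAwait: Bool) async -> Bool {
    let task = Task<Bool, Error> {
        try Task.checkCancellation()
        await asyncBarIgnoringCancellation()
        if recheckAfterAwait {
            try Task.checkCancellation() // the second check from the question
        }
        return true // stands in for foo()
    }
    return (try? await task.value) ?? false
}
```

Under these assumptions, fooRan(recheckAfterAwait: false) reports that foo() ran even though the task was already marked cancelled, while fooRan(recheckAfterAwait: true) stops before it.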

MPLewis · November 5, 2022, 2:08am

From the perspective of this task, yes - if you wanted to completely ensure that no future child call is executed on cancellation, you need to check between each of them, as the task may be marked cancelled at any time, including between the child call returning and the next statement executing. Additionally, who knows how the implementation of asyncBar() may change over time with respect to cancellation checking, so putting the control flow in the hands of the task body is (to me) just a good style and future-proofing choice.

However, in my view, this fine-grained cancellation checking is rarely necessary. I generally only check cancellation:

Before operations with important side-effects (writing to a file/database/shared state, etc.)
Before operations that are "long-running" (with the exact definition incredibly dependent on your particular app's tolerance for the time spent waiting on running tasks when exiting a scope)
Between significant checkpoints in a task's logic (more of a style choice, just to have a somewhat regular interval of checking if I have a particularly complicated task)
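As a rough illustration of the first kind of check in the list above, here is a sketch that checks right before each side effect. RecordStore, importRecords(_:into:), and writesAfterCancellation() are hypothetical names invented for this example; the actor stands in for a file, database, or other shared state.

```swift
// Hypothetical shared state whose writes are the "important side effect".
actor RecordStore {
    private(set) var records: [Int] = []
    func write(_ record: Int) { records.append(record) }
}

func importRecords(_ records: [Int], into store: RecordStore) async throws {
    for record in records {
        // Checkpoint: stop before the next side effect if we were cancelled.
        try Task.checkCancellation()
        await store.write(record)
    }
}

// Runs the import in a child task that is cancelled from the start and
// returns how many writes happened anyway.
func writesAfterCancellation() async -> Int {
    let store = RecordStore()
    await withTaskGroup(of: Void.self) { group in
        group.cancelAll() // children added after this point start out cancelled
        group.addTask {
            _ = try? await importRecords([1, 2, 3], into: store)
        }
    }
    return await store.records.count
}
```

With the check placed directly before the write, a task that is already cancelled performs no writes at all.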

Other than side effects, the worst that happens if you don't check for cancellation at all is that your task simply runs to completion even after it's been marked cancelled, and delays the exiting of the scope it was defined in (since it will be implicitly awaited at that point, cancelled or not).
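A small sketch of that "worst case", with a hypothetical function name: the task below has suspension points but never checks for cancellation, so cancel() only sets the flag and the work still finishes. It uses an unstructured Task for brevity; a child task created with async let or a task group behaves the same way in this respect.

```swift
func sumIgnoringCancellation() async -> Int {
    let task = Task<Int, Never> {
        var total = 0
        for value in 1...5 {
            await Task.yield() // a suspension point, but not a cancellation check
            total += value
        }
        return total
    }
    task.cancel()           // marks the task cancelled; nothing observes it
    return await task.value // still waits for the full result
}
```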

I'll use a concrete example here of running a backend application using an AWS Lambda: if I had two parallel tasks and one of them is awaited on and throws an error, that error will already be the thing reflected in the output to the caller even though the second task still needs to be waited on before that. By checking for cancellation more frequently in the other task, I could speed up the response to the caller, but at the expense of more code clutter - thus, it really becomes about how important it is in the context of your application to actually cancel the task instead of letting it run for longer than it might need to.

StarLard · November 7, 2022, 6:42pm

Thanks so much for clarifying this behavior! Makes total sense. In addition to the ones you listed, my largest concern with regard to cancellation is artificially extending the lifetime of strongly captured objects in the task. Since Tasks implicitly capture, it's very easy to accidentally create a scenario where a task strongly captures an object, such as the view controller it was started from, and is cancelled somewhere like viewDidDisappear(_:). In such a scenario, failure to correctly handle cancellation may artificially keep the captured object alive for quite a while, depending on how much additional work the task does.

MPLewis · November 7, 2022, 7:47pm

Yep, that would be another instance where it might make sense to check more frequently for cancellation.

Additionally, depending on your exact usage, it might also make sense to make use of weak capture lists (see the "Weak capturing" section in this article for an example) so that the task never even (strongly) captures the value in the first place. You'd have to implement handling for when the reference goes nil, but that should be less code clutter than repeated cancellation checking.
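For reference, a weak capture might look roughly like the sketch below. Screen and refreshTitle(of:loadingWith:) are hypothetical; Screen is an actor standing in for something like a view controller, and the load closure stands in for the long-running work.

```swift
// Hypothetical stand-in for an object such as a view controller.
actor Screen {
    private(set) var title = ""
    func setTitle(_ newTitle: String) { title = newTitle }
}

// Returns a task whose value says whether the title was applied.
func refreshTitle(
    of screen: Screen,
    loadingWith load: @escaping @Sendable () async -> String
) -> Task<Bool, Never> {
    Task { [weak screen] in
        // Only a weak reference is held while the slow work runs.
        let title = await load()
        // Promote to a strong reference once the result is ready, and handle
        // the case where the object has already gone away.
        guard let screen = screen else { return false }
        await screen.setTitle(title)
        return true
    }
}
```

The task never extends the object's lifetime while load() is running; the only extra code is the guard for the nil case.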