[Pitch] Task Cancellation Shields

Good day everyone,

Making use of the week where development branches are locked, I’d like to pre-flight a pitch which is a follow-up to the async deinit we’ve done recently as part of SE-0493: Support async calls in defer bodies: task cancellation shields.

In that proposal’s review there was a discussion about “shielding” pieces of code from observing the effects of task cancellation. This proposal offers an API to do that.

Task cancellation shields allow a piece of code to run while ignoring the effects of cancellation. They do not prevent a task from being cancelled; however, they do prevent the task from “observing” the cancellation (either through isCancelled or through cancellation handlers).

Please refer to the full proposal text over here: SE-NNNN: Task Cancellation Shields

Please submit any typos or editorial fixes to the proposal pull request and let’s keep this thread focused on functionality and semantics.

Thanks!


+1, this is one of the core requirements for building Structured Concurrency code. Today we rely on Task { try await cleanup() }.value, which is bad for the reasons stated in the proposal.
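For readers who have not seen the workaround, here is a minimal sketch of it using only APIs that exist today. The cleanup() function and both helper names are hypothetical stand-ins, not code from the proposal. It relies on the fact that an unstructured Task does not inherit the cancelled state of the task that creates it.

```swift
// Hypothetical cleanup function standing in for real resource teardown.
// Like much real-world async code, it bails out if it observes cancellation.
func cleanup() async throws -> String {
    try Task.checkCancellation()
    return "cleaned up"
}

// The workaround: hop to an unstructured task and await its result.
// The new task starts out not cancelled, even if the current task is.
func cleanupIgnoringCancellation() async throws -> String {
    try await Task { try await cleanup() }.value
}

// Runs both variants from inside a task that is already cancelled.
func demonstrateWorkaround() async -> (String, String) {
    let task = Task { () -> (String, String) in
        // Wait until the cancellation issued below has been delivered.
        while !Task.isCancelled { await Task.yield() }
        let direct = (try? await cleanup()) ?? "skipped"
        let viaUnstructuredTask = (try? await cleanupIgnoringCancellation()) ?? "skipped"
        return (direct, viaUnstructuredTask)
    }
    task.cancel()
    return await task.value
}
```

Called directly, cleanup() sees the cancellation and is skipped; routed through the unstructured task, it runs to completion.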

@ktoso just to double-check. The task hierarchy inside of a cancellation shield can also be cancelled, right?

A very common pattern is

defer {
   // might be cancelled from surrounding tasks
   try await withTaskCancellationShield {
      // deffo NOT cancelled when we get here
      try await withTimeout(.seconds(10)) {
           // might be cancelled (e.g. if the timeout fires)
           try await decommissionResource() // often uses HTTP or other stuff that can time out
      }
   }
}

Strong +1.

As mentioned in the proposal, this is very important, both so that async code that might react to cancellation can be used for cleanup, and because the runtime can implement this more cleanly and efficiently than manual workarounds.

I am also very glad that the proposal includes a way to test if a cancellation shield is currently active and to test for cancellation ignoring a cancellation shield.


You’d have to show the implementation of withTimeout to answer that: if it uses a child task to run that block of code then yes; if not, then as proposed it cannot observe cancellation, because “the current task is shielded”.

Since this is an imaginary withTimeout, I’d rather you spell out what you actually mean there.


+1 on the proposal. As @johannesweiss mentioned this is necessary to write structured concurrency code when managing resources. As it stands today, using an unstructured task for blocking the cancellation of a clean-up is the only workaround and it comes with a lot of downsides.


just to clarify – is this meant to read '...is a follow-up to the async defer...', or is there something about async deinit that was involved in the async defer proposal or its implementation that is relevant here?


It would have a child task that it cancels when the timeout has been reached. Is there another way you could implement withTimeout?
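To make the discussion concrete, this is roughly the shape I have in mind. It is only a sketch: withTimeout and TimeoutError are hypothetical names, not standard library or proposal API. The operation runs in one child task, a timer runs in a second one, and whichever finishes first decides the outcome.

```swift
// Hypothetical error type for this sketch.
struct TimeoutError: Error {}

// Hypothetical helper built on a throwing task group.
func withTimeout<T: Sendable>(
    _ timeout: Duration,
    operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        // Child task 1: the actual work.
        group.addTask { try await operation() }
        // Child task 2: the timer, which throws once the timeout elapses.
        group.addTask {
            try await Task.sleep(for: timeout)
            throw TimeoutError()
        }
        // Explicitly cancel whichever child is still running on the way out.
        defer { group.cancelAll() }
        // The first child to finish either returns the result or throws.
        guard let first = try await group.next() else {
            throw CancellationError()
        }
        return first
    }
}
```

The cancelAll() here is an explicit cancellation of the group's own children, so it does not depend on the cancelled state of the surrounding task. Note that the operation still has to cooperate with cancellation for the timeout to take effect promptly.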


First thing I looked for was whether we could check for the shield, which you added two clean APIs for! 🙂 They seemed ideal.

My only concern is that it's yet another piece of cognitive overhead for Tasks, but frankly, I don't have any better proposals, and the current lack of this functionality adds its own equal-or-worse overhead to reason through and work around.

Should this be

guard !Task.isCancelled else {

In this code from the proposal:

extension SomeSystem { 
  func performAction(_ action: some SomeAction) { 
    guard Task.isCancelled else {
      // oh no! 
      // If Resource.cleanup calls this while being in a cancelled task,
      // the action would never be performed!
      return 
    }
    // ... 
  }
}

Thanks for spelling this out, I think it’s important to be very specific when we discuss semantics here.

Yeah, child tasks are able to be cancelled independently. It’s kinda implied in the proposal but I’ll add an explicit example of this.

If it were implemented by the runtime we could imagine other ways; so I want to be very clear which one you were asking about, just to make sure we’re discussing a child task there.

Yes, you can submit a fix to the PR or I’ll get to it at some point – thank you

Can this concept be generalized to other task state such as task locals? (Should it be generalized?)

Cancellation is something that generally comes from outside the task, which is why the task may usefully want to shield some of its operations from it. Priority escalation is in the same bucket, except I can’t think of any reason a task would want to exempt itself from that. All of the other task state is something the task itself asks for; if the task wants a task local to be different within a scope, it should just change it normally.


what's supposed to happen in a case like this:

await withTaskCancellationShield {
  do {
    async let _ = {
      var count = 0
      while true {
        count += 1; print("count: ", count)
        if Task.isCancelled { break }
        try? await Task.sleep(for: .seconds(1))
      }
    }()
  } // implicit child task cancelled here... is it observable?
}

edit: and another case that i think isn't directly mentioned in the current proposal text – what's the impact on Task.checkCancellation()? i would assume it just becomes a noop if under an active 'shield'?


This is an “explicit cancellation” for what it’s worth, so yes this task is cancelled and would not hang.

Shields prevent cancellation from “passing through” the shield, if you will, so if the outside task was cancelled, the inner child task would not be. This, however, is the inner task being explicitly cancelled; this isn’t the same as cancellation propagating from parent task to child task.

It’s a good example worth adding to the proposal so that people can be clear about the intended behavior, thanks for adding this to the discussion.


Anywhere the proposal describes an effect on isCancelled, the same applies to checkCancellation, because all that check does is “isCancelled → throw”.

I’ll add a paragraph to be clearer about this.
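As a behavioural sketch only (the real implementation lives in the standard library, and both function names below are hypothetical), the “isCancelled → throw” relationship looks like this, together with a small check that it agrees with Task.checkCancellation() on today's toolchain:

```swift
// Behavioural sketch of what checkCancellation amounts to.
func checkCancellationSketch() throws {
    if Task.isCancelled {
        throw CancellationError()
    }
}

// Compares the sketch with the real API inside a task that is
// either cancelled or not, depending on the argument.
func sketchAgreesWithStandardLibrary(cancelled: Bool) async -> Bool {
    let task = Task { () -> Bool in
        if cancelled {
            // Wait until the cancellation issued below has been delivered.
            while !Task.isCancelled { await Task.yield() }
        }
        let sketchThrew = (try? checkCancellationSketch()) == nil
        let realThrew = (try? Task.checkCancellation()) == nil
        return sketchThrew == realThrew && sketchThrew == cancelled
    }
    if cancelled { task.cancel() }
    return await task.value
}
```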


perhaps things have changed since it was written, but the description of this scenario as outlined in the async let evolution proposal is characterized as an 'implicit' cancel (and i guess that's how i've always thought of the behavior).

thanks, this was a very helpful clarification. IIUC then, this feature is purely about influencing automatic propagation of the cancelled state through the Task tree. i suppose my mental model of structured concurrency has been that the cancelled state for the entire tree is the same, so there's a singular 'true' value, and this feature would simply hide that value from certain subtrees in the graph. clearly that model isn't quite right due to how task groups can be independently cancelled. so perhaps a more accurate model is that each 'node' of the tree has an independent cancelled bit, and the default behavior just propagates the value from ancestors to children in a 'natural' way. this feature gives a means of 'opting out' of that propagation (apologies if this is obvious to everyone else, mostly writing it out for myself...).
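to check that "independent cancelled bit per node" model against behavior that exists today (no shields involved), here is a small sketch; the type and function names are made up for illustration. cancelling only a task group marks its children as cancelled while the parent stays uncancelled, whereas cancelling the parent propagates down to the children.

```swift
// Hypothetical result type for this sketch.
struct CancelledBits {
    var parent: Bool
    var child: Bool
}

// Cancels only the group: the child's bit flips, the parent's does not.
func cancelGroupOnly() async -> CancelledBits {
    await withTaskGroup(of: Bool.self, returning: CancelledBits.self) { group in
        group.addTask {
            // Spin cooperatively until this child sees its own cancelled bit.
            while !Task.isCancelled { await Task.yield() }
            return Task.isCancelled
        }
        group.cancelAll() // explicit cancellation of the children only
        let parentBit = Task.isCancelled
        let childBit = await group.next() ?? false
        return CancelledBits(parent: parentBit, child: childBit)
    }
}

// Cancels the parent: the default propagation reaches the child.
func cancelParentTask() async -> Bool {
    let parent = Task {
        await withTaskGroup(of: Bool.self, returning: Bool.self) { group in
            group.addTask {
                while !Task.isCancelled { await Task.yield() }
                return true // only reached once cancellation has propagated
            }
            return await group.next() ?? false
        }
    }
    parent.cancel()
    return await parent.value
}
```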

so... at the risk of being redundant, going back to an async let example; if we had code like this:

let task = Task {
  await withTaskCancellationShield {
    async let whatever = {
      await withTaskCancellationHandler { /* wait for a while */ } onCancel: { ... }
    }()
    await whatever
  }
}
task.cancel()

the async let child will not observe the parent task cancellation (it's not propagated due to the shield), but if we didn't explicitly await it, it would observe its own cancellation due to exiting scope without being awaited. is that right?


edit: actually, maybe i still don't quite get it. the proposal currently says:

They do not prevent a task from being cancelled, however, they affect the observation of the cancelled status while executing in a "shielded" piece of code.

in the case of child tasks... is it true that a child is still 'really cancelled' if the parent has been but the child was formed under a 'shield'? i guess i feel i'm floundering a bit with how to properly think about what 'isCancelled' actually means now if it's not 'this region of the Task tree is in the cancelled state'.

Would it not be the case that async let being implicitly cancelled at the end of scope means that there’s no code that could be observing the cancellation? Can implicit cancellation ever be observed?

You’re chicken-and-egging, I think. The async let task can observe that it’s been cancelled, starting when the parent task begins to leave the scope of the async let. The parent task can’t finish leaving the scope until the async let task completes, but that doesn’t mean the task has to have completed already.
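A small sketch of that ordering with today's APIs; the EventLog actor and the function name are hypothetical helpers for illustration. The child keeps running until it sees the cancelled flag, and the parent only finishes leaving the scope after the child has completed.

```swift
// Hypothetical helper that records events in the order they happen.
actor EventLog {
    private(set) var events: [String] = []
    func record(_ event: String) { events.append(event) }
}

func leaveScopeWithoutAwaiting() async -> [String] {
    let log = EventLog()
    do {
        async let _ = {
            // Cooperative loop: runs until the child observes cancellation.
            while !Task.isCancelled { await Task.yield() }
            await log.record("child observed cancellation")
        }()
        await log.record("parent leaving scope")
    } // implicit cancel of the child here, followed by an implicit await
    await log.record("parent left scope")
    return await log.events
}
```

The log ends up as "parent leaving scope", "child observed cancellation", "parent left scope": the child's observation falls between the parent starting to leave the scope and finishing leaving it.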


Would this be made more clear with another name for the API, such as withoutPropagatingTaskCancellation? It seems like there's a lot of struggle with this notion of what a "shield" does, which doesn't seem to need its own term of art here.
