[Pitch] Structured Task Cancellation Tokens

mi11ion · June 8, 2025, 2:42pm

Hello

This is a pitch addressing a significant limitation in Swift's concurrency system: the current binary cancellation model provides no context about why a task was cancelled, making it impossible to implement different cleanup strategies or debug cancellation flows in production systems
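For reference, a minimal sketch of the binary model as it exists today. It uses only the real Task, Task.sleep(nanoseconds:), and CancellationError APIs; the function name runAndCancel is made up for illustration.

```swift
// Today's model: cancellation is a single flag, surfaced as CancellationError.
func runAndCancel() async -> String {
    let task = Task<Int, Error> {
        // Task.sleep throws CancellationError once the task is cancelled.
        try await Task.sleep(nanoseconds: 5_000_000_000)
        return 42
    }

    // There is no parameter here to say "user tapped stop" vs. "deadline passed".
    task.cancel()

    do {
        let value = try await task.value
        return "finished with \(value)"
    } catch is CancellationError {
        // The error carries no payload, so every cause looks identical.
        return "cancelled (reason unknown)"
    } catch {
        return "failed: \(error)"
    }
}
```

Whatever triggered the cancel() call, the catch block sees the same CancellationError, which is the gap the pitch is aimed at.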

My proposal introduces CancellationToken - a lightweight type that provides structured cancellation with explicit reasons, enabling tasks to make informed decisions during cancellation without breaking existing code

You can view the full proposal here: Structured Task Cancellation Tokens

The design integrates cleanly with Swift's existing concurrency primitives through @TaskLocal propagation and maintains full backward compatibility. I'm seeking community feedback before moving forward with implementation
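A rough sketch of what the task-local propagation could look like. Everything here is hypothetical: CancellationToken, its Reason cases other than custom(String), and the helper functions are assumptions made for illustration, not the proposal's actual API. Only @TaskLocal and withValue are real.

```swift
import Foundation

// Hypothetical sketch: these types are NOT part of the standard library.
final class CancellationToken: @unchecked Sendable {
    enum Reason: Equatable, Sendable {
        case userInitiated      // assumed case
        case timeout            // assumed case
        case custom(String)     // mentioned in this thread
    }

    private let lock = NSLock()
    private var storedReason: Reason?

    /// The first reason recorded, or nil while the token is still live.
    var reason: Reason? {
        lock.lock(); defer { lock.unlock() }
        return storedReason
    }

    func cancel(reason: Reason) {
        lock.lock(); defer { lock.unlock() }
        if storedReason == nil { storedReason = reason } // first reason wins
    }
}

enum CancellationContext {
    // Task-local values are inherited by child tasks created inside withValue.
    @TaskLocal static var current: CancellationToken?
}

// Picks a cleanup strategy from the ambient token, if any.
func cleanupStrategy() -> String {
    switch CancellationContext.current?.reason {
    case .userInitiated?: return "discard partial results"
    case .timeout?: return "keep a checkpoint for retry"
    case .custom(let name)?: return "custom handling for \(name)"
    case nil: return "not cancelled"
    }
}

// Binds the token for the duration of the closure, then asks for the strategy.
func strategy(under token: CancellationToken) -> String {
    CancellationContext.$current.withValue(token) { cleanupStrategy() }
}
```

Code that never looks at CancellationContext.current keeps behaving as before, which is the backward-compatibility argument in a nutshell.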

I'm particularly interested in thoughts on:

i. The token propagation semantics through task hierarchies
ii. Use cases I might have missed where cancellation context would be valuable

Looking forward to your questions and feedback!

allenh · June 8, 2025, 6:19pm

Could CancellationToken be a protocol, to allow application-specific extension rather than the stringly typed custom(String)? The stdlib would then have its own concrete SwiftCancellationToken. I suppose reason discovery will be difficult either way, for instance if Foundation wants to add some reasons of its own, or platform frameworks like AVFoundation do.

The protocol would require you to first cast to a reason “domain”, but leaves a lot more room for the library to provide strong type information about what reasons or other details might be available.
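To make the "cast to a domain" step concrete, here is a hedged sketch of the protocol shape. The protocol requirements, VideoCancellationToken, and the recovery(for:) helper are all invented for illustration; only the names CancellationToken and SwiftCancellationToken come from the discussion.

```swift
// Hypothetical protocol-based design; none of these names exist today.
protocol CancellationToken: Sendable {
    var summary: String { get }
}

// What the standard library might ship as its concrete token.
enum SwiftCancellationToken: CancellationToken {
    case explicit
    case timeout

    var summary: String {
        switch self {
        case .explicit: return "explicit cancel()"
        case .timeout: return "timeout"
        }
    }
}

// A framework-owned "domain" with its own strongly typed reasons.
enum VideoCancellationToken: CancellationToken {
    case deviceLost

    var summary: String { "video device lost" }
}

func recovery(for token: any CancellationToken) -> String {
    // Callers must first cast to a domain they know how to handle.
    if let video = token as? VideoCancellationToken {
        switch video {
        case .deviceLost: return "restart the capture session"
        }
    }
    if let standard = token as? SwiftCancellationToken {
        switch standard {
        case .explicit: return "stop quietly"
        case .timeout: return "retry later"
        }
    }
    // Unknown domain: only the protocol's common surface is available.
    return "unhandled: \(token.summary)"
}
```

Inside each cast the switch is exhaustive and fully typed, but nothing tells the caller which domains might show up, which is the discovery problem mentioned above.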

mi11ion · June 8, 2025, 6:41pm

That's a really interesting idea about making it a protocol. I actually spent quite a bit of time exploring that approach since extensibility is definitely vital, especially for frameworks that have their own domain-specific cancellation reasons

The wall I kept hitting was with @TaskLocal propagation. With a protocol, you end up with @TaskLocal static var current: (any CancellationToken)? which introduces existential overhead and makes the type system fight you at every turn. The whole design really depends on tokens flowing seamlessly through task hierarchies.

There's also the practical issue that most real-world code needs to coordinate cancellation across multiple frameworks. Like when you're downloading with URLSession, processing with AVFoundation, and uploading to CloudKit: having three different token types to juggle becomes a real pain. A single concrete type just works

I think the custom(String) case gives us most of what we need for extensibility. Frameworks can still add their strongly-typed reasons through extensions:

extension CancellationToken.Reason {
    static let videoDeviceLost = Self.custom("AVFoundation.videoDeviceLost")
}

It's basically following the same pattern as Error/NSError, pragmatic but effective. Plus, starting with a concrete type keeps our options open. We could always add protocol conformance later if we find we really need it
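A small sketch of how such an extension would behave at the use site. The CancellationToken stand-in below is reduced to the bare minimum, the userInitiated case is an assumption, and the two helper functions are hypothetical.

```swift
// Hypothetical stand-in for the pitched type, reduced to what the example needs.
struct CancellationToken {
    enum Reason: Equatable, Sendable {
        case userInitiated          // assumed built-in case
        case custom(String)
    }
}

// Framework-side extension, as in the post above.
extension CancellationToken.Reason {
    static let videoDeviceLost = Self.custom("AVFoundation.videoDeviceLost")
}

func shouldRestartCapture(after reason: CancellationToken.Reason) -> Bool {
    // Works through Equatable: the constant is just a custom(String) value.
    reason == .videoDeviceLost
}

func domain(of reason: CancellationToken.Reason) -> String? {
    // Recovering the owning framework means parsing the string by convention.
    guard case .custom(let name) = reason,
          let dot = name.firstIndex(of: ".") else { return nil }
    return String(name[..<dot])
}
```

The named constant gives call sites autocomplete and a single spelling, but the identity is still the string: any custom value with the same text compares equal, and a differently cased one silently does not.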

Thanks for the feedback, but I think my current idea hits the sweet spot between type safety and usability

allenh · June 8, 2025, 7:13pm

It might be worth adding a blurb about the protocol approach in “alternatives considered”, as well as specifically calling out an example of extending Reason like you did here!

nikola · June 8, 2025, 8:08pm

Really like the looks of this! I do see a potential small ergonomic improvement here: swap the argument labels for the Task initializers that take a CancellationToken (and the addTask one). As an arbitrary example:

extension Task where Failure == Error {
    /// Create task with cancellation token
    public init(
        priority: TaskPriority? = nil,
        token cancellationToken: CancellationToken, // <--- switched labels here
        operation: @Sendable @escaping () async throws -> Success
    ) {
        self.init(priority: priority) {
            try await CancellationContext.$current.withValue(cancellationToken) {
                try await operation()
            }
        }
    }
}

// Call-site is less verbose:

let token = CancellationToken()
let task = Task(token: token) {}

Cancellation is already encoded in the type name of CancellationToken, so moving the token label in front doesn’t really sacrifice any clarity.

mi11ion · June 8, 2025, 8:14pm

Thanks for catching this. Task(token: token) reads far better than the redundant labeling. I'll revise the proposal accordingly, thank you!

benlings · June 9, 2025, 7:28am

Should the cancellation reason be an Error, instead of following the pattern of Error?

This would be consistent with how other languages manage cancellation reasons. E.g., Go’s equivalent to task locals is Context. This uses the error interface to communicate the cancellation “cause” to child functions: context package - context - Go Packages

mi11ion · June 9, 2025, 8:34am

I considered using Error but believe it conflates distinct concepts. Cancellation is a coordination mechanism, not a failure. User cancellation and network errors require fundamentally different handling

Go has the error interface for cancellation (context.Canceled, DeadlineExceeded, WithCancelCause...), and it works within Go's constraints, but Swift's enums let us preserve richer cancellation metadata without losing type safety

I'd rather keep cancellation reasons separate. It's clearer what's happening, and it fits with Swift's philosophy of letting the type system do the heavy lifting for these distinctions

Joe_Groff · June 9, 2025, 4:50pm

We should be able to do a bit better within the context of structured concurrency. Trio, for instance, uses the concept of cancellation scopes, which run the code inside a scope under certain cancellation conditions, such as a timeout.

This avoids the need to explicitly create and pass around cancellation tokens, since the cancellation state is part of the ambient environment. The design you've proposed using task-local storage to track the cancellation token can do this too; however, we could introduce a variety of cancellation-scope-introducing APIs that each have their own token and reason value(s) to propagate to users. If each scope provides its own token, that could address the concerns about extensibility in this thread, without requiring all tokens and reasons to conform to some centralized protocol.
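As a point of comparison, a timeout scope can already be approximated with a task group, with no token passed around. This is only a sketch: TimeoutError and this withTimeout(nanoseconds:operation:) are hypothetical helpers, not the API discussed here, and the nanoseconds parameter is used just to avoid availability constraints on Duration.

```swift
// Hypothetical error type used to signal that the deadline fired.
struct TimeoutError: Error {}

// Runs operation inside a scope that is cancelled once the time budget is spent.
func withTimeout<T: Sendable>(
    nanoseconds: UInt64,
    operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        group.addTask { try await operation() }
        group.addTask {
            // The watchdog child: sleeps, then fails the whole scope.
            try await Task.sleep(nanoseconds: nanoseconds)
            throw TimeoutError()
        }
        // Whichever child loses the race is cancelled on the way out.
        defer { group.cancelAll() }

        guard let first = try await group.next() else {
            throw CancellationError()
        }
        return first
    }
}
```

The operation only sees ordinary task cancellation, so it stops promptly only if it cooperates (for example by awaiting Task.sleep or calling Task.checkCancellation()). What it cannot learn today is that the cancellation came from the deadline, which is where a per-scope reason would come in.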

mi11ion · June 10, 2025, 7:17am

Yeah, scoped cancellation looks much more ergonomic

await withTimeout(.seconds(30)) {
    // Implicit .timeout reason
    try await process() 
}

But I'm wondering how multiple nested scopes would compose. Would inner tasks see a stack of active cancellation reasons, or just the most specific one? I'm imagining something like:

await withTimeout(.seconds(30)) {
    await withResourceLimit(.memory(1GB)) {
        // What does Task.cancellationReason return here??
    }
}

Joe_Groff · June 10, 2025, 7:15pm

My thinking was that each scoped cancellation operator could provide its own mini-token, which could be used to record cancellation reasons specific to that cancellation mechanism:

await withTimeout(.seconds(30)) { timeout in
    await withResourceLimit(.memory(1GB)) { resourceLimit in
        try await process() 

        switch (timeout.reason, resourceLimit.reason) {
        case (let timeout?, _):
          // timed out
          break
        case (nil, let resourceLimit?):
          // resource limit
          break
        case (nil, nil):
          // neither scope fired
          break
        }
    }
}

That way, each cancellation mechanism can provide its own tailored set of cancellation reasons without a central coordination mechanism.
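A compact model of that idea with the asynchronous parts stripped away. TimeoutScope, ResourceLimitScope, their Reason cases, and outcome(timeout:resourceLimit:) are hypothetical names chosen for this sketch.

```swift
// Each scope handle owns its own reason type; no shared protocol or enum.
struct TimeoutScope {
    enum Reason { case deadline }
    var reason: Reason?
}

struct ResourceLimitScope {
    enum Reason { case memoryExceeded }
    var reason: Reason?
}

// Interprets the two independent handles, outermost scope first.
func outcome(timeout: TimeoutScope, resourceLimit: ResourceLimitScope) -> String {
    switch (timeout.reason, resourceLimit.reason) {
    case (.some, _):
        return "timed out"
    case (nil, .some):
        return "resource limit"
    case (nil, nil):
        // Neither scope fired: finished normally or was cancelled from outside.
        return "no scope fired"
    }
}
```

Because every handle reports only its own reason, the (nil, nil) case stays available for external cancellation instead of being attributed to one of the scopes.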

benlings · June 11, 2025, 7:51am

If multiple nested scopes are an issue, would using variadic generics be an option?

await withCancellationReasons(.timeout(.seconds(30)),  .resourceLimit(.memory(1GB))) { timeout, resourceLimit in
      try await process() 

      switch (timeout.reason, resourceLimit.reason) {
      case (let timeout?, _):
        // timed out
        break
      case (nil, let resourceLimit?):
        // resource limit
        break
      case (nil, nil):
        // neither scope fired
        break
      }
}

mi11ion · June 11, 2025, 9:22am

Thanks for the feedback! The variadic approach could be a nice addition for common combinations, but making it the primary API would force homogenization of different cancellation types. Better to start with the flexible nested approach and potentially add variadic conveniences later where they make sense.

I believe nesting is the right primary approach: it naturally follows Swift's existing patterns (withUnsafePointer, withTaskGroup) and allows each scope to provide specialized APIs. A timeout scope might expose .remaining while a resource limit exposes .currentUsage. The composition is also clearer:

await withTimeout(.seconds(30)) { timeout in
    await withResourceLimit(.memory(1GB)) { resourceLimit in
        // Natural precedence, each scope can short-circuit
        // or transform the cancellation flow
    }
}

This also maintains the structured concurrency philosophy: each scope creates a clear boundary with its own cancellation semantics, just like task groups create boundaries for child tasks
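To illustrate the "specialized APIs per scope" point, here is a synchronous toy version. It models only the shape of the nesting; nothing is actually cancelled or enforced, and every name below (withTimeoutScope, withResourceLimitScope, limitBytes, simulate) is an assumption for the sketch.

```swift
import Foundation

// Hypothetical timeout handle exposing time-specific API.
struct TimeoutScope {
    let deadline: Date
    var remaining: TimeInterval { max(0, deadline.timeIntervalSinceNow) }
}

// Hypothetical resource handle exposing usage-specific API.
struct ResourceLimitScope {
    let limitBytes: Int
    var currentUsage: Int = 0
    var isExceeded: Bool { currentUsage > limitBytes }
}

func withTimeoutScope<R>(
    seconds: TimeInterval,
    _ body: (TimeoutScope) throws -> R
) rethrows -> R {
    try body(TimeoutScope(deadline: Date().addingTimeInterval(seconds)))
}

func withResourceLimitScope<R>(
    bytes: Int,
    _ body: (inout ResourceLimitScope) throws -> R
) rethrows -> R {
    var scope = ResourceLimitScope(limitBytes: bytes)
    return try body(&scope)
}

// Nested use: the inner closure can consult both handles independently.
func simulate(allocating bytes: Int) -> String {
    withTimeoutScope(seconds: 30) { timeout in
        withResourceLimitScope(bytes: 1_000_000) { limit -> String in
            limit.currentUsage += bytes
            if limit.isExceeded { return "over budget" }
            return timeout.remaining > 0 ? "ok" : "out of time"
        }
    }
}
```

Neither handle needs to know the other exists, so a framework could add its own scope type without touching these two.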

mi11ion · June 15, 2025, 7:28pm

I've revised the proposal to use structured cancellation scopes instead of explicit tokens. The new design makes cancellation context ambient through the task hierarchy via withTimeout, withResourceLimit, etc. Each scope type defines its own reasons, eliminating the need for central coordination, and frameworks can easily add domain-specific scopes

Really appreciate the insights, especially from Joe Groff about cancellation scopes. The revised proposal is much stronger for it

updated proposal

Zollerboy1 · June 17, 2025, 5:45pm

With variadic generics as suggested by @benlings (officially called parameter packs in Swift) you could have different cancellation types in one call.
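A hedged sketch of what that could look like, assuming Swift 5.9 or later. The scope types and withCancellationScopes are hypothetical, and the function only forwards the handles to the body; it does not install any cancellation behavior.

```swift
// Hypothetical scope handles; the reason types differ on purpose.
struct TimeoutScope {
    enum Reason { case deadline }
    var reason: Reason?
}

struct ResourceLimitScope {
    enum Reason { case memory }
    var reason: Reason?
}

// One generic entry point that accepts any number of differently typed scopes
// and hands them to the body with their concrete types preserved.
func withCancellationScopes<each Scope, R>(
    _ scope: repeat each Scope,
    body: (repeat each Scope) throws -> R
) rethrows -> R {
    try body(repeat each scope)
}

func describeOutcome(timeout: TimeoutScope, limit: ResourceLimitScope) -> String {
    withCancellationScopes(timeout, limit, body: { t, l in
        t.reason != nil ? "timed out" : (l.reason != nil ? "resource limit" : "no scope fired")
    })
}
```

Inside the closure, t and l keep their own types, so each handle's reason is still specific to its mechanism even though both arrive through a single call.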

mlemoine · June 17, 2025, 5:55pm

This is a great proposal and integrates very nicely with current Swift Concurrency.

One example is somehow troubling me in the proposal though. I feel I must be missing something obvious:

let task = Task {
    await withTimeout(.seconds(30)) { timeout in
        // Both cancellations work together:
        // If timeout expires: timeout.isCancelled = true, reason = .deadline
        // If task.cancel() called: timeout.isCancelled = true, reason = .deadline
        // Task.isCancelled and timeout.isCancelled remain coherent
    }
}

// External cancellation flows through the context
task.cancel()

If task.cancel() is called, I don't see why we should have timeout.reason = .deadline or timeout.isCancelled == true?

Looking at [Pitch] Structured Task Cancellation Tokens - #12 by benlings, I'd expect timeout.reason to be nil for an external cancellation. And this makes much more sense to me.

In summary, I would rather expect:

Task.isCancelled is set to true if any CancellableContext is triggered or if there was an external cancellation.
If .isCancelled is true, then .reason on that same object should not be nil. (Actually, CancellableContext.isCancelled could simply be a computed property for reason != nil)
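Those two rules can be modelled in a few lines. This is a sketch of the expected semantics only; TimeoutContext, deadlineExpired(), and taskIsCancelled(externallyCancelled:contexts:) are hypothetical stand-ins, with the last one playing the role of Task.isCancelled.

```swift
// Hypothetical model of the semantics described above.
struct TimeoutContext {
    enum Reason { case deadline }

    private(set) var reason: Reason? = nil

    // Derived from reason, so the two can never disagree.
    var isCancelled: Bool { reason != nil }

    // Called only when the deadline itself fires.
    mutating func deadlineExpired() { reason = .deadline }
}

// Stand-in for Task.isCancelled: true on external cancellation
// or when any context has fired.
func taskIsCancelled(externallyCancelled: Bool, contexts: [TimeoutContext]) -> Bool {
    externallyCancelled || contexts.contains { $0.isCancelled }
}
```

With this shape, an external task.cancel() makes the task-level flag true while the timeout context keeps reason == nil, so the cause of the cancellation is not misreported.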

mi11ion · June 17, 2025, 8:03pm

While parameter packs could technically work here, they'd push heterogeneous type juggling onto every call site. The beauty of nested scopes is they naturally encode precedence and allow each scope to short-circuit independently, just like how withTaskGroup doesn't try to bundle every possible configuration into one call

mi11ion · June 17, 2025, 8:11pm

Yeahhh that example muddled the semantics. The timeout context should only report .deadline when it actually fires, not when external cancellation flows through it

I was trying so hard to unify the cancellation story that I ended up erasing distinctions 😑. Will fix this to preserve proper causality, thanks

upd: fixed