SE-0300: Continuations for interfacing async tasks with synchronous code

Initially they were actually nested in Task, and we changed that some time ago... So either could be fine really... but it also does not really map onto tasks per se -- the fact that it is an async function means it must be called from some existing task, and we resume that task at some point.

So yeah, either top level or nested in Task, though I don't remember the specific reason we ended up preferring the top-level one rather than the ones defined in Task :thinking:

We have discussed and pretty much designed this API with such integrations in mind :slight_smile:

NIO can simply do:

extension EventLoopFuture {
    func get() async throws -> Value {
        return try await withUnsafeThrowingContinuation { cont in
            self.whenComplete { result in
                switch result {
                case .success(let value):
                    cont.resume(returning: value)
                case .failure(let error):
                    cont.resume(throwing: error)
                }
            }
        }
    }
}

to provide an initial integration with async/await and the existing ELF types. This can be done before NIO requires a Swift version that supports async/await, which is excellent; once it decides to make a breaking change and adopt async in its APIs it could do so, but until then such an extension can be provided by a compat module.
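At the call site this reads like any other throwing async call (a small usage sketch; `future` is an assumed existing EventLoopFuture<String>):

let value = try await future.get() // suspends the calling task until the future completes, rethrowing its failure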


The overhead is that the checked API needs to use a class to implement the "dropped without resumption" check (this also implies ARC traffic for counting that reference), as well as the resume calls needing to perform an atomic CAS whenever they're called: https://github.com/apple/swift/blob/main/stdlib/public/Concurrency/CheckedContinuation.swift#L63

The unsafe API boils down to a raw pointer to the task object.

Since some continuation-heavy APIs are likely to try to optimize away any unnecessary atomic operations, we feel it is reasonable to provide an API that does not do this (both in terms of storage and the operation itself).
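To make the trade-off concrete, here is a minimal sketch of the same wrapper written against both entry points; `legacyFetch` is an assumed completion-handler API, not something from the proposal:

import Foundation

// Assumed legacy API, for illustration only:
// func legacyFetch(completion: @escaping (Result<Data, Error>) -> Void)

// Checked: class-backed, traps on double resume and warns if never resumed,
// at the cost of an allocation, ARC traffic, and an atomic CAS per resume call.
func fetchChecked() async throws -> Data {
    try await withCheckedThrowingContinuation { continuation in
        legacyFetch { result in continuation.resume(with: result) }
    }
}

// Unsafe: essentially just the raw task reference; no misuse checking, no extra overhead.
func fetchUnsafe() async throws -> Data {
    try await withUnsafeThrowingContinuation { continuation in
        legacyFetch { result in continuation.resume(with: result) }
    }
}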

// edit: more precise wording about the overhead

1 Like

That's because async/await turns the entire pattern "on its head" kind of.

Previously, when calling an API, you would also pass in "call me (the completion handler) on this queue":

call(on: someQueue) { done in ... }

This is turned around 180° with async/await, as the "I need to run on a specific queue" is instead expressed by simply awaiting in a context that ensures you'll hop back to it -- i.e. an actor. So such an API becomes:

actor class X {
  func x() async {
    let done = await call() // we resume on X's context, no need to pass in where we want to resume
  }
}

So in that sense, the pattern of "passing in" where one wants to be resumed is replaced by calling async functions from a context which ensures the resumption happens on that context (an actor). This is core to the async/await and actor runtime.

Given this information, I hope it is clearer why the continuations don't have anything to do with "on what executor", since the resume is independent of that. We only signal "okay, resume please" and it is up to the caller's context to determine where the resumption has to happen -- i.e. the actor, or "any executor (e.g. some global one)" if called e.g. from some non-actor context.

Sure. But this particular proposal covers functionality that allows converting completion handler APIs to async APIs without having to rewrite everything using actors. So I’m asking how this proposal will interact with APIs that take a queue in addition to a completion handler. Will the tasks jump back to the calling queue or will they continue on the queue they were called on? Or, in general, how do the proposed APIs interact with queues, as they aren’t mentioned at all.


They always crash, regardless of debug/release or any other configuration.

Yeah I was arguing for the same to be honest, so it's up to the core team to decide...

The stated motivation for only crashing on the double-resume and not on the "forget-to-resume" was that the forgotten resume would not produce a good diagnostic -- it is triggered from deinit, which can run on any thread, and it may be hard to locate what we actually forgot to resume. So a double resume is definitely undefined behavior, but forgetting is "just" a memory leak... Somewhere I was arguing for at least offering some env variable or something to allow "please crash if not resumed" :thinking:

I personally totally agree though; I'd like both to crash hard when using the checked API, tbh... Some context on this is in Introduce a dedicated proposal for `with*Continuation` by jckarter · Pull Request #1244 · apple/swift-evolution · GitHub
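To illustrate the two misuse modes being discussed, a small sketch of what the checked wrapper diagnoses today (deliberately incorrect code, for demonstration only):

// Double resume: the checked continuation traps on the second call.
func doubleResume() async -> Int {
    await withCheckedContinuation { continuation in
        continuation.resume(returning: 1)
        continuation.resume(returning: 2) // fatal error: continuation already resumed
    }
}

// Forgotten resume: no crash -- the checked wrapper only logs a leak warning from
// its deinit (on whatever thread that happens to run), and the task never resumes.
func neverResumes() async -> Int {
    await withCheckedContinuation { (_: CheckedContinuation<Int, Never>) in
        // intentionally never calls resume
    }
}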

In the pitch thread, I suggested the existing Result enum, so that we can use:

  • Result<T, Never> instead of separate non-throwing APIs.
  • resume(.success(value)) instead of resume(returning: value).
  • resume(.failure(error)) instead of resume(throwing: error).

Are the with*Continuation methods misnamed? The continuation is allowed to escape the given closure, unlike other with* methods in the standard library.


Could you elaborate on the non-actor context, or else comment on whether UI-based app development is likely to automatically funnel everything into a main-thread actor context?

This proposal looks great, but this point on resumption context outside actors seems pretty significant seeing as actors are currently still in the pitch phase.

Are we likely to start seeing a lot of boilerplate code along these lines in the near future?

let myResult = await getResult()
await DispatchQueue.asyncMain()

“UI stuff” means being on the “main actor” so it will always automatically resume on the main actor again.
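As a sketch of what that ends up looking like (assuming the main-actor design from the actors pitch, and the hypothetical getResult() from the post above, here assumed to return a String):

@MainActor
final class ViewModel {
    var title = ""

    func refresh() async {
        let result = await getResult() // suspends; resumes back on the main actor automatically
        title = result                 // still on the main actor / main thread, no manual hop needed
    }
}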

No, there should not be any such boilerplate as you suggest.

This proposal, though, covers only the resuming; with respect to where the resumed code will run, it is the same as calling any async function, so it does not really play into this review.


We should be clear that this would be done by queue-hopping. The point of the recommendation that async APIs provide a queue argument is to avoid queue-hopping. That is, if you want the completion handler to resume on a particular queue then it's better if the caller of that handler enqueues it directly onto that queue rather than dispatching to another queue and requiring the completion handler to dispatch again to get to the queue it really wants to be on.

What async/await does is provide a mechanism where continuations can be called on their desired context (whether it be a queue or by some other means), but this API doesn't allow you to interact with that. So that means when you call resume you will probably be on the wrong queue to start with, and the executor will have to dispatch (or do something else) to get back to the right context. That's exactly what the APIs that take a queue are trying to avoid.

I don't think there's a general solution to this problem because executors for async/await aren't required to use GCD in their implementations. However, I would guess that most implementations (at least on Apple platforms) will use GCD, which means it might be nice if the continuation API provided a way to get a GCD queue that it would prefer to have resume called on. That would at least allow for more efficient implementations that avoid queue hopping when possible.

Callers of resume would not be required to call resume on that queue, but doing so would possibly be more efficient. Meanwhile, executors that don't use queues could just provide an arbitrary queue.

This would of course also require the API for executors themselves to have a method returning an optional queue to dispatch continuations on, so that when the continuation is created it could ask the executor for the right queue.

Is all of that worth it for an occasional performance boost that only some lower-level APIs would benefit from? Probably not, unfortunately...
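If it were pursued, the shape might be something like the following purely hypothetical sketch (not part of this or any other proposal; the names are invented):

import Dispatch

// Hypothetical: a continuation that can advertise the GCD queue its executor would
// prefer resume to be called on, so low-level callers already on that queue could
// avoid an extra hop. Executors that are not GCD-backed would return nil.
protocol QueuePreferringContinuation {
    associatedtype Value
    var preferredResumeQueue: DispatchQueue? { get }
    func resume(returning value: Value)
}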


Could we make all usages of this API safe by implicitly throwing a continuation error, instead of having separate unsafe and checked APIs?

Hi @benrimmington for purposes of the review, can you provide some justification for why this is preferable?

@ktoso Even without actual API to control our queueing, could you summarize the queue hopping that may occur when using the proposed APIs?

For example, Alamofire has many response* APIs that take a closure as well as a DispatchQueue parameter, using .main by default. Before fully adopting async throughout the library, offering a compatibility API would be a good idea. Take this simplest API:

func response(queue: DispatchQueue = .main, completionHandler: @escaping (AFDataResponse<Data?>) -> Void) -> Self

I can pretty easily offer an async version using this proposal:

func response() async -> AFDataResponse<Data?> {
    return await withUnsafeContinuation { continuation in
        response { response in
            continuation.resume(returning: response)
        }
    }
}

The user can now call it like this:

let url = ...
let response = await AF.request(url).response()
response.map { ... }

If this code executes on a non-main queue, what happens? I would think that the response.map call happens on the same queue as the creation of the url, but it isn't clear. I think some mention of how DispatchQueues interact with this proposal is necessary.

I think there was a question previously in the pitch thread that I don't remember being addressed, asking why UnsafeContinuation<T> and UnsafeThrowingContinuation<T> (same for the checked variants) are separate types instead of introducing a single UnsafeContinuation<Result, Failure: Error> type?

This separation of types and making the error untyped is inconsistent with Task.Handle<Success, Failure: Error> in the structured concurrency pitch, and the already present Result<Success, Failure: Error> type.

I also don't see this addressed in the proposal itself. Is there a specific reason for this?
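For reference, the shape being asked about would look roughly like this (a sketch of the alternative, not the proposed API):

// One continuation type whose Failure parameter distinguishes throwing from
// non-throwing use, mirroring Result<Success, Failure> and Task.Handle<Success, Failure>.
public struct UnsafeContinuation<Success, Failure: Error> {
    public func resume(returning value: Success) { /* ... */ }
    public func resume(throwing error: Failure) { /* ... */ }
}

// Non-throwing code would simply use UnsafeContinuation<Success, Never>, whose
// resume(throwing:) can never be called because Never has no values.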


To clarify: this proposal is entirely agnostic to queues and performs no queue hopping. Absent anything else like actors being involved, whatever queue calls the resume method will continue being the current queue when the continuation is resumed. In all the examples shown, that means the queue on which the callback was called. Queue hopping is only introduced by actors, executors and the like; these are proposals that build upon the previous async/await proposal.

So in your example, again absent any use of actors, the awaited call to request would resume on whatever queue response used to execute the callback. Similarly, any API that uses customizable queues (either via a stored property or via an argument alongside the callback) would continue to do so, and the continuation would resume on the same queue the callback does.
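Concretely, for the Alamofire wrapper above (and absent any actors), the behavior would be:

let response = await AF.request(url).response()
// response(completionHandler:) defaults to calling its completion on .main, so
// resume(returning:) is invoked on .main and execution continues there after the
// await, regardless of which queue ran the code before the await.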

I’ll suggest to the proposal authors they add a note clarifying this to the proposal.

Not quite. This proposal is aimed at allowing calling of completion handlers from async functions. It is not directly related to actors, other than that actors rely heavily on async functions and this proposal helps with writing more async functions.

As such, this discussion of queue hopping, while helping clarify some things, isn’t really pertinent to this review. I’d ask further discussion of queues and queue hopping be taken to a separate thread to keep the review focused on the specific proposal. I realize this takes some level of suspension of concerns, because awaited functions resuming on a different and often unclear queue to the one on which they were called is unsatisfactory – as is completion handlers doing so. But that is what the actors and related proposals are intended to address, not this one.


Unless I'm misunderstanding something, this seems like a really bad user experience and a generally dangerous programming model. The entire value of async is that it simplifies both the interface and the mental model of asynchronous programming by making async code look synchronous. If that doesn't include a guarantee of calling context from one line to the next, the entire exercise is suddenly much less valuable and more dangerous. I mean, isn't applying the APIs of this proposal to something like URLSession's completion handler APIs one of the first things people are going to do? Won't they be shocked when, using the resulting async function, they find that from one line to the next they've suddenly switched to the random background queue that URLSession uses for completion handlers? How are we supposed to use these APIs safely?


Nesting the APIs within Result is preferable, because the type parameters won't need to be inferred, so the Failure can be Never.

There could be an async Result.init, so the returning: and throwing: argument labels would make less sense.

There can be fewer APIs (i.e. a single Unsafe*Continuation type, with a single resume method).

extension Result {

  public struct UnsafeContinuation {

    public func resume(_ result: Result)
  }

  public init(
    _ operation: (UnsafeContinuation) -> Void
  ) async
}

extension Result where Failure == Never {

  public func get() -> Success
}

I'm not sure if your question was intended for me, but the answer might be found in §Alternatives considered: "being able to avoid the cost of checking when interacting with performance-sensitive APIs is valuable" …

Isn’t this the reason these low-level APIs use “unsafe” in the name?

No.

The operation must follow one of the following invariants:

  • Either the resume function must only be called exactly-once on each execution path the operation may take (including any error handling paths), or else
  • the resume function must be called exactly at the end of the operation function's execution.

Unsafe*Continuation is an unsafe interface, so it is undefined behavior if these invariants are not followed by the operation. This allows continuations to be a low-overhead way of interfacing with synchronous code. Wrappers can provide checking for these invariants, and the library will provide one such wrapper, discussed below.
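A minimal sketch of code that satisfies the first invariant, where every path, including the error path, resumes exactly once (legacyLoad is an assumed completion-handler API, not something from the proposal):

import Foundation

// Assumed legacy API, for illustration only:
// func legacyLoad(completion: @escaping (Data?, Error?) -> Void)

func load() async throws -> Data {
    try await withUnsafeThrowingContinuation { continuation in
        legacyLoad { data, error in
            if let error = error {
                continuation.resume(throwing: error)  // failure path: resumed exactly once
            } else {
                continuation.resume(returning: data!) // success path: resumed exactly once
            }
        }
    }
}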

From what I understand (and to paraphrase what @Ben_Cohen said), continuation APIs here are more of a bread-and-butter language utility, something as primitive as withoutActuallyEscaping(). Meanwhile, building on top of these primitives, the actor concurrency model is brought to life with e.g. the semantics around execution contexts you are calling for here.

It is like defining a synchronous function: it doesn't intrinsically require an execution context, but it is flexible enough for you to enforce queue checks/switching in your method implementation, or to require a callback queue as a function parameter.

As a concrete example, the equivalent mechanism in Kotlin Coroutines similarly does not require manual resumptions to be actively aware of these matters. You can call it anywhere, and the runtime internals use the captured context to determine both the necessity of a dispatch and the target dispatcher for the resumption.

So:

I mean, isn't applying the APIs of this proposal to something like URLSession's completion handler APIs one of the first things people are going to do? Won't they be shocked when, using the resulting async function, they find that from one line to the next they've suddenly switched to the random background queue that URLSession uses for completion handlers? How are we supposed to use these APIs safely?

This should not happen. This is because the execution context of the suspending caller is fixed, and can so be captured for later use by the implementation details of the continuation to do a context switching when necessary.

I'm sorry, but that's contradicted by the paragraph I was replying to:

Absent anything else like actors being involved, whatever queue calls the resume method will continue being the current queue when the continuation is resumed.

That seems to say execution stays on whatever queue calls resume, so the user will see their code before and after the await call potentially executed on different queues. Like I said, I'm pretty sure this would be shocking to most users.
