[Concurrency] Asynchronous functions

John_McCall · October 30, 2020, 6:35pm

Hi, folks.

Central to the overall Swift concurrency effort is the ability to directly express what we call asynchronous functions. An asynchronous function still returns a result or throws an error, but it does so asynchronously: that is, after a long enough potential delay that it's valuable to allow the thread to go on with other work in the meantime. Traditionally this has been expressed with explicit callbacks (often called "completion handlers" in the Apple ecosystem), but that sacrifices a large amount of meaningful structure. As a result, asynchronous programs are far more awkward to write, more bug-prone, and less efficient than they ought to be. These problems can be straightforwardly solved by directly supporting this pattern in the language.

Our approach borrows heavily from the well-received "async/await" features in many other languages. I've included a portion of the proposal below; the full text can be found here. This feature ties in closely with the proposals for structured concurrency and actors.

Introduction

Modern Swift development involves a lot of asynchronous (or "async") programming using closures and completion handlers, but these APIs are hard to use. This gets particularly problematic when many asynchronous operations are used, error handling is required, or control flow between asynchronous calls gets complicated. This proposal describes a language extension to make this a lot more natural and less error-prone.

This design introduces a coroutine model to Swift. Functions can opt into being async, allowing the programmer to compose complex logic involving asynchronous operations using the normal control-flow mechanisms. The compiler is responsible for translating an asynchronous function into an appropriate set of closures and state machines.

This proposal defines the semantics of asynchronous functions. However, it does not provide concurrency: that is covered by a separate proposal to introduce structured concurrency, which associates asynchronous functions with concurrently-executing tasks and provides APIs for creating, querying, and cancelling tasks.

This proposal draws some inspiration (and most of the Motivation section) from an earlier proposal written by Chris Lattner and Joe Groff, available here. That proposal itself is derived from a proposal written by Oleg Andreev, available here. It has been significantly rewritten (again), and many details have changed, but the core ideas of asynchronous functions have remained the same.

Motivation: Completion handlers are suboptimal

Async programming with explicit callbacks (also called completion handlers) has many problems, which we’ll explore below. We propose to address these problems by introducing async functions into the language. Async functions allow asynchronous code to be written as straight-line code. They also allow the implementation to directly reason about the execution pattern of the code, allowing callbacks to run far more efficiently.

Problem 1: Pyramid of doom

A sequence of simple asynchronous operations often requires deeply-nested closures. Here is a made-up example showing this:

func processImageData1(completionBlock: (result: Image) -> Void) {
    loadWebResource("dataprofile.txt") { dataResource in
        loadWebResource("imagedata.dat") { imageResource in
            decodeImage(dataResource, imageResource) { imageTmp in
                dewarpAndCleanupImage(imageTmp) { imageResult in
                    completionBlock(imageResult)
                }
            }
        }
    }
}

processImageData1 { image in
    display(image)
}

This "pyramid of doom" makes the code difficult to read and makes it hard to keep track of where the code is running. In addition, having to use a stack of closures leads to many second-order effects that we will discuss next.

Problem 2: Error handling

Callbacks make error handling difficult and very verbose. Swift 2 introduced an error handling model for synchronous code, but callback-based interfaces do not derive any benefit from it:

func processImageData2(completionBlock: (result: Image?, error: Error?) -> Void) {
    loadWebResource("dataprofile.txt") { dataResource, error in
        guard let dataResource = dataResource else {
            completionBlock(nil, error)
            return
        }
        loadWebResource("imagedata.dat") { imageResource, error in
            guard let imageResource = imageResource else {
                completionBlock(nil, error)
                return
            }
            decodeImage(dataResource, imageResource) { imageTmp, error in
                guard let imageTmp = imageTmp else {
                    completionBlock(nil, error)
                    return
                }
                dewarpAndCleanupImage(imageTmp) { imageResult, error in
                    guard let imageResult = imageResult else {
                        completionBlock(nil, error)
                        return
                    }
                    completionBlock(imageResult, nil)
                }
            }
        }
    }
}

processImageData2 { image, error in
    guard let image = image else {
        error("No image today")
        return
    }
    display(image)
}

The addition of Result to the standard library improved on error handling for Swift APIs. Asynchronous APIs were one of the main motivators for Result:

func processImageData2(completionBlock: (Result<Image, Error>) -> Void) {
    loadWebResource("dataprofile.txt") { dataResourceResult in
        dataResourceResult.map { dataResource in
            loadWebResource("imagedata.dat") { imageResourceResult in
                imageResourceResult.map { imageResource in
                    decodeImage(dataResource, imageResource) { imageTmpResult in
                        imageTmpResult.map { imageTmp in 
                            dewarpAndCleanupImage(imageTmp) { imageResult in
                                completionBlock(imageResult)
                            }
                        }
                    }
                }
            }
        }
    }
}

processImageData2 { result in
    switch result {
    case .success(let image):
        display(image)
    case .failure(let error):
        error("No image today")
    }
}

It's easier to properly thread the error through when using Result, making the code shorter. However, the closure-nesting problem remains.

Problem 3: Conditional execution is hard and error-prone

Conditionally executing an asynchronous function is a huge pain. For example, suppose we need to "swizzle" an image after obtaining it. But, we sometimes have to make an asynchronous call to decode the image before we can swizzle. Perhaps the best approach to structuring this function is to write the swizzling code in a helper "continuation" closure that is conditionally captured in a completion handler, like this:

func processImageData3(recipient: Person, completionBlock: (result: Image) -> Void) {
    let swizzle: (_ contents: Image) -> Void = {
      // ... continuation closure that calls completionBlock eventually
    }
    if recipient.hasProfilePicture {
        swizzle(recipient.profilePicture)
    } else {
        decodeImage { image in
            swizzle(image)
        }
    }
}

This pattern inverts the natural top-down organization of the function: the code that will execute in the second half of the function must appear before the part that executes in the first half. In addition to restructuring the entire function, we must now think carefully about captures in the continuation closure, because the closure is used in a completion handler. The problem worsens as the number of conditionally-executed async functions grows, yielding what is essentially an inverted "pyramid of doom."

Problem 4: Many mistakes are easy to make

It's quite easy to bail out of the asynchronous operation early by simply returning without calling the correct completion-handler block. When the call is forgotten, the issue is very hard to debug:

func processImageData4(completionBlock: (result: Image?, error: Error?) -> Void) {
    loadWebResource("dataprofile.txt") { dataResource, error in
        guard let dataResource = dataResource else {
            return // <- forgot to call the block
        }
        loadWebResource("imagedata.dat") { imageResource, error in
            guard let imageResource = imageResource else {
                return // <- forgot to call the block
            }
            ...
        }
    }
}

When you do remember to call the block, you can still forget to return after that:

func processImageData5(recipient: Person, completionBlock: (result: Image?, error: Error?) -> Void) {
    if recipient.hasProfilePicture {
        if let image = recipient.profilePicture {
            completionBlock(image, nil) // <- forgot to return after calling the block
        }
    }
    ...
}

Thankfully the guard syntax protects against forgetting to return to some degree, but it's not always relevant.

Problem 5: Because completion handlers are awkward, too many APIs are defined synchronously

This is hard to quantify, but the authors believe that the awkwardness of defining and using asynchronous APIs (using completion handlers) has led to many APIs being defined with apparently synchronous behavior, even when they can block. This can lead to performance and responsiveness problems in UI applications, e.g. a spinning cursor. It can also lead to the definition of APIs that cannot be used when asynchrony is critical to achieve scale, e.g. on the server.

Proposed solution: async/await

Asynchronous functions—often known as async/await—allow asynchronous code to be written as if it were straight-line, synchronous code. This immediately addresses many of the problems described above by allowing programmers to make full use of the same language constructs that are available to synchronous code. The use of async/await also naturally preserves the semantic structure of the code, providing information necessary for at least three cross-cutting improvements to the language: (1) better performance for asynchronous code; (2) better tooling to provide a more consistent experience while debugging, profiling, and exploring code; and (3) a foundation for future concurrency features like task priority and cancellation. The example from the prior section demonstrates how async/await drastically simplifies asynchronous code:

func loadWebResource(_ path: String) async throws -> Resource
func decodeImage(_ r1: Resource, _ r2: Resource) async throws -> Image
func dewarpAndCleanupImage(_ i : Image) async throws -> Image

func processImageData2() async throws -> Image {
  let dataResource  = await try loadWebResource("dataprofile.txt")
  let imageResource = await try loadWebResource("imagedata.dat")
  let imageTmp      = await try decodeImage(dataResource, imageResource)
  let imageResult   = await try dewarpAndCleanupImage(imageTmp)
  return imageResult
}
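
As a further illustration, the conditional-execution case from Problem 3 might be sketched as follows. This is only a sketch under assumptions: Image, Person, decodeImage, and swizzle are hypothetical stand-ins (the original example never defines them), and an optional profilePicture replaces the hasProfilePicture check.

```swift
// Hypothetical stand-ins for the types used in the earlier examples.
struct Image: Equatable {
    var name: String
}

struct Person {
    var profilePicture: Image?
}

// Stub for an asynchronous decode; a real implementation would suspend while it works.
func decodeImage() async -> Image {
    Image(name: "decoded")
}

// Stub for the "swizzle" step, kept synchronous for simplicity.
func swizzle(_ image: Image) -> Image {
    Image(name: "swizzled-" + image.name)
}

// The asynchronous call is made only when needed, and the code still reads top to bottom.
func processImageData3(recipient: Person) async -> Image {
    let image: Image
    if let picture = recipient.profilePicture {
        image = picture
    } else {
        image = await decodeImage()
    }
    return swizzle(image)
}
```

The swizzling step now appears after the code that obtains the image, so no continuation closure has to be declared ahead of the branch.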

Many descriptions of async/await discuss it through a common implementation mechanism: a compiler pass which divides a function into multiple components. This is important at a low level of abstraction in order to understand how the machine is operating, but at a high level we’d like to encourage you to ignore it. Instead, think of an asynchronous function as an ordinary function that has the special power to give up its thread. Asynchronous functions don’t typically use this power directly; instead, they make calls, and sometimes these calls will require them to give up their thread and wait for something to happen. When that thing is complete, the function will resume executing again.

The analogy with synchronous functions is very strong. A synchronous function can make a call; when it does, the function immediately waits for the call to complete. Once the call completes, control returns to the function and picks up where it left off. The same thing is true with an asynchronous function: it can make calls as usual; when it does, it (normally) immediately waits for the call to complete. Once the call completes, control returns to the function and it picks up where it was. The only difference is that synchronous functions get to take full advantage of (part of) their thread and its stack, whereas asynchronous functions are able to completely give up that stack and use their own, separate storage. This additional power given to asynchronous functions has some implementation cost, but we can reduce that quite a bit by designing holistically around it.

Because asynchronous functions must be able to abandon their thread, and synchronous functions don’t know how to abandon a thread, a synchronous function can’t ordinarily call an asynchronous function: the asynchronous function would only be able to give up the part of the thread it occupied, and if it tried, its synchronous caller would treat it like a return and try to pick up where it was, only without a return value. The only way to make this work in general would be to block the entire thread until the asynchronous function was resumed and completed, and that would completely defeat the purpose of asynchronous functions, as well as having nasty systemic effects.

In contrast, an asynchronous function can call either synchronous or asynchronous functions. While it’s calling a synchronous function, of course, it can’t give up its thread. In fact, asynchronous functions never just spontaneously give up their thread; they only give up their thread when they reach what’s called a suspension point, marked by await. A suspension point can occur directly within a function, or it can occur within another asynchronous function that the function calls, but in either case the function and all of its asynchronous callers simultaneously abandon the thread. (In practice, asynchronous functions are compiled to not depend on the thread during an asynchronous call, so that only the innermost function needs to do any extra work.)
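
A minimal sketch of this distinction, using hypothetical helper names (checksum, fetchBytes, fetchChecksum) and Task.yield() from the Swift concurrency library as shipped, which this pitch does not itself describe:

```swift
// Synchronous helper: an async caller cannot give up its thread while this runs.
func checksum(_ bytes: [UInt8]) -> Int {
    bytes.reduce(0) { $0 + Int($1) }
}

// Hypothetical async helper standing in for real I/O.
func fetchBytes() async -> [UInt8] {
    await Task.yield()  // a potential suspension point inside the callee
    return [1, 2, 3]
}

func fetchChecksum() async -> Int {
    let bytes = await fetchBytes()  // potential suspension point, marked with await
    return checksum(bytes)          // ordinary synchronous call: no await, no suspension
}
```

If fetchBytes suspends, fetchChecksum is suspended along with it; the call to checksum can never suspend.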

When control returns to an asynchronous function, it picks up exactly where it was. That doesn’t necessarily mean that it’ll be running on the exact same thread it was before, because the language doesn’t guarantee that after a suspension. In this design, threads are mostly an implementation mechanism, not a part of the intended interface to concurrency. However, many asynchronous functions are not just asynchronous: they’re also associated with specific actors (which are the subject of a separate proposal), and they’re always supposed to run as part of that actor. Swift does guarantee that such functions will in fact return to their actor to finish executing. Accordingly, libraries that use threads directly for state isolation—for example, by creating their own threads and scheduling tasks sequentially onto them—should generally model those threads as actors in Swift in order to allow these basic language guarantees to function properly.

Suspension points

A suspension point is a point in the execution of an asynchronous function where it has to give up its thread. Suspension points are always associated with some deterministic, syntactically explicit event in the function; they’re never hidden or asynchronous from the function’s perspective. The detailed language design will describe several different operations as suspension points, but the most important one is a call to an asynchronous function associated with a different execution context.

It is important that suspension points are only associated with explicit operations. In fact, it’s so important that this proposal requires that calls that might suspend be enclosed in an await expression. This follows Swift's precedent of requiring try expressions to cover calls to functions that can throw errors. Marking suspension points is particularly important because suspensions interrupt atomicity. For example, if an asynchronous function is running within a given context that is protected by a serial queue, reaching a suspension point means that other code can be interleaved on that same serial queue. A classic but somewhat hackneyed example where this atomicity matters is modeling a bank: if a deposit is credited to one account, but the operation suspends before processing a matched withdrawal, it creates a window where those funds can be double-spent. A more germane example for many Swift programmers is a UI thread: the suspension points are the points where the UI can be shown to the user, so programs that build part of their UI and then suspend risk presenting a flickering, partially-constructed UI. (Note that suspension points are also called out explicitly in code using explicit callbacks: the suspension happens between the point where the outer function returns and the callback starts running.) Requiring that all suspension points are marked allows programmers to safely assume that places without suspension points will behave atomically, as well as to more easily recognize problematic non-atomic patterns.
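
The bank example can be sketched with an actor, assuming the actor and task-group features from the companion proposals as they later shipped. Account and runWithdrawals are hypothetical names introduced only for illustration.

```swift
actor Account {
    private(set) var balance: Int

    init(balance: Int) {
        self.balance = balance
    }

    // There is no await between the check and the update, so no other work
    // on this actor can be interleaved between them.
    func withdraw(_ amount: Int) -> Bool {
        guard balance >= amount else { return false }
        balance -= amount
        return true
    }
}

// Runs `attempts` concurrent withdrawals and returns how many succeeded.
func runWithdrawals(from account: Account, amount: Int, attempts: Int) async -> Int {
    await withTaskGroup(of: Bool.self, returning: Int.self) { group in
        for _ in 0..<attempts {
            group.addTask { await account.withdraw(amount) }
        }
        var succeeded = 0
        for await ok in group {
            if ok { succeeded += 1 }
        }
        return succeeded
    }
}
```

Because withdraw contains no suspension point, the balance can never go negative however the calls are scheduled. Had an await appeared between the guard and the subtraction, another withdrawal could have run in that window and overdrawn the account.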

Because suspension points can only appear at points explicitly marked within an asynchronous function, long computations can still block threads. This might happen when calling a synchronous function that just does a lot of work, or when encountering a particularly intense computational loop written directly in an asynchronous function. In either case, the thread cannot interleave code while these computations are running, which is usually the right choice for correctness, but can also become a scalability problem. Asynchronous programs that need to do intense computation should generally run it in a separate context. When that’s not feasible, there will be library facilities to artificially suspend and allow other operations to be interleaved.
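
One way such an artificial suspension could look is sketched below. It assumes Task.yield() from the concurrency library as later released (the pitch does not name the facility); cooperativeSum and its parameters are hypothetical.

```swift
// Sums 1...n, yielding every `interval` iterations so other work can be interleaved.
func cooperativeSum(upTo n: Int, yieldEvery interval: Int = 10_000) async -> Int {
    precondition(interval > 0, "interval must be positive")
    guard n >= 1 else { return 0 }
    var total = 0
    for i in 1...n {
        total += i
        if i % interval == 0 {
            await Task.yield()  // explicit suspension point inside a long loop
        }
    }
    return total
}
```

Each yield is marked with await, so the points where other code may be interleaved remain visible in the source.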

Asynchronous functions should avoid calling functions that can actually block the thread, especially if they can block it waiting for work that’s not guaranteed to be currently running. For example, acquiring a mutex can only block until some currently-running thread gives up the mutex; this is sometimes acceptable but must be used carefully to avoid introducing deadlocks or artificial scalability problems. In contrast, waiting on a condition variable can block until some arbitrary other work gets scheduled that signals the variable; this pattern is strongly discouraged. Ongoing library work to provide abstractions that allow programs to avoid these pitfalls will be required.

This design currently provides no way to prevent the current context from interleaving code while an asynchronous function is waiting for an operation in a different context. This omission is intentional: allowing for the prevention of interleaving is inherently prone to deadlock.

Asynchronous calls

Calls to an async function look and act mostly like calls to a synchronous (or ordinary) function. The apparent semantics of a call to an async function are:

1. Arguments are evaluated using the ordinary rules, including beginning accesses for any inout parameters.
2. The callee’s executor is determined. This proposal does not describe the rules for determining the callee's executor; see the complementary proposal about actors.
3. If the callee’s executor is different from the caller’s executor, a suspension occurs and the partial task to resume execution in the callee is enqueued on the callee’s executor.
4. The callee is executed with the given arguments on its executor.
5. During the return, if the callee’s executor is different from the caller’s executor, a suspension occurs and the partial task to resume execution in the caller is enqueued on the caller’s executor.
6. Finally, the caller resumes execution on its executor. If the callee returned normally, the result of the call expression is the value returned by the function; otherwise, the expression throws the error that was thrown from the callee.

From the caller's perspective, async calls behave similarly to synchronous calls, except that they may execute on a different executor, requiring the task to be briefly suspended. Note also that the duration of inout accesses is potentially much longer due to the suspension over the call, so inout references to shared mutable state that is not sufficiently isolated are more likely to produce a dynamic exclusivity violation.
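
The caller-visible outcome of these steps (a returned value or a thrown error) can be sketched as follows. ResourceError, loadLength, and describeLoad are hypothetical names, and the sketch uses the try await spelling accepted by released Swift compilers rather than the await try order shown in this pitch.

```swift
struct ResourceError: Error {
    var path: String
}

// Hypothetical async callee: returns normally for non-empty paths, throws otherwise.
func loadLength(of path: String) async throws -> Int {
    guard !path.isEmpty else { throw ResourceError(path: path) }
    return path.count
}

// Once the call completes, the caller sees either the returned value or the thrown error.
func describeLoad(_ path: String) async -> String {
    do {
        let length = try await loadLength(of: path)
        return "loaded \(length)"
    } catch {
        return "failed"
    }
}
```

Executor selection and any suspension happen behind the await; the caller's code is structured exactly as it would be for a synchronous throwing call.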

DeFrenZ · October 30, 2020, 6:50pm

I'm still reading the rest of the post, but isn't this Result-based example "wrong"? As in, doesn't it call completionBlock only in the case where it all succeeds, or the error happens on dewarpAndCleanupImage?
I mean it would just be an additional example in how easy it is to ignore errors when you have to deal with the Pyramid of Doom™

John_McCall · October 30, 2020, 6:59pm

Yes, that's a great point. The explicit use of callbacks is extremely susceptible to this sort of problem where the programmer (or pitch writer :)) forgets to consider a particular control-flow path and fails to call the completion handler.

DeFrenZ · October 30, 2020, 8:14pm

I'm not sure if discussion/corrections should be done here or on GitHub...

I suppose this should be async throws

I've now finished reading this one, and thanks for the huge work!

Jumhyn · October 30, 2020, 8:19pm

Is there any meaningful analogue to rethrows in an async/await world, i.e., a function which is async if and only if a provided closure parameter is async?

John_McCall · October 30, 2020, 8:20pm

Thank you, fixed

Lantua · October 30, 2020, 8:29pm

Maybe we can add reasync later should it be needed? At least the current proposal doesn't seem to clash with that.

John_McCall · October 30, 2020, 8:29pm

It's hard to say. I can certainly imagine situations where it might be useful, but I'm not sure there are enough to justify the complexity cost, which behind the scenes would be quite substantial — we'd essentially have to rig up a complete task just to call the function, then throw it away. The thing is that many higher-order functions probably need to think carefully about how they ought to work with an async function — guaranteeing sequential use is not necessarily what clients would actually want.

The most important use case would probably be withFoo functions that introduce a scoped value, and I'd really prefer to address those with a more targeted coroutine feature — they're problematic for a lot of other things besides async.

Lantua · October 30, 2020, 8:48pm

This design currently provides no way to prevent the current context from interleaving code while an asynchronous function is waiting for an operation in a different context. This omission is intentional: allowing for the prevention of interleaving is inherently prone to deadlock.

This should also be in the reference doc once it's included/accepted. It took a lot of cross-reading to find this sentence that makes sense of a lot of design decisions on the other threads. It's also easily surprising.

Rationale: This order restriction is arbitrary, but it's not harmful, and it eliminates the potential for stylistic debates.

Praise be.

ydnar · October 30, 2020, 7:48pm

Why is async specified after the func declaration, rather than before, e.g. async func f()?

Douglas_Gregor · October 30, 2020, 7:50pm

It's the same place where throws is specified, because it has a similar role in the type system.

Doug

staninprague · October 30, 2020, 8:13pm

Good explanation! I wonder though why:

await other.asyncFunction(otherActor: self)

In Rust, one would have:

other.asyncFunction(otherActor: self).await

and optionally:

other.asyncFunction(otherActor: self).await?

Which is quite handy, as other.asyncFunction(otherActor: self) returns a Future. Does await turn this into the type returned by the Future? With .await you can chain on this. With Rust, if this is a Result<Success, Error> type, then ? resolves it to Success or returns Error from the "block".

Will await other.asyncFunction(otherActor: self) need parentheses to chain on its results?

ydnar · October 30, 2020, 8:25pm

Async is an adjective, and throws is a verb, so adjective-object-verb makes sense. How/why does the type system factor into this?

Jumhyn · October 30, 2020, 8:28pm

async and throws are both part of the type of a function. I.e, if you have

func foo() async throws {}

let bar = foo

then the type of bar is () async throws -> (). OTOH, if you had something like:

private func foo() {}

public let bar = foo

then the type of bar is just () -> () (i.e., private is not part of the type).

Lantua · October 30, 2020, 8:31pm

Unrelated, but what's that language? It doesn't seem to be English.

ydnar · October 30, 2020, 8:32pm

Subject, rather.

mayoff · October 30, 2020, 8:44pm

Or, async is an adverb, being an abbreviation for “asynchronously”: () async -> Int is a function that asynchronously returns an integer.

John_McCall · October 30, 2020, 8:50pm

You can find a detailed discussion of this in the asynchronous functions proposal. The short answer is that await is like try and doesn't need to be pedantically placed on the exact asynchronous operation, as long as it logically "covers" it.
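
As a rough sketch of what "covering" means in practice (fetchName, chained, and combined are hypothetical names used only for illustration):

```swift
// Hypothetical async function returning a value we want to chain on.
func fetchName() async -> String {
    "swift"
}

func chained() async -> Int {
    // One await covers the whole expression to its right, so no parentheses
    // are needed to call a member on the awaited result.
    await fetchName().uppercased().count
}

func combined() async -> String {
    // A single await also covers several asynchronous calls in one expression.
    await fetchName() + fetchName()
}
```

Both functions compile with a single await each, even though combined contains two potential suspension points.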

Jay-Madden · October 30, 2020, 9:03pm

Are there any plans for naming conventions for async functions? For example, in C# it's convention to postfix async function names with "Async". Is this something that has been discussed for Swift?

Jon_Shier · October 30, 2020, 9:09pm

It doesn't seem necessary to postfix the name when the async attribute requires the use of await at the call site.