How is an async function actually executed?

In Swift Concurrency, how is an async function actually executed?

When it hits an await, it gives up its thread so the system can do other work, and later resumes using a continuation when the async operation completes.

But during that suspended period, where is the async function being ‘processed’ if it’s not on a thread?

I thought this talk might be a good one to catch up on—especially the first ~20 minutes, which cover details about suspension and threads in depth, with helpful illustrations.

7 Likes

In most cases it can simply reuse the thread that was just given up (this is called "executor stealing"), which is very good for e.g. cache locality and minimizing context switching overhead. In some cases, it will ask the underlying system (via libdispatch on Darwin-based OSs) to spawn a thread for it, up to one thread per QoS level per core. For the main actor, specifically, there's a dedicated thread.

5 Likes

That video that kersten shared with us outlines the basic process: At a suspension point, the current async frame is saved (in the heap), the thread is released to perform other work, and the asynchronously called routine gets its own stack frame. When that asynchronously called function finishes and returns its result, the previously saved async frame is restored, including its prior stack, and the continuation executes.
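To make the continuation mechanics concrete, here is a minimal sketch of bridging a callback-based API into an async function. The callback API `fetchValue(completion:)` is invented for illustration; the point is that the task suspends at the `await` and is resumed by the continuation, exactly as described above.

```swift
import Foundation

// Hypothetical callback-based API (an assumption for illustration).
func fetchValue(completion: @escaping (Int) -> Void) {
    DispatchQueue.global().async { completion(42) }
}

// Async wrapper: at the `await`, the calling task suspends and its thread is
// freed for other work; `resume(returning:)` later reschedules the task.
func fetchValue() async -> Int {
    await withCheckedContinuation { continuation in
        fetchValue { value in
            continuation.resume(returning: value)
        }
    }
}
```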

So it is less that the async function is being “processed” anywhere while suspended; rather, its async frame is simply saved for us, and this frame is restored when its continuation runs.
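A tiny sketch of what "the frame is saved" means in practice: locals declared before a suspension point live in the heap-allocated async frame, so they are still available after the task resumes, even though the thread was given up in between.

```swift
// Minimal sketch: locals live in the heap-allocated async frame, so they
// survive suspension even though the thread is released at the `await`.
func frameDemo() async -> Int {
    let saved = 10                                   // stored in the async frame
    try? await Task.sleep(nanoseconds: 1_000_000)    // suspension: thread released
    return saved + 1                                 // frame restored on resume
}
```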

4 Likes

It may actually help to break this down further.

Tasks are abstractions that provide the illusion of a continuous computation sequence. They are provided by the Swift concurrency runtime. Behind the scenes, a task is executed by one thread at a time, but it can also be suspended, and when it is suspended, there is no thread responsible for executing it at all. When a task suspends, it is always for a specific reason, and when that reason resolves itself, the concurrency runtime schedules a thread to resume executing the task.

Threads themselves are also abstractions that provide the illusion of a continuous computation sequence. They are provided by (typically) the operating system kernel. Behind the scenes, a thread is executed by one CPU core at a time, but it can also be suspended, and when it is suspended, there is no CPU core responsible for executing it at all. When a thread suspends, it is always for a specific reason, and when that reason resolves itself, the OS schedules a CPU core to resume executing the thread.

Tasks and threads thus have a lot in common, and you can see a task as a sort of thread abstraction implemented on top of threads. The difference is mainly that the technical design of Swift's tasks intentionally trades speed (async calls are much more expensive than sync calls) for lower resource usage during suspension and thus better overall scaling. This is necessary in typical application environments because kernel threads are a heavyweight resource, as required by typical platform ABIs, and Swift is tied closely to that ABI because of its C interoperation goals. But in fact we've also investigated alternative implementation strategies that map tasks 1-1 to threads, and this is an option in Embedded Swift, in support of targets that assign special meaning to threads and don't want Swift to invent its own blocking mechanisms or to abandon threads.
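The scaling trade-off described above can be seen in a small sketch: many concurrently suspended tasks coexist without a matching number of kernel threads, because a suspended task holds no thread at all (unlike a thread blocked in `Thread.sleep`). The function name and counts here are illustrative only.

```swift
// Sketch: many suspended tasks, but only a small pool of threads. Each
// `Task.sleep` suspends its task; no thread is held during the sleep.
func countSleepers(_ n: Int) async -> Int {
    await withTaskGroup(of: Int.self) { group in
        for _ in 0..<n {
            group.addTask {
                try? await Task.sleep(nanoseconds: 10_000_000) // suspends; no thread held
                return 1
            }
        }
        return await group.reduce(0, +)
    }
}
```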

14 Likes

The async/await proposal, SE-0296, can help you improve your understanding of the subject.


Proposed solution: async/await

Asynchronous functions—often known as async/await—allow asynchronous code to be written as if it were straight-line, synchronous code. This immediately addresses many of the problems described above by allowing programmers to make full use of the same language constructs that are available to synchronous code. The use of async/await also naturally preserves the semantic structure of the code, providing information necessary for at least three cross-cutting improvements to the language: (1) better performance for asynchronous code; (2) better tooling to provide a more consistent experience while debugging, profiling, and exploring code; and (3) a foundation for future concurrency features like task priority and cancellation. The example from the prior section demonstrates how async/await drastically simplifies asynchronous code:

func loadWebResource(_ path: String) async throws -> Resource
func decodeImage(_ r1: Resource, _ r2: Resource) async throws -> Image
func dewarpAndCleanupImage(_ i : Image) async throws -> Image

func processImageData() async throws -> Image {
  let dataResource  = try await loadWebResource("dataprofile.txt")
  let imageResource = try await loadWebResource("imagedata.dat")
  let imageTmp      = try await decodeImage(dataResource, imageResource)
  let imageResult   = try await dewarpAndCleanupImage(imageTmp)
  return imageResult
}

Many descriptions of async/await discuss it through a common implementation mechanism: a compiler pass which divides a function into multiple components. This is important at a low level of abstraction in order to understand how the machine is operating, but at a high level we’d like to encourage you to ignore it. Instead, think of an asynchronous function as an ordinary function that has the special power to give up its thread. Asynchronous functions don’t typically use this power directly; instead, they make calls, and sometimes these calls will require them to give up their thread and wait for something to happen. When that thing is complete, the function will resume executing again.

The analogy with synchronous functions is very strong. A synchronous function can make a call; when it does, the function immediately waits for the call to complete. Once the call completes, control returns to the function and picks up where it left off. The same thing is true with an asynchronous function: it can make calls as usual; when it does, it (normally) immediately waits for the call to complete. Once the call completes, control returns to the function and it picks up where it was. The only difference is that synchronous functions get to take full advantage of (part of) their thread and its stack, whereas asynchronous functions are able to completely give up that stack and use their own, separate storage. This additional power given to asynchronous functions has some implementation cost, but we can reduce that quite a bit by designing holistically around it.

Because asynchronous functions must be able to abandon their thread, and synchronous functions don’t know how to abandon a thread, a synchronous function can’t ordinarily call an asynchronous function: the asynchronous function would only be able to give up the part of the thread it occupied, and if it tried, its synchronous caller would treat it like a return and try to pick up where it was, only without a return value. The only way to make this work in general would be to block the entire thread until the asynchronous function was resumed and completed, and that would completely defeat the purpose of asynchronous functions, as well as having nasty systemic effects.
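The practical escape hatch here (not part of the quoted proposal text, but standard Swift) is that a synchronous function can start a new `Task` that performs the `await` on its behalf. The sketch below uses invented function names; the sync caller returns immediately rather than blocking its thread.

```swift
// Sketch: a synchronous function can't `await`, but it can start a Task
// that does. The sync caller returns immediately; the async work runs on
// its own task rather than blocking the caller's thread.
func compute() async -> Int { 21 * 2 }

func kickOffWork() {
    Task {
        let result = await compute()
        print("computed:", result)   // runs later, after compute() finishes
    }
}
```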

In contrast, an asynchronous function can call either synchronous or asynchronous functions. While it’s calling a synchronous function, of course, it can’t give up its thread. In fact, asynchronous functions never just spontaneously give up their thread; they only give up their thread when they reach what’s called a suspension point. A suspension point can occur directly within a function, or it can occur within another asynchronous function that the function calls, but in either case the function and all of its asynchronous callers simultaneously abandon the thread. (In practice, asynchronous functions are compiled to not depend on the thread during an asynchronous call, so that only the innermost function needs to do any extra work.)

When control returns to an asynchronous function, it picks up exactly where it was. That doesn’t necessarily mean that it’ll be running on the exact same thread it was before, because the language doesn’t guarantee that after a suspension. In this design, threads are mostly an implementation mechanism, not a part of the intended interface to concurrency. However, many asynchronous functions are not just asynchronous: they’re also associated with specific actors (which are the subject of a separate proposal), and they’re always supposed to run as part of that actor. Swift does guarantee that such functions will in fact return to their actor to finish executing. Accordingly, libraries that use threads directly for state isolation—for example, by creating their own threads and scheduling tasks sequentially onto them—should generally model those threads as actors in Swift in order to allow these basic language guarantees to function properly.
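A hedged sketch of the last point: a library that previously confined state to a dedicated thread can model that confinement as an actor. Callers hop onto the actor's executor at each call, and Swift guarantees the actor's methods resume on that actor, regardless of which thread the caller was using. `EventLog` is a name invented here.

```swift
// Sketch: modeling a state-isolating "thread" as an actor. All access to
// `lines` is serialized by the actor, with no manual thread management.
actor EventLog {
    private var lines: [String] = []

    func append(_ line: String) {
        lines.append(line)           // always runs isolated to EventLog
    }

    var count: Int { lines.count }
}
```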

Suspension points

A suspension point is a point in the execution of an asynchronous function where it has to give up its thread. Suspension points are always associated with some deterministic, syntactically explicit event in the function; they’re never hidden or asynchronous from the function’s perspective. The primary form of suspension point is a call to an asynchronous function associated with a different execution context.

It is important that suspension points are only associated with explicit operations. In fact, it’s so important that this proposal requires that calls that might suspend be enclosed in an await expression. These calls are referred to as potential suspension points, because it is not known statically whether they will actually suspend: that depends both on code not visible at the call site (e.g., the callee might depend on asynchronous I/O) as well as dynamic conditions (e.g., whether that asynchronous I/O will have to wait to complete).

The requirement for await on potential suspension points follows Swift's precedent of requiring try expressions to cover calls to functions that can throw errors. Marking potential suspension points is particularly important because suspensions interrupt atomicity. For example, if an asynchronous function is running within a given context that is protected by a serial queue, reaching a suspension point means that other code can be interleaved on that same serial queue. A classic but somewhat hackneyed example where this atomicity matters is modeling a bank: if a deposit is credited to one account, but the operation suspends before processing a matched withdrawal, it creates a window where those funds can be double-spent. A more germane example for many Swift programmers is a UI thread: the suspension points are the points where the UI can be shown to the user, so programs that build part of their UI and then suspend risk presenting a flickering, partially-constructed UI. (Note that suspension points are also called out explicitly in code using explicit callbacks: the suspension happens between the point where the outer function returns and the callback starts running.) Requiring that all potential suspension points are marked allows programmers to safely assume that places without potential suspension points will behave atomically, as well as to more easily recognize problematic non-atomic patterns.
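The bank example above can be sketched with an actor. Within an actor, code between suspension points runs atomically; inserting an `await` between the check and the debit opens exactly the interleaving window the proposal describes. This is an illustrative sketch, not code from the proposal.

```swift
// Hedged sketch of the bank example: code between awaits is atomic, but a
// suspension point lets other calls on the same actor interleave.
actor Account {
    private(set) var balance = 100

    // Safe: no suspension point between the check and the debit.
    func withdraw(_ amount: Int) -> Bool {
        guard balance >= amount else { return false }
        balance -= amount
        return true
    }

    // Risky pattern: the `await` lets other calls interleave, so `balance`
    // may have changed by the time the debit runs.
    func slowWithdraw(_ amount: Int) async -> Bool {
        guard balance >= amount else { return false }
        await Task.yield()           // potential interleaving happens here
        balance -= amount
        return true
    }
}
```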

Because potential suspension points can only appear at points explicitly marked within an asynchronous function, long computations can still block threads. This might happen when calling a synchronous function that just does a lot of work, or when encountering a particularly intense computational loop written directly in an asynchronous function. In either case, the thread cannot interleave code while these computations are running, which is usually the right choice for correctness, but can also become a scalability problem. Asynchronous programs that need to do intense computation should generally run it in a separate context. When that’s not feasible, there will be library facilities to artificially suspend and allow other operations to be interleaved.
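One such library facility that shipped is `Task.yield()`. The sketch below shows the pattern: periodically inserting a suspension point into an otherwise thread-occupying loop so other tasks can interleave, without moving the work to a separate context. The function and yield interval are illustrative.

```swift
// Sketch of "artificially suspend": Task.yield() inserts a suspension point
// into a long computation, letting other tasks run in between chunks.
func sumOfSquares(upTo n: Int) async -> Int {
    var total = 0
    for i in 1...n {
        total += i * i
        if i % 1_000 == 0 {
            await Task.yield()       // give other tasks a chance to run
        }
    }
    return total
}
```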

Asynchronous functions should avoid calling functions that can actually block the thread, especially if they can block it waiting for work that’s not guaranteed to be currently running. For example, acquiring a mutex can only block until some currently-running thread gives up the mutex; this is sometimes acceptable but must be used carefully to avoid introducing deadlocks or artificial scalability problems. In contrast, waiting on a condition variable can block until some arbitrary other work gets scheduled that signals the variable; this pattern goes strongly against recommendation.
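One way to avoid the condition-variable anti-pattern is to suspend the task on a continuation instead of blocking the thread, and resume it when the signal arrives. The sketch below is an invented illustration (`AsyncSignal` is not a standard type), assuming a simple one-shot signal:

```swift
import Foundation

// Hedged sketch: rather than blocking a thread on a condition variable, the
// waiting task suspends on a continuation; signal() resumes any waiters.
final class AsyncSignal: @unchecked Sendable {
    private let lock = NSLock()
    private var waiters: [CheckedContinuation<Void, Never>] = []
    private var signaled = false

    func wait() async {
        await withCheckedContinuation { continuation in
            lock.lock()
            if signaled {
                lock.unlock()
                continuation.resume()        // already signaled: resume at once
            } else {
                waiters.append(continuation) // suspend; no thread is held
                lock.unlock()
            }
        }
    }

    func signal() {
        lock.lock()
        signaled = true
        let resumable = waiters
        waiters = []
        lock.unlock()
        resumable.forEach { $0.resume() }    // reschedule the waiting tasks
    }
}
```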

Another proposal: Actors - Proposal: SE-0306

Although they are well written, you may need to read them a couple of times to digest them fully.

3 Likes

@zekexros, may I ask what the context of your question is?
I'm just asking because over the last one or two years I have talked about the concurrency model in Swift with several colleagues & students and often noticed that these kinds of discussions and interesting questions unveil various "mental models" that people have about how their code works.
Sometimes they clash with what's actually happening and I always find it exciting to learn something new about how the code "really" works. Indeed, learning how Swift models concurrency has taught me much about the underlying problems.

Because of that, I am always looking for anecdotes of people tackling these concepts, from a sort of "human-computer interaction" or educational point of view. This might also help the wider community see where and how documentation can be improved to help people learn all of this.