[Concurrency] Structured concurrency

Hi, folks.

Any concurrency design has to provide some way of actually running code concurrently. We propose embracing the approach of structured concurrency, in which concurrent programs are organized into tasks and child tasks rather than simply a sea of formally-unrelated threads and jobs. Structured concurrency allows us to neatly address a large number of systemic problems with task management that would otherwise need a lot of ad hoc support.

This is part of a series of pitches that jointly form the Swift concurrency proposal. You can find a roadmap for the entire proposal here. I have posted a portion of the proposal below; the full text is also available. This proposal is closely related to the proposal for asynchronous functions; it is probably best to start with that proposal, then read this one.

The text of this proposal has changed since this initial post; please read the copy at the link rather than the text below.


Introduction

async/await is a language mechanism for writing natural, efficient asynchronous code. Asynchronous functions (introduced with async) can give up the thread on which they are executing at any given suspension point (marked with await), which is necessary for building highly-concurrent systems.

However, the async/await proposal does not introduce concurrency per se: ignoring the suspension points within an asynchronous function, it will execute in essentially the same manner as a synchronous function. This proposal introduces support for structured concurrency in Swift, enabling concurrency execution of asynchronous code with a model that is ergonomic, predictable, and admits efficient implementation.

Motivation

For a simple example, let's make dinner, asynchronously:

func chopVegetables() async throws -> [Vegetable] { ... }
func marinateMeat() async -> Meat { ... }
func preheatOven(temperature: Double) async throws -> Oven { ... }

// ...

func makeDinner() async throws -> Meal {
  let veggies = await try chopVegetables()
  let meat = await marinateMeat()
  let oven = await try preheatOven(temperature: 350)

  let dish = Dish(ingredients: [veggies, meat])
  return await try oven.cook(dish, duration: .hours(3))
}

Each step in our dinner preparation is an asynchronous operation, so there are numerous suspension points. While waiting for the vegetables to be chopped, makeDinner won't block a thread: it will suspend until the vegetables are available, then resume. Presumably, many dinners could be in various stages of preparation, with most suspended until their current step is completed.

However, even though our dinner preparation is asynchronous, it is still sequential. It waits until the vegetables have been chopped before starting to marinate the meat, then waits again until the meat is ready before preheating the oven. Our hungry patrons will be very hungry indeed by the time dinner is finally done.

To make dinner preparation go faster, we need to perform some of these steps concurrently. To do so, we can break down our recipe into different tasks that can happen in parallel. The vegetables can be chopped at the same time that the meat is marinating and the oven is preheating. Sometimes there are dependencies between tasks: as soon as the vegetables and meat are ready, we can combine them in a dish, but we can't put that dish into the oven until the oven is hot. All of these tasks are part of the larger task of making dinner. When all of these tasks are complete, dinner is served.

This proposal aims to provide the necessary tools to carve work up into smaller tasks that can run concurrently, to allow tasks to wait for each other to complete, and to effectively manage the overall progress of a task.

Proposed solution

Our approach follows the principles of structured concurrency. All asynchronous functions run as part of an asynchronous task. Tasks can conveniently make child tasks that will perform work concurrently. This creates a hierarchy of tasks, and information can conveniently flow up and down the hierarchy, making it convenient to manage the whole thing holistically.

Child tasks

This proposal introduces an easy way to create child tasks with async let:

func makeDinner() async throws -> Meal {
  async let veggies = try chopVegetables()
  async let meat = marinateMeat()
  async let oven = try preheatOven(temperature: 350)

  let dish = Dish(ingredients: await [veggies, meat])
  return await try oven.cook(dish, duration: .hours(3))
}

async let is similar to a let, in that it defines a local constant that is initialized by the expression on the right-hand side of the =. However, it differs in that the initializer expression is evaluated in a separate, concurrently-executing child task. On completion, the child task will initialize the variables in the async let and complete.

Because the main body of the function executes concurrently with its child tasks, it is possible that makeDinner will reach the point where it needs the value of an async let (say, veggies) before that value has been produced. To account for that, reading a variable defined by an async let is treated as a suspension point, and therefore must be marked with await. The task will suspend until the child task has completed initialization of the variable, and then resume.

One can think of async let as introducing a (hidden) future, which is created at the point of declaration of the async let and whose value is retrieved at the await. In this sense, async let is syntactic sugar to futures.

However, child tasks in the proposed structured-concurrency model are (intentionally) more restricted than general-purpose futures. Unlike in a typical futures implementation, a child task does not persist beyond the scope in which is was created. By the time the scope exits, the child task must either have completed, or it will be implicitly cancelled. This structure both makes it easier to reason about the concurrent tasks that are executing within a given scope, and also unlocks numerous optimization opportunities for the compiler and runtime.

Bringing it back to our example, note that the chopVegetables() function might throw an error if, say, there is an incident with the kitchen knife. That thrown error completes the child task for chopping the vegetables. The error will then be propagated out of the makeDinner() function, as expected. On exiting the body of the makeDinner() function, any child tasks that have not yet completed (marinating the meat or preheating the oven, maybe both) will be automatically cancelled.

Nurseries

The async let construct makes it easy to create a set number of child tasks and associate them with variables. However, the construct does not work as well with dynamic workloads, where we don't know the number child tasks we will need to create because (for example) it is dependent on the size of a data structure. For that, we need a more dynamic construct: a task nursery.

A nursery defines a scope in which one can create new child tasks programmatically. As with all child tasks, the child tasks within the nursery must complete when the scope exits or they will be implicitly cancelled. Nurseries also provide utilities for working with the child tasks, e.g., by waiting until the next child task completes.

To stretch our example even further, let's consider our chopVegetables() operation, which produces an array of Vegetable values. With enough cooks, we could chop our vegetables even faster if we divided up the chopping for each kind of vegetable.

Let's start with a sequential version of chopVegetables():

/// Sequentially chop the vegetables.
func chopVegetables() async throws -> [Vegetable] {
  var veggies: [Vegetable] = gatherRawVeggies()
  for i in veggies.indices {
    veggies[i] = await try veggies[i].chopped()
  }
  return veggies
}

Introducing async let into the loop would not produce any meaningful concurrency, because each async let would need to complete before the next iteration of the loop could start. To create child tasks programmatically, we introduce a new nursery scope via withNursery:

/// Sequentially chop the vegetables.
func chopVegetables() async throws -> [Vegetable] {
  // Create a task nursery where each task produces (Int, Vegetable).
  Task.withNursery(resultType: (Int, Vegetable).self) { nursery in 
    var veggies: [Vegetable] = gatherRawVeggies()
    
    // Create a new child task for each vegetable that needs to be 
    // chopped.
    for i in rawVeggies.indices {
      await try nursery.add { 
        (i, veggies[i].chopped())
      }
    }

    // Wait for all of the chopping to complete, slotting each result
    // into its place in the array as it becomes available.
    while let (index, choppedVeggie) = await try nursery.next() {
      veggies[index] = choppedVeggie
    }
    
    return veggies
  }
}

The withNursery(resultType:body:) function introduces a new scope in which child tasks can be created (using the nursery's add(_:) method). The next() method waits for the next child task to complete, providing the result value from the child task. In our example above, each child task carries the index where the result should go, along with the chopped vegetable.

As with the child tasks created by async let, if the closure passed to withNursery exits without having completed all child tasks, any remaining child tasks will automatically be cancelled.

Detached tasks

Thus far, every task we have created is a child task, whose lifetime is limited by the scope in which is created. This does not allow for new tasks to be created that outlive the current scope.

The runDetached operation creates a new task. It accepts a closure, which will be executed as the body of the task. Here, we create a new, detached task to make dinner:

let dinnerHandle = Task.runDetached {
  await makeDinner()
}

The result of runDetached is a task handle, which can be used to retrieve the result of the operation when it completes (via get()) or cancel the task if the result is no longer desired (via cancel()). Unlike child tasks, detached tasks aren't cancelled even if there are no remaining uses of their task handle, so runDetached is suitable for operations for which the program does not need to observe completion.

47 Likes

There's a kind of confusing note in the Nurseries: Parent task cancellation section:

NOTE: Presently nurseries do not automatically check for cancellation. They could for example check for it when adding new tasks, such that nursery.add() would throw if the nursery is cancelled -- so we don't needlessly keep adding more work while our parent task has been cancelled already anyway. This would require add to be async and throwing which makes the API a bit unwieldly.

Elsewhere in the proposal, add appears to be both async and throws; it's called like await try nursery.add { ā€¦ }.

It seems like the note might be obsolete, and Nursery.add does check for cancellation and throw. Is that right?

1 Like

iinm the need for await try here might not be because add itself is async throws but because the closure passes as parameter contains a call to an async throwing method, namely veggies[i].chopped(); and since, similar to try, you can also use a single await in front of the whole expression if an expression contains one or more async calls in the full expression/chain, await try add { v.chopped() } would likely be valid even if arguably add { await try v.chopped() } might be less confusing for understanding what part of the whole expression actually suspends and can throw?

While this makes sense, it seems to me like another option would be to have async replace await directly in an expression, instead of modifying let.

That is why was a syntax like the following not considered:

func makeDinner() async throws -> Meal {
 let veggies = try async chopVegetables()
 let meat = async marinateMeat()
 let oven = try async preheatOven(temperature: 350)

  let dish = Dish(ingredients: await [veggies, meat])
  return await try oven.cook(dish, duration: .hours(3))
}
3 Likes

I don't have a concrete opinion on that syntax yet, but my first instinct is that it looks quite similar to await. :thinking::thinking:

Two questions:

  1. The pitch mentions "checking" for cancellation. This is fine for predicates (e.g. if not cancelled, continue) but I wouldn't want to poll this value to detect when I should cancelling long-running tasks. It's not clear from the pitch how more-explicit cancellation work (e.g. cancelling DispatchSource, URLSession tasks, etc) would work. Is there an expectation that Task would offer a configurable closure to run on cancellation?

  2. Is it expected that a Task produce a single result? How do you envisage event sources, streams and other kinds of processing occurring? Would they require an Actor to receive the result?

A minor point but I'm not a fan of the aesthetics of

Task.withNursery(resultType: (Int, Vegetable).self) { nursery in

I think Task<(Int, Vegetable)> { nursery in would be much better, if you could find a way to make to make it work.

13 Likes

Yes, thatā€™s correct, Iā€™ll fix it. Nurseries are meant to be responsive both to cancellation and to back-pressure, at least by default. We may also provide a way to create child tasks unconditionally, for cases where it is semantically necessary, but it would be discouraged.

Which line throws this errorā€”the line where chopVegetables is called, or where veggies is await'ed? If, for instance, the developer wanted to handle the error within makeDinner.

I'm assuming it would be the line where veggies is await'ed, for ergonomics. That's where the developer actually cares about whether the call succeeded or failed, and I can't imagine what the alternative would look like in terms of control flowā€”the whole point of async let is to let the caller defer evaluation and continue, so I'm not sure how the function would "jump back" there for error handling.

But then, shouldn't it be the await try instead of async letā€¦ = tryā€¦?

EDIT: Upon reading the full text of the proposal, I see

If the initializer of the async let can throw an error, then each reference to a variable declared within that async let is considered to throw an error, and therefore must also be enclosed in one of try / try! / try? .

So I think that perhaps the makeDinner example should read await try [veggies, meat]. But the full text of the proposal uses async letā€¦ = try in its examples as well, and I remain confused about why that is.

1 Like

Yes, that is correct. Semantically the "place where the (potentially throwing) async let" is awaited on will throw. I think you're right that the current spelling in our example then reads a bit misleading. Think about it even in terms of time -- the async computation is in flight, on the line of the async let, it's not really throwing "yet", it only throws when it completes so it makes sense that that would be at the await site -- where we synchronize with the completion.

Any await that involves a potentially throwing async let I suppose should also be marked await try - that sounds reasonable to me.

4 Likes

Why is UnsafeContinuation a task, and not a (currying) function? I think that's a better place for an escape hatch.

Yes, exactly. There will be a way for a task to register that it needs to react instantaneously to cancellation. I know what a good, thread-safe implementation looks like there, but I'm still figuring out how best to expose it to users because of the somewhat awkward interaction with withUnsafeContinuation ā€” the function typically won't be available until after the program makes some call (which returns a token which can be cancelled somehow), but that's not appropriately nested with the withUnsafeContinuation block.

Asynchronous streams are an active point of design. I would expect that at the basic level there would be some sort of interaction between tasks there, but not an inherent connection to actors. If the generating or receiving task needs an actor, that's its business.

4 Likes

Can you explain what you're thinking of a little more clearly?

Sorry, I got lost in all the Unsafe keyword. nvm that.


On a separated note, given that Task.withUnsafeThrowingContinuation throws an (untyped) error, maybe the error in UnsafeThrowingContinuation should also be untyped?

struct UnsafeThrowingContinuation<T> {
  func resume(returning: T)
  func resume(throwing: Error)
}

since it could also be catching an untyped error:

Task.withUnsafeThrowingContinuation { continuation in
  doSomething {
    try {
      ...
    catch {
      continuation.resume(throwing: error)
      return
    }
    continuation.resume(returning: ...)
  }
}
1 Like

Since (unlike some other cases with task handles) we're not planning on ever injecting different errors on ordinary continuation boundaries, making throwing continuations untyped would preclude adding typed errors in the future.

1 Like

I see, though I think that when/if we need to inject error type, we'd need to introduce a new overload anyway, at which point it's just be unnecessary generic. We likely have different standard for future-proofing so that as far as my arguments go.

Regarding the syntax, it would be nice if the async keyword was consistently placed to the left of what is being made asynchronous.

For example this seems to make more sense regarding naming consistency:

async func getFoo() -> String { ... }
async let foo = getFoo()

2 Likes

Bikeshedding the spelling of async functions is firmly outside the scope of this conversation, but I think this does bring up a fair point: it is conceptually more of an async let or rather more helpful to think of it as an async =?

IMO, async let x = reads better than let x async = because the former keeps the keywords together rather than splitting them around the variable. And I think that consistency with throws is far more important than consistency between uses of async.

6 Likes

This wasn't a bikeshedding question. The question is, in the model we surface to the user, is it the let binding that's async or rather the assignment that's async? (Obviously, if conceptually it makes sense to have a notion of "async assignment," it can be spelled however we feel is most elegant. For bikeshedding purposes we could imagine a different assignment operator such as <-.)

One case to think about with respect to this question: does it make sense for this functionality to compose with if let? If so, then it seems to me that it's more the assignment that's async. If not, then it kind of seems like async let is a different creature than an ordinary let binding.

If I understand the feature correctly itā€™s neither (the value is bound and assigned to something, just not the end value). This usage of async is really just the parallel version of await (barring any subtleties Iā€™m not aware of), and so the usage should really just be let x = async y(). I think this would best indicate to the user that this usage is different than await and would parallel the current model of DispatchQueue.async. It wasnā€™t clear from the proposal why moving the modifier to the declaration was chosen in the async case.

2 Likes