[Pitch] Improve Async/Await Parallelization Ergonomics

Hey, I wanted to share a couple of potential options for improving upon async let and (Throwing)TaskGroup. Let me know what you think!

Pitch: Improve Async/Await Parallelization Ergonomics

Introduction

I recently ran a poll on Mastodon, and I was happy to learn that I wasn't alone in not having a good mental model of how to execute work in parallel using async/await:

Poll: do these requests execute sequentially or in parallel?

let (image1, image2) = try await (
    session.image(for: request1), 
    session.image(for: request2)
)

40% of people answered "parallel", including me. 
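For reference, here is a minimal sketch of how the two spellings behave today. The `EventLog` actor and the `fetch` helper are hypothetical stand-ins for `session.image(for:)`, introduced only to make the ordering observable:

```swift
// Records events in arrival order; an actor keeps the appends data-race free.
actor EventLog {
    private(set) var events: [String] = []
    func record(_ event: String) { events.append(event) }
}

// Hypothetical stand-in for `session.image(for:)`.
func fetch(_ name: String, log: EventLog) async -> String {
    await log.record("start \(name)")
    try? await Task.sleep(nanoseconds: 10_000_000) // simulate latency
    await log.record("end \(name)")
    return name
}

// One `await` covering a tuple: the operands run one after another.
func sequentialTuple() async -> [String] {
    let log = EventLog()
    _ = await (fetch("a", log: log), fetch("b", log: log))
    return await log.events
}

// `async let` starts both child tasks before either result is awaited.
func parallelPair() async -> (String, String) {
    let log = EventLog()
    async let first = fetch("a", log: log)
    async let second = fetch("b", log: log)
    return await (first, second)
}
```

In the tuple version, "b" only starts after "a" has ended, so the answer to the poll under the current language rules is "sequential".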

As a related area, I think it's fair to say that the APIs provided by both TaskGroup and its convenience wrapper async let are fairly complex and not easily discoverable. There needs to be a simple way to achieve basic tasks like performing an async map or executing two tasks in parallel without the need for intermediate async let properties. While some of it could be done by introducing new ad-hoc APIs like a parallel async map, I think there may be an alternative that better fits Swift.
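To illustrate the boilerplate in question, this is roughly what an order-preserving parallel map looks like with today's APIs. It is only a sketch; `parallelMap` is a hypothetical helper name, not an existing standard library function:

```swift
// Hypothetical helper: runs `transform` for every element in child tasks
// and returns the results in the original order.
func parallelMap<Input: Sendable, Output: Sendable>(
    _ inputs: [Input],
    _ transform: @escaping @Sendable (Input) async throws -> Output
) async throws -> [Output] {
    try await withThrowingTaskGroup(of: (Int, Output).self) { group in
        for (index, input) in inputs.enumerated() {
            group.addTask {
                let output = try await transform(input)
                return (index, output)
            }
        }
        // Results arrive in completion order, so slot them back by index.
        var results = [Output?](repeating: nil, count: inputs.count)
        for try await (index, output) in group {
            results[index] = output
        }
        return results.compactMap { $0 }
    }
}
```

A call site would then read `try await parallelMap(urls) { try await session.image(for: $0) }`, assuming `session.image(for:)` is an async throwing function as in the examples here.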

Proposed Solution

This pitch outlines how Async/Await could support parallel execution with a first-class future type. The previous decisions were made early in the development of Swift Concurrency, and with the new learnings and the renewed focus on ergonomics, it could be a great time to revisit some of them.

Base scenario. There are two await calls and two clear suspension points – sequential execution:

let image1 = await session.image(for: url1)
let image2 = await session.image(for: url2)

A single await for futures in a tuple – parallel execution:

let (image1, image2) = await (
    session.image(for: url1),
    session.image(for: url2)
)

A single await for a sequence of futures – parallel execution:

let images = await [
    session.image(for: url1), 
    session.image(for: url2)
]

A single await for a sequence of futures created using synchronous map – parallel execution:

let images = await urls.map { 
    session.image(for: $0)
}

The new API will compose well with throws, eliminating the need for a ThrowingTaskGroup:

let images = try await urls.map(session.image)

Mental Model

The mental model for parallel vs sequential execution is simple: futures start executing immediately when they are created. If there is only one suspension point (await), the execution can only be parallel. This clears up any confusion around the original example with a tuple.

Conditionally Asynchronous

One of the potential future additions could be an analog of the rethrows keyword but for async, to make closures conditionally asynchronous if they have any suspension points. An async version of map could look like:

let images = await urls.map {
    await session.image(for: $0)
}

If you apply the same mental model for parallel execution, since there is one await per URL, the execution will be serial. This version is not as important as you can already achieve the same with a simple for loop. The new version may open a way to eliminate some of the existing ad-hoc higher order functions.

Note: this version might be familiar to folks with a functional reactive programming background, but it could also create a situation where subtle syntax changes significantly alter the execution pattern, so it could be something to avoid.

Potential Implementation

When you invoke an async function, it returns a new Future<Value, Error> type. You can use await to unwrap it and retrieve a value. The await keyword can also be used on boxed types like sequences of Futures. The returned result isn't discardable.

The future starts executing immediately, which ensures that there are no changes to how task tree management works, including cancellation and priority propagation. If the async function requires a separate actor, the future automatically switches to it first.

There will be other questions, like what happens if you don't await a future, but, based on the precedent from other languages, it is a non-problem. It's not a common case, and you'll have to explicitly discard the future to do it (similarly to how you can invoke a callback-based function without providing a callback). It should continue running unless the task containing it is canceled.

The original structured concurrency proposal doesn't close the door on the idea of introducing explicit futures.

Language Fit

This direction is in line with the Swift language design. One similar early example is Optional, which was intentionally elevated to be a type and not a compiler primitive. The new type will require only the minimal compiler magic to allow using await on futures. It could also open possibilities for other novel APIs that compose well with other language features, just like Optional did.

Similar solutions in other languages: JavaScript's Promises and Kotlin's Deferred.

Source Compatibility

This change can be introduced with a new major compiler version bundled with other Swift Concurrency and Data Race Safety changes.

This is an oversimplification, but some if not most of the APIs are purely additive – the code from the examples won't compile under the current compiler. The only exception is a tuple-based await that compiles but currently uses serial execution, which is not clear or well-documented.

This proposal replaces async let, which can be deprecated. These properties will now have an explicit type instead of an opaque async let.

Potentially Big Nope: the new mental model for await as a way to unwrap a future (or a boxed future, aka monad) is incompatible with the current behavior of await, which is akin to the try keyword and adds suspension points for every async function in a subexpression (this skews a bit more toward the "compiler magic" side, as the runtime can't see subexpressions).

Future Direction

This proposal focuses on a narrow problem – syntax for parallel execution – but it opens a door for a discussion about what other ways of using Async/Await it may enable by representing futures as types and decoupling their execution from suspension points (await). For example, one could define operators for futures that could be applied using dot-notation instead of the current situation that supports only higher-order functions.

10 Likes

I think one of the concerns with doing this implicitly is starting too many tasks at once. If urls has 200 elements you probably don't want to start 200 network requests in parallel at the same time, but rather batch it in smaller chunks.

2 Likes

rather batch it in smaller chunks

In this example, the underlying system (session) manages how much work it can perform in parallel. The await on a sequence expresses the concern of waiting for the completion of all the tasks/futures, regardless of how or when they get scheduled or executed or in which order.

Adding more APIs to control the maximum number of async tasks that can be executed in parallel is a common request on forums. There aren't options available in Swift Concurrency that don't require a bit of boilerplate code, but there are many open-source options: some built on top of (Discarding)TaskGroup and some without it.
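As an illustration of that boilerplate, here is one common way to cap the number of child tasks with a plain TaskGroup. This is a sketch only; `boundedParallelMap` and `maxConcurrent` are hypothetical names, not existing APIs:

```swift
// Hypothetical helper: like a parallel map, but never runs more than
// `maxConcurrent` child tasks at the same time.
func boundedParallelMap<Input: Sendable, Output: Sendable>(
    _ inputs: [Input],
    maxConcurrent: Int,
    _ transform: @escaping @Sendable (Input) async -> Output
) async -> [Output] {
    precondition(maxConcurrent > 0)
    return await withTaskGroup(of: (Int, Output).self) { group in
        var results = [Output?](repeating: nil, count: inputs.count)
        var iterator = inputs.enumerated().makeIterator()
        // Seed the group with at most `maxConcurrent` child tasks.
        for _ in 0..<maxConcurrent {
            guard let (index, input) = iterator.next() else { break }
            group.addTask { (index, await transform(input)) }
        }
        // Each time a task finishes, store its result and start the next one.
        while let (index, output) = await group.next() {
            results[index] = output
            if let (nextIndex, nextInput) = iterator.next() {
                group.addTask { (nextIndex, await transform(nextInput)) }
            }
        }
        return results.compactMap { $0 }
    }
}
```

The group is topped up one task at a time, so at most `maxConcurrent` transforms are in flight, and the results are slotted back by index to keep the input order.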

I'd like to preface this by saying that the need for simpler parallelism (also known as "TaskGroup is too verbose") is certainly something I agree on and something we'd love to improve. However, the solution isn't as simple as the post paints it.

I don't agree, though, that a mechanical refactoring like the one proposed (await x; await y -> await (x, y)) should be considered to introduce some form of parallelism.

The model doesn't really work in practice though, consider calling methods:

await call(async())
// which is actually
await call(await async())

Swift works with "is the expression covered with an await" (or try, for that matter: try boom(b: boom()) == try boom(b: try boom())), and that's deeply ingrained in the language.

The same general principle applies to await (a(), b()) == (await a(), await b()). So this isn't as trivial as the pitch makes it seem: it would be a fundamental departure from how expressions can be composed in Swift without changing a program's meaning.
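A small sketch of that "covering" rule, using try since it can be shown without any concurrency. The `Boom` error and the `value` and `sum` functions are made up for the illustration:

```swift
struct Boom: Error {}

// Throws for negative input, otherwise returns the input unchanged.
func value(_ n: Int) throws -> Int {
    if n < 0 { throw Boom() }
    return n
}

func sum(_ a: Int, _ b: Int) throws -> Int {
    // A single `try` covers both calls to its right.
    try value(a) + value(b)
}

// Both functions mean the same thing: the outer `try` already covers
// the calls nested in the argument list.
func covered() throws -> Int { try sum(value(1), value(2)) }
func explicit() throws -> Int { try sum(try value(1), try value(2)) }
```

Adding or removing the inner markers changes nothing about evaluation order or behavior, and the same holds for await today.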


I don't think that's a desirable direction or outcome -- the reason we don't return task wrappers everywhere is in order to not have to create any such task to begin with. Calling an async function does not create new tasks - it shouldn't! It just runs in the existing task, rather than create any new resource.

Introducing futures isn't something that's necessary to achieve better structured concurrency ergonomics though. Task<T, E> is really future-like, there's no need to introduce new concepts here IMHO.
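To show what "future-like" means here, a minimal sketch with today's Task; `load` is a hypothetical stand-in for an async API call:

```swift
// Hypothetical stand-in for an async API call.
func load(_ id: Int) async -> String {
    await Task.yield()
    return "item \(id)"
}

func loadPair() async -> (String, String) {
    // Each unstructured Task starts running as soon as it is created.
    let first = Task { await load(1) }
    let second = Task { await load(2) }
    // `value` suspends until the task has produced its result.
    return await (first.value, second.value)
}
```

Note that these are unstructured tasks: unlike async let or a task group, cancelling the caller does not automatically cancel them, which is part of why a non-escapable child task reference would be a different thing.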

I would agree though that we're slowly getting to a point where we can consider introducing a non-escapable child task reference. This would be very helpful in many situations, including getting a handle to a child task from a TaskGroup.addTask (or maybe even a way to do it for async let).


I don't have a complete design off the top of my head here, but there are also API-level things we could do for this request to parallelize the execution of a set number of elements, which would be easier to call than:

// today
async let a = a()
async let b = b()
_ = await (a, b)

If the goal is just to compute a set number of values in child tasks, we could offer API solutions such as Task.sequence for a set number of values of same type:

// 
var xs: [X] = await Task.sequence(makeX(), makeX())

or maybe, if we allowed autoclosures with variadic generics, we could do the following for different types as well, as a pure library solution?

// func tuple<each T>(_ expressions: (@autoclosure () async -> T)...) -> (repeat each T)
let x: (X, Y) = await Task.tuple(makeX(), makeY())

This would be a very easy-to-write version of what you'd have to write today using async lets or an even bigger task group.
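Until something like that exists, a fixed-arity version can be written as a plain library function today. This is only a sketch: `both` is a hypothetical name, it takes explicit closures rather than autoclosures, and it handles exactly two values:

```swift
// Hypothetical two-value helper built on `async let`.
func both<A: Sendable, B: Sendable>(
    _ first: @escaping @Sendable () async -> A,
    _ second: @escaping @Sendable () async -> B
) async -> (A, B) {
    // Both child tasks are started before either result is awaited.
    async let a = first()
    async let b = second()
    return await (a, b)
}
```

A call site would look like `let (x, y) = await both({ await makeX() }, { await makeY() })`, which is noisier than the pitched spelling but supports heterogeneous result types.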


I also think we can simplify task groups themselves if we lean into non-escaping task groups... but I would need to think about it some more. That's a topic we should definitely invest in though.


Overall, definitely agreed on the goal of making it easier to get parallelism and creating child tasks!

However I think the road to get there might be different than changing how we interpret expressions like tuples and sequences. I think there's much we can do with "just" API work here, and leaning into new language features that have arrived ever since concurrency was first introduced. Especially non-escaping values may be quite helpful for us in modeling some of the "don't escape the task group" etc.

Perhaps with resource managers (with let group = TaskGroup()) we'd be able to avoid withTaskGroup { ... } nesting as well, if such resource manager could await at the end of its scope etc. There's lots to be done here, but probably in a slightly different approach than reinterpreting what arrays and tuples mean in a program.

A bit of a thoughts-dump, but I hope that's useful for further discussion!

15 Likes

This would be really helpful I think.

3 Likes

Every time I touch withTaskGroup I wonder whether it can be improved. Thanks for bringing it up, and I'm looking forward to the discussion!

Will add my two cents:

From the code's perspective, it's not really clear that the functions in a tuple will be executed in parallel. Imagine having a mental model that this is sequential (like everything in Swift) and then suddenly getting parallel execution and, potentially, data inconsistency. The same applies here:

Map is just a morphing (or transforming) function, and again nothing says it will run in parallel rather than sequentially (as you are used to). So overall I think we should still somehow highlight parallel execution, and @ktoso's suggestion with Task actually looks good as a first step:

I guess it's better to have both .sequence and .tuple, though again the naming could be confusing (will sequence have sequential or parallel execution? :thinking:). A more explicit option could be Task.parallel with variadic generic magic.

Another idea I have: while map is just about transforming, what it's mapping over really matters. We have LazySequence in Swift and the .lazy extension to execute map and filter lazily, so we can think about something like ParallelSequence and a .parallel extension, which will fire Task.parallel under the hood, something like:

let images = await urls.parallel.map { 
    await session.image(for: $0)
}

Could it be part of the async algorithms library? :thinking:
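For the sake of discussion, a rough sketch of what such a `.parallel` view could look like as a pure library addition, modeled loosely on how `.lazy` wraps a sequence. `ParallelView` and `parallel` are hypothetical names, and the implementation is a plain TaskGroup rather than any Task.parallel API:

```swift
// Hypothetical wrapper, analogous in spirit to the `.lazy` view.
struct ParallelView<Base: Sequence> where Base.Element: Sendable {
    let base: Base

    // Runs `transform` for each element in its own child task and
    // returns the results in the original order.
    func map<T: Sendable>(
        _ transform: @escaping @Sendable (Base.Element) async -> T
    ) async -> [T] {
        await withTaskGroup(of: (Int, T).self) { group in
            var count = 0
            for element in base {
                let index = count
                group.addTask { (index, await transform(element)) }
                count += 1
            }
            var results = [T?](repeating: nil, count: count)
            for await (index, value) in group {
                results[index] = value
            }
            return results.compactMap { $0 }
        }
    }
}

extension Sequence where Element: Sendable {
    var parallel: ParallelView<Self> { ParallelView(base: self) }
}
```

With this in place, `await urls.parallel.map { await session.image(for: $0) }` reads as in the example above, and the word "parallel" appears at the call site.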

3 Likes

I think we're missing some core operators in the core concurrency library, but we could incubate them in async algorithms.

The tuple method is not implementable in today's Swift. We'll need parameter packs to gain some features to be able to do this, but that should be possible.

The names are just random names I thought of there, we'd think of some more consistent names.

4 Likes

Also thinking it's a good way to start already.

Yeah, I'm just nitpicking, really. :slightly_smiling_face: The motivation, though, is that afaik we don't really have any words/keywords for the parallel nature in Swift.

Thanks, @ktoso. I appreciate you looking into it. My primary point was, in fact, the lack of easy-to-use core APIs for expressing the parallel execution, which is probably the only kind you need in most apps. The secondary point was the good old "compiler magic vs APIs".

await call(async())
// which is actually
await call(await async())

That is a great point. Just to make sure I understand, it's a language feature, similar to how try works to allow you to use a single await where there are two suspension points.

// a) This works
await fn2(await fn1())

// b) So does this
await fn2(fn1())

I can only speak for myself, but I'm more than happy with the "a" option where you explicitly indicate where the suspension points are. I can't say that it's a desirable feature to be able to skip an await in the "b" option. I would've preferred if the compiler required me to use both await in this example – the additional await makes it more clear. The point of await is to indicate where the suspension points are.

This is another and more common form of boxing/unboxing (monad) that you write daily and that has no special syntax:

if let value = fn1() {
   fn2(value)
}

fn1.map(fn2)

// Some would find this kind of API to be desirable
fn1 >>- fn2

the reason we don't return task wrappers everywhere is in order to not have to create any such task to begin with. Calling an async function does not create new tasks.

It's a fair concern. I don't think we want to create new tasks or introduce any performance regressions. Perhaps the compiler could reason about it and eliminate the overhead in forms like await fn() where the future is immediately unwrapped?

I also realize that Task has some aspects in common with the proposed Future: it starts immediately and caches the result, but I can see room for both. The language already has an opaque async let "type"; it's just that it has a very limited use and requires a keyword. The new type will replace it.

If these APIs were available in the core, it would eliminate a whole lot of pain points. There are precedents in other languages. They seem clear, easy to use, and consistent with each other (unlike async let and TaskGroup).

Map is just morphing (or transforming) function, and again nothing says it will perform in parallel rather than sequentially (as you used to)

@jaleel, it's a fair point that it may not be completely clear. However, I think the suggested mental model provides a simpler way to reason about it. A map is what you describe: it transforms all elements at once. The transform returns futures → futures start executing immediately → parallel. This is a pretty standard model for most implementations of futures that have no compiler assistance to augment how they execute. A clearer example is:

let futures = urls.map { session.image(for: $0) }
let images = await futures
1 Like

I just wanted to add a bit more on try vs await. I can see how the same logic can be applied to both, especially if you need both. A single try is enough to indicate that something in an expression can throw, and it usually doesn't matter much which part, as long as you handle errors. In the case of await, do you need to know which subexpressions can suspend? It's harder to tell. The main difference is that the await scenario is significantly less common, and you generally want to spell out the order of suspension points clearly in code. It feels more important, doesn't it? And this is already a very nice syntax for a complex concern:

try await fn1(await fn2())

Also, can the compiler continue supporting this syntax if futures become an explicit type?

await fn1(fn2())
1 Like

Maybe a freestanding macro can help? For example:

await #parallelAsync {
    await fn1()
    await fn2()
}

And if some data is needed (such as the urls for downloading images):

await #parallelAsync(urls) { url in 
    await session.image(for: url)
}

The macro should be able to automatically generate code using TaskGroup.

FWIW, here are the results of the two quick polls I ran with some notes. It's a small sample, but this is what I got.

Poll 1: await for tuples

A lot of people seem to expect it to work in parallel. I'd assume many answered "sequential" because they know how it works.

When you declare a single tuple with a single await, you express that you need these N values to proceed forward in the function – the order does not matter to you. The tuples do have order, of course, but it's still a composed single value, and there is only one await, so you could read it both ways.

let v = try await (fn1(), fn2())

It would be a fantastic syntax to use for parallel execution, as it's a very common thing you need in app development. There is already a clear way to express serial execution:

let v1 = try await fn1()
let v2 = try await fn2()

Poll 2: await and subexpressions

Most people gravitated towards the first option:

await fn2(await fn1())

I agree that it's more clear than leaving one of the await-s out. And it feels perfectly fine if you add throwing functions into the mix:

try await fn2(await fn1())

As for the second option, I wonder if people who chose it weren't sure if you could use await in a subexpression? It seems plausible.

Since the async let version today results in parallel execution (and indeed that's the whole point of async let), IMO it's not really feasible to change the meaning.

Also, for the await (fn1(), fn2()) it's troubling to me that allowing this to execute in parallel would cause programs which might have relied on the ordering guarantee to suddenly admit subtle bugs. Going from unordered to ordered is 'easier' (so long as the ordering you pick was at least possible in the old regime).

I do think it's perhaps worth considering whether allowing await to swallow multiple 'sibling' suspension points is a good idea. In the await f2(f1()) case, the result is 'obvious' insofar as f2 clearly can't execute before f1 finishes. But maybe we should warn on await (f1(), f2()) and encourage users to write it as (await f1(), await f2())? Not sure this would be worth the code churn.

6 Likes

Since the async let version today results in parallel execution (and indeed that's the whole point of async let ), IMO it's not really feasible to change the meaning.

I removed this section from the previous message on async let. It was incorrect – it is parallel. I'm getting confused between these variants.

would cause programs which might have relied on the ordering guarantee to suddenly admit subtle bugs

Yes, it doesn't seem feasible to change it. I can also speak from experience: while updating some tests in one of my frameworks, I tried at least twice to write this:

let v = try await (fn1(), fn2())

Only to then realize that the tests fail because they expect parallel execution, and that requires async let:

async let t1 = fn1()
async let t2 = fn2()
let v = try await (t1, t2)

The second option looks more like serial execution to me visually than the first one, but YMMV. It makes sense mainly if you know the implementation details (async let as syntax sugar for TaskGroup and the first example with await affecting subexpressions – both things that you likely don't know when you start using Swift Concurrency).

Not sure this would be worth the code churn.

It's probably not, but, at the same time, is it OK to allow this syntax to work (fn1 is async)?

let v = await (
    fn1(),
    await fn2()
)

For the sake of the experiment, it would also be interesting to check the base variant with something simple, like:

let image1 = try await pipeline.image(for: request1)
let image2 = try await pipeline.image(for: request2)
return (image1, image2)

and see what people expect here, because I have a feeling there is a bit of confusion about the term "parallel".


Think unwrapping is a .flatMap. :slightly_smiling_face:

Just a small demo of implementing this first case using a freestanding macro:


The warning about discarding the return value is silenced using a dummy result builder.

Though it currently does not compile due to a data race error when sending the group variable :joy:

1 Like

Actually, is the macro even needed at all here? Could this whole thing be a result builder?

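@resultBuilder
enum TaskGroupBuilder<T: Sendable> {
  // Wraps each statement of the builder body in a closure instead of running it.
  static func buildExpression(
    _ expr: @escaping @autoclosure @Sendable () async -> T
  ) -> (@Sendable () async -> T) {
    return expr
  }

  // Collects the wrapped statements; a result builder needs at least one buildBlock.
  static func buildBlock(
    _ exprs: (@Sendable () async -> T)...
  ) -> [@Sendable () async -> T] {
    return exprs
  }
}

func runInParallel<T: Sendable>(
  @TaskGroupBuilder<T> _ body: () -> [@Sendable () async -> T]
) async -> some Sequence<T> {
  return await withTaskGroup(of: T.self) { group in
    for expr in body() {
      group.addTask(operation: expr)
    }
    // Results are collected in completion order, not in the order they were added.
    return await group.reduce(into: [T]()) { $0.append($1) }
  }
}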
1 Like

That's a good point! I forgot we can use @autoclosure here, but maybe a macro can still be useful if we want the results to be something like a tuple instead of an array.

The reason I made the return type some Sequence instead of Array was so as not to imply ordering, since TaskGroup gives you the results in the order they were completed, not the order they were added. Allowing tuples also comes with the challenge of supporting heterogeneous result types, which TaskGroups don't.

1 Like