[Pitch #3] Async Let

Oh, I meant a declaration on the surrounding function, not the awaited one

To put it in more specific terms, it (ab)uses syntax that’s designed for declarative/value-oriented uses and applies it to procedural/side-effect-oriented purposes; in this way, it’s similar to using map instead of forEach (or, as briefly mentioned in a popular ongoing thread, instead of if let).

On one level, this can be seen as an aesthetic objection – map and async let are just operations, and you can use them in any way that rests on their operational semantics.

But at another level, it runs against their more abstract semantics: map expresses a meaning of “transform the value[s] in this container”, and async let expresses a meaning of “I will want the result of this async computation later”.

Thinking about code at this higher level of abstraction allows for accurate reasoning without the pedantic precision of thinking about the full operational semantics of a building block, as long as those building blocks are used in consistent ways. Finding print() in a map closure, or seeing that the result of async let is never used, breaks the abstract model and requires the reader to reanalyze the code from a completely different viewpoint.
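
To make the map/forEach comparison concrete, here is a small sketch. The function name priceReport and the sample data are made up for illustration; the point is only that the third form produces the same log as the second while hiding its intent behind a value-oriented call.

```swift
// Hypothetical helper contrasting value-oriented and side-effect-oriented use.
func priceReport() -> (doubled: [Int], log: [String], misleadingLog: [String]) {
    let prices = [3, 5, 8]

    // Value-oriented: the result of `map` is the whole point of the call.
    let doubled = prices.map { $0 * 2 }

    // Side-effect-oriented: `forEach` signals that no value comes back.
    var log: [String] = []
    prices.forEach { log.append("price: \($0)") }

    // The pattern objected to above: `map` used only for its side effects.
    // The array of `Void` it builds is thrown away immediately.
    var misleadingLog: [String] = []
    _ = prices.map { misleadingLog.append("price: \($0)") }

    return (doubled, log, misleadingLog)
}
```

Both logs end up identical, which is exactly why the objection is about what the reader has to reconstruct rather than about what the code does.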

async let _ is nonsensical, or paradoxical, because it exists only to violate the high-level meaning of async let. If you want a side-effecting child task to run in parallel with the main body of your function, there’s already a procedural pattern for that: task groups.
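
For reference, a minimal sketch of that task-group pattern, written against the API names that eventually shipped (withTaskGroup and addTask; the pitch at the time spelled the latter group.async). Metrics and loadAndReport are hypothetical names used only for illustration.

```swift
// Hypothetical actor standing in for "some side effect we want to happen".
actor Metrics {
    private(set) var pings = 0
    func ping() { pings += 1 }
}

// Hypothetical function: runs a side-effecting child task alongside its main body.
func loadAndReport(metrics: Metrics) async -> Int {
    await withTaskGroup(of: Void.self, returning: Int.self) { group in
        // Child task used purely for its side effect; there is no result to ignore.
        group.addTask { await metrics.ping() }

        // The main body proceeds concurrently with the child task.
        let total = (1...10).reduce(0, +)

        // The group waits for outstanding child tasks before the function returns.
        return total
    }
}
```

Here the child task has no result at all, so nothing in the code claims that a value will be wanted later.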

I’ll also note that the pitch’s motivation section is centred on statements like: “this example showcases the weaknesses of the TaskGroups very well: heterogeneous result processing and variable initialization become very boilerplate heavy” and “This dataflow pattern from child tasks to parent is very common, and we want to make it as lightweight and safe as possible.” This motivation does not support async let _.

My thinking was that the variable in the async let isn't the result of the expression on the right of the =, it's a placeholder for it (effectively a structured future). The result of that expression is only obtained when that variable is later awaited. In synchronous Swift, if you want to ignore the result of a function call, you have to either do it explicitly (_ = functionCall()) or mark the function with @discardableResult. With async let, if you don't ever await the variable, you're effectively ignoring the return value, so it would be consistent to treat it in the same way.

func syncFoo() -> Int { 1 }
@discardableResult func discardableSyncFoo() -> Int { 1 }
func asyncFoo() async -> Int { 1 }
@discardableResult func discardableAsyncFoo() async -> Int { 1 }

_ = syncFoo() // OK
syncFoo() // warning
discardableSyncFoo() // OK
async let bar = asyncFoo()
async let baz = discardableAsyncFoo()
_ = await bar // Warning if this isn't done
_ = await baz // OK if this isn't done

I'm very happy to see this moving forward, since IMO it is one of the concurrency parts that will have the most impact on our code. I love the async let naming since it makes it very easy to understand and refactor. You can go from sequential code to parallel execution just by moving from = await to async let =; the symmetry is very nice.

One thing I'm not clear on is the automatic await at the end of scope. I understand the need for it to maintain structured concurrency (which I'm a big fan of), but the text confuses me a bit.

the scope in which it was declared exits, will be awaited on implicitly.

func go() async -> String {
  async let f = fast() // 300ms
  async let s = slow() // 3seconds
  return "nevermind..."
  // implicitly: cancels f
  // implicitly: cancels s
  // implicitly: await f
  // implicitly: await s
}

the go() function will always take at least 3 seconds to execute

The bit I don't get is why, if the task is being cancelled, we are still awaiting it, and even for the entire 3s? Is it because cancellation is cooperative and we assume that in this example slow never checks for isCancelled? If that's the case, does it mean that if slow checked for isCancelled (let's say halfway through) the total time would be reduced (to 1.5s)?
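
To illustrate the cooperative reading of that question, here is a sketch of a slow() that does check for cancellation. It assumes the behaviour described in the pitch (an async let that is not awaited is cancelled and then awaited at scope exit) and uses the shipped Task.isCancelled and Task.sleep(nanoseconds:) APIs. EventLog, the wantResult parameter, and the step sizes are made up for illustration.

```swift
// Hypothetical recorder so we can observe what the child task did.
actor EventLog {
    private(set) var events: [String] = []
    func record(_ event: String) { events.append(event) }
}

// A cooperative version of `slow()`: about 3 s of work in 10 ms steps,
// checking for cancellation before each step.
func slow(log: EventLog) async -> String {
    for _ in 0..<300 {
        if Task.isCancelled {
            await log.record("cancelled")
            return "partial"
        }
        // The sleep itself also ends early if the task is cancelled.
        try? await Task.sleep(nanoseconds: 10_000_000)
    }
    await log.record("finished")
    return "full"
}

func go(wantResult: Bool, log: EventLog) async -> String {
    async let s = slow(log: log)
    if wantResult {
        return await s
    }
    return "nevermind..."
    // implicitly: cancels s, then awaits s
}
```

Because the implicit await still happens, go(wantResult: false, ...) only returns once slow() has noticed the cancellation and finished. How quickly that is depends entirely on how often slow() checks.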

About the explicit await because of throwing, sadly I don't have a clear opinion. I think it is fine as it is, but it is hard to know without experience, and the examples make a good point.

I also want to note that I'm very much in favour of the future direction "Await in closure capture list". I agree it is not required for this proposal, but if we start seeing this pattern in the wild it will get very annoying very quickly.

My main concern is that it is not possible to refactor a function which uses async let and break it down into smaller functions.

It is also on purpose, and unlike Task.Handles and futures that it is not possible to pass a "still being computed" value to another function.

It is not 100% clear what the mentioned purpose is here. I understand the problem with escaping futures, but the problem there is the escaping, not the futures. Swift already has escape analysis for closures. How hard would it be to extend it to other types?

Actually, it is not impossible to refactor strictly speaking. You can use closures as a workaround:

func makeDinner() async throws -> Meal {
  async let veggies = chopVegetables()
  async let meat = marinateMeat()
  return try await cookDish(veggies: { await veggies }, meat: { await meat })
}

func cookDish(veggies: () async -> [Vegetable], meat: () async -> Meat) async throws -> Meal {
  async let oven = preheatOven(temperature: 350)
  let dish = Dish(ingredients: await [veggies(), meat()])
  return try await oven.cook(dish, duration: .hours(3))
}

But that's ugly. I'd like to have non-escaping futures as a first class-citizen:

func makeDinner() async throws -> Meal {
  let veggies = async { await chopVegetables() }
  let meat = async { await marinateMeat() }
  return try await cookDish(veggies: veggies, meat: meat)
}

/// async throws [Vegetable] is a type and it is non-escaping
func cookDish(veggies: async throws [Vegetable], meat: async Meat) -> async throws Meal {
    async let oven = preheatOven(temperature: 350)
    let dish = Dish(ingredients: await [try veggies, meat])
    return try await oven.cook(dish, duration: .hours(3))
}

I’m confused about this range of questions too.

Quincey, I took “detached” to mean “not a child of the task group at hand,” i.e. a detached task can outlive its calling context, regardless of which executor it runs on. (Proposal authors, please correct me if I’m wrong on any of this….) Under that definition, looking at Quincey’s list:

  1. group.async { }: task is a child of group, thus not detached
  2. async let: child, because task cannot outlive the enclosing scope
  3. asyncDetached let: I think this is what’s not in the pitch at hand? Presumably this syntax would mean the created task can outlive the enclosing scope. But that seems contrary to the philosophy of the proposal; structured concurrency is the norm…and what does this buy us that 5 doesn’t?
  4. async {}: “neither a child task, nor a detached task” Still confused about this one, despite @ktoso’s patient attempt to explain in the conversation linked there.
  5. asyncDetached {}: clearly not a child; that’s the point

To Quincey’s larger point, there is a long list of questions every time we initiate concurrency:

  • Am I spawning something in parallel vs. queueing a task’s parts in the current (possibly entirely sequential) executor?
  • Am I jumping outside of actor isolation in some dangerous way?
  • Am I creating a child of the currently executing task?
  • Am I inheriting priority?
  • etc

I at least don’t yet have a clear big-picture mental model of how to navigate these questions. It would be helpful to update the big picture in the Concurrency Roadmap to capture all the progress in these individual proposals.

Sorry, on this point, I was a bit unclear on what I was objecting to. I think "detached" as originally pitched was intended to mean "not a child task", in contrast to Task groups. But we've now ended up with this:

That is, both async { } and asyncDetached { } in the current pitch do not spawn child tasks. Therefore, to a non-expert user of Swift, the word "detached" is likely to be understood as the actual difference between these two constructs: which executor the spawned non-child tasks run on.

I think spawn describes much better what's happening and prevents muddying the meaning of 'async'. May I propose the following:

Child tasks

Everywhere a child task is created, we use spawn to create it:

// Spawn and get the result later (currently 'async let'):
let veggies = spawn chopVegetables()

// Just spawn some work (fire-and-forget):
spawn {
  doSomething()
}

// And group of course:
group.spawn {
  CookingTask.veggies(await chopVegetables())
}

Non-child tasks

For all non-child tasks (tasks that can outlive the scope) we can use dispatch:

// With inheritance of the context (currently 'async {...}'):
dispatch {
  doSomething()
}

// Without inheritance of the context (currently 'detach {...}'):
dispatch(.detached) {
  doSomething()
}

Canceling child tasks on return

Using spawn without being able to await the result (fire-and-forget) is a problem, since the work will just be canceled at the end of the scope. We could say that just these kinds of spawns will not be canceled, but I wonder if the following will provide a safer and more predictable system in general:

Child tasks will only be automatically canceled if the scope exits with an error

This makes sure that all work a function has started is always executed before it returns normally. I can imagine that most functions don't require cancellation for the happy path. If that is true, it provides a nice consistent safe default. For the cases that do need it, we could use the following:

let veggies = spawn(autoCancel) chopVegetables()

The current semantics is that child tasks do not "inherit" the actor in which they were created. Rather, they go onto the global executor so they can execute concurrently with the body that created them (here, it's doStuff). This is the default because the point of async let is to introduce concurrency.

One important thing your example is missing is the declaration of doOtherStuff(). If it's also part of the @MainActor, e.g., you have code like this:

@MainActor func doOtherStuff() { ... }

then that code will end up running on the main actor anyway. If it's not part of the main actor:

func doOtherStuff() { ... } // not part of an actor at all!

then it runs concurrently.

One important thing to internalize about the actor model is that a declaration knows what actor it needs to run on, and Swift makes sure that happens. The place where you initiate a call to the declaration is far less important.
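
A small sketch of that point, under the assumption that async let child tasks start off the actor as described above. Screen, refreshScreen, and crunchNumbers are hypothetical names; the explicit await on the first initializer is optional and is written out only to make the hop visible.

```swift
// Hypothetical main-actor state, standing in for UI state.
@MainActor
enum Screen {
    static var refreshCount = 0
}

// Declared on the main actor: it runs there wherever the call is initiated.
@MainActor
func refreshScreen() -> Int {
    Screen.refreshCount += 1
    return Screen.refreshCount
}

// Not part of any actor: free to run concurrently with the caller.
func crunchNumbers() -> Int {
    (1...100).reduce(0, +)
}

@MainActor
func doStuff() async -> (refreshes: Int, sum: Int) {
    // The child task has to get back onto the main actor to run this call.
    async let refreshes = await refreshScreen()
    // This child task has no reason to touch the main actor at all.
    async let sum = crunchNumbers()
    return await (refreshes: refreshes, sum: sum)
}
```

Both calls are initiated from the same place, yet only the one whose declaration says @MainActor ends up serialized on the main actor.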

async let doesn't inherit actor context because doing so would introduce unnecessary serialization on the actor, and the actor model already ensures that you'll come back to the actor when needed.

Here's another instance of this same question/sentiment:

async let is the deliberate choice to introduce concurrency. Immediately eliminating that concurrency by having the child task go back to the main actor---even if you don't need it---would undo the benefits. We'll require a hop back to the main actor when you use anything that needs to be on the main actor.

FWIW, for (instance) actors this feature was part of the SE-0313 review, but has been subsetted out of that proposal due to concerns about how the feature behaved. It could come back later, of course.

Doug

I think the unused-variable warning suffices for this. We probably don't need to do anything special.

That's an interesting point! I see that Konrad might disagree with me, but it seems to me like early cancellation is desirable, i.e., we should notify the child task that it's been cancelled the moment we know that we don't care about the result, so it can finish up quickly. But the actual waiting-until-completion should happen at the end of the scope. Together, this reduces the likelihood of us waiting on work that we know is unnecessary.

I think this says that the example from

would be immediately cancelled. Personally, I'm fine with requiring that an async let have at least one named variable in it, and awaiting that variable is the way you specify that you care that the task run to completion without getting cancelled. My thoughts here are heavily influenced by the above notion that our defaults should skew toward not wasting work in a concurrent context.

Doug

Two comments about this:

  • First, async let puts the modifier onto the let declaration because it's the declared variables themselves that are special and require await. Putting the indicator to the right of the = doesn't mean the same thing. I think you'd need to keep spawn let in this naming.
  • Second, AFAICT the middle one you're adding is a new notion of "child task" that is somewhere between async let and group.async. Presumably it runs until the end of the current scope, but is only implicitly cancelled if the scope is exited via a thrown error. I think we'd need to talk through some very strong use cases before adding a 5th construct here.

"dispatch" is interesting. As a verb, it does have the right connotations with "go do this elsewhere", which is right for a non-child task. I'm having a hard time deciding whether it lining up with the library name "Dispatch" is brilliant or a liability ;)

We opted not to use the .detached argument, and instead went for the compound name asyncDetached, because the "detached" version is a separate function overload (with different actor-isolation behavior on the closure).

Doug

I'm not totally against this notion, but it's a significant expansion of the async let feature that isn't strictly needed, and I'd rather we separate out the discussion unless the proposed async let somehow doesn't work with this as a future direction.

Doug

You've made this response to me in the past, but in the current discussion my immediate concern is the inconsistency.

Despite your argument (in the quote), you were motivated to introduce async { } at this late stage of the concurrency design, presumably because you were convinced there's a real need for task spawning that does inherit actor context.

In that case, I don't see why there isn't an equivalent need for a context-inheriting variant of async let. Your argument that it isn't necessary is rather directly subverted by the addition of async { } to the design.

Asynchronous code should in most cases be asynchronous because of things like I/O. In those cases you don't want to thread-swap unless you have to. So consider the premise here:

If I call an async function that is I/O bound, I want it to use the thread I'm on for as long as possible because that's the most efficient way for it to run. So this function:

func getStuffFromDisk() async -> Stuff;

should be callable from the same thread until the point at which it needs to suspend to wait for the disk. If it so happens that the data is already cached and immediately available then it may not need to block at all and can complete synchronously without ever thread-swapping. All of that applies even if I'm on the main thread to start with.

I feel like the concept of an asynchronous function being itself tied to a particular executor by declaration is putting the cart before the horse. The work should happen where the caller is by default. I don't have a problem with overriding that (and I suppose Actors always do), but if you can't have an asynchronous function that inherits the calling context by default then you're going to make running some async code unnecessarily inefficient.

I chose let x = spawn ..., because I understood that spawn let was considered and rejected. To be honest, I'm perfectly fine with:

spawn let veggies = chopVegetables()
spawn { doSomething() }
spawn let foo = { doSomething(); return 10 }

Edit:
Thinking some more about it, the last one of the three doesn't work without adding some parentheses to specifically call the closure, so perhaps let x = spawn ... provides a more consistent story:

let veggies = spawn chopVegetables()
spawn { doSomething() }
let foo = spawn { doSomething(); return 10 }

I think we can see spawn as a special case that initializes the variable later. Kind of similar to an uninitialized variable.
/Edit

I think this is the fire-and-forget option @ktoso was looking for and named send earlier. I'm not sure though.

I think it could make things easy to teach: "Use spawn, unless you have a reason to break with structured concurrency". However, perhaps in practice dispatch {...} is more often the better choice?

Personally, I don't have much of a history with "Dispatch", so it's hard for me to say, but I do understand the issue.

I see. I guess dispatchDetached {...} works as well.

Overall, pretty nice improvement.

Now, we finally unify
Scoped - child task

  • TaskGroup.async, async let,

Attached - sibling task

  • async {},

Detached - separated task

  • asyncDetached {}

into a consistent async-style family. Well done~

As for the async let _ = ... "fire and forget" task: async (fire a new task) and _ (forget the result) look meaningful and reasonable, at least semantically. It may be a rare use case, but it has unique value in some situations.

Reading this makes me realize "child task" would be better called "scoped task". It communicates the meaning much more effectively. To me "child" doesn't really mean "scoped" except in the discussion we have here.

To me, "child" just means the task has a parent, that it's not a "root" task (whatever a root task is). I might be working with trees too much.

I understand this is now all but set in stone, but still it would be awesome to use "dispatch" to mean "non-scoped task", like:

I think "dispatch" is superior to "async" for this, but it's hard to beat "async let" as "an async way to introduce a new value"...

I thought about it some more: I don’t mind these semantics. I guess we could live with it. It might give some weird behaviors while developing a function that’s not “complete” yet (I started tasks but didn’t write code using them yet), but one can argue it indeed is predictable then.

So yeah this is ok I think - let's cancel as soon as variable is not going to be used anymore.

Hi folks,
the thread has a lot of names for tasks being thrown around, and perhaps not everyone is up to date on what the implementation details and implications of some of those names are.

I'll try to clarify the general terms and capabilities, without going into naming of operations:

  • child tasks - can only be created in a context that is already an async function (has a task), and their lifetime must be enforced by a group or by async let. Those constructs create child tasks, and they are the only APIs available to create them. What this means:
    • a child task can only be created from a context that is in a task already - i.e. it must be in an async function.
    • the current task becomes the parent task of the new task
    • (important) if the parent task is cancelled, all child tasks are also cancelled, this is a key difference between child tasks and any other task.
    • child tasks must not and cannot (it will be enforced by the APIs syntactically) out-live their parent.
    • this allows them to be efficiently allocated in the parent's local allocator
  • un-structured tasks - specifically, async{} and detached tasks
    • can out-live the parent task
    • do not keep any references to parent; thus, cannot participate in cancellation propagation from parent
    • cannot be allocated by parent and thus must use the global allocation mechanism (slightly heavier)
    • may or may not "inherit" values but must do so by copying them (slightly heavier than a child task)
    • async{} tasks
      • inherit values, priority, execution context
    • detached tasks
      • do not inherit anything, are really detached from any context

So those are really the only categories of tasks we have.

This is mostly about their "structure": structured or not. The discussions about which executor should be the default where, and how to customize it, do not really affect the division of task types in that sense.
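
The cancellation difference between the two categories can be sketched as follows. This uses the spellings that later shipped (Task { } for what this thread calls async { }); observedCancellation and demo are hypothetical names, and the 200 ms window is arbitrary.

```swift
// Polls for up to ~200 ms and reports whether this task saw cancellation.
func observedCancellation() async -> Bool {
    for _ in 0..<20 {
        if Task.isCancelled { return true }
        try? await Task.sleep(nanoseconds: 10_000_000)
    }
    return false
}

func demo() async -> (child: Bool, unstructured: Bool) {
    let parent = Task<(Bool, Task<Bool, Never>), Never> {
        // Unstructured task: inherits context, but is not a child of `parent`.
        let unstructured = Task { await observedCancellation() }

        // Child task: cancellation of `parent` propagates to it.
        async let child = observedCancellation()

        return (await child, unstructured)
    }

    // Cancel the parent; only the child task is expected to notice.
    parent.cancel()

    let (childSaw, unstructured) = await parent.value
    let unstructuredSaw = await unstructured.value
    return (child: childSaw, unstructured: unstructuredSaw)
}
```

Under these assumptions the child reports that it was cancelled, while the unstructured task runs its full window and reports that it was not.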

--

Specifically, I don't think there is any "attached" counterpart to detached, or rather, that is what child tasks are -- the complete opposite of a detached task.

Please do not confuse this with saying that there cannot be a child task that is on the same executor as the enclosing context. But it's still a child task. I think that with custom executors we'd likely solve it then.

I would not want to call the async{} created task an attached task; that sends the wrong message about its semantics. Notably, it is not attached to the parent: you cannot implicitly cancel it by just cancelling the task it was spawned from -- you could install a cancellation handler to do this, though.

I don't think it's right to invent a new "attached" notion, or rather "attached", if anything, would be a child task. I think calling async{} an "attached task" somewhat over-promises what it actually is, so let's not keep using that term. It is not attached because, while it does inherit some values, it does so very differently from child tasks: it copies values, can outlive its parent, and is not attached to cancellation of the parent or anything else really. It is like detach but "copy over useful properties and inherit values etc".
