[Pitch #2] Structured Concurrency

That’s more of a question about the semantics of the task locals pitch, but let’s quickly address it...

It isn’t about the task, it is about the specific fields and their storage.

As I mentioned, reading the cancellation status is safe since it’s just an atomic load of an atomic status integer in the task. Other, more complex operations on the task may not be safe; that’s it.
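For illustration only, here is roughly that shape as a minimal sketch (it uses the swift-atomics package, and the names and layout are hypothetical, not the actual runtime representation):

import Atomics

// Sketch: cancellation kept as a single atomic flag on the task, so checking it
// from any thread is just one atomic load.
final class TaskStatusSketch {
  private let cancelled = ManagedAtomic<Bool>(false)

  // Safe to read from anywhere, including from outside the task.
  var isCancelled: Bool { cancelled.load(ordering: .acquiring) }

  func cancel() { cancelled.store(true, ordering: .releasing) }
}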

Access to task-locals from outside the task is semantically racy (if the task is executing and entering/exiting withLocal blocks, those mutate the bound-values stack, so which value you’d observe from outside the task is racy), as well as “weird” (why the hell would you do that), so we don’t allow it.

Sure, it could be made thread-safe (by using a Treiber stack, i.e. a simple lock-free stack), but we don’t have to, since logically we don’t want to allow external access anyway, so why would we? It does not make sense to read locals from arbitrary other tasks outside the owning task, so we can avoid synchronization there, making accesses to the locals cheaper. Perhaps we’ll do the lock-free stack anyway, but it does not make sense to me to allow “an arbitrary random task can read your task locals”, and we should not do this.
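To make the “bound values stack” point concrete, here is a hypothetical, self-contained sketch (the names are invented here and the real storage lives in the runtime; see the PR mentioned below). Only the owning task ever calls withLocal, which is why in-task reads need no synchronization, while reads from other tasks would race with the push/pop:

// Hypothetical sketch of a per-task bound-values stack.
struct TaskLocalStorageSketch {
  private var bindings: [(key: ObjectIdentifier, value: Any)] = []

  mutating func withLocal<Key, Value, R>(
    _ key: Key.Type, boundTo value: Value, body: () throws -> R
  ) rethrows -> R {
    bindings.append((ObjectIdentifier(key), value)) // push on entry
    defer { bindings.removeLast() }                 // pop on exit
    return try body()
  }

  func value<Key, Value>(for key: Key.Type, as type: Value.Type) -> Value? {
    // The innermost binding wins, so search from the top of the stack down.
    for binding in bindings.reversed() where binding.key == ObjectIdentifier(key) {
      return binding.value as? Value
    }
    return nil
  }
}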

For the implementation of how it’s stored, you can refer to the task locals PR, which contains a full implementation along with many docs explaining it.

1 Like

I also wanted to highlight another reason why, in the above snippet, I proposed that the APIs be mirrored as both static and instance members.

I may be wrong here... Let me know if my concern is moot because we would mark static func current() as @inline(__always), and that would actually be enough to help the compiler optimize this even cross-module? (I.e. I’m not sure whether the TLS (thread-local storage) access cost can be optimized away well or not; do you know, @Douglas_Gregor? My prior experience on the JVM and reading up on some Intel manuals had me a bit worried here.)

I think it may be beneficial to have the APIs mirrored on the Task instance and statically, because it means we can avoid hitting the thread-local in which a task would be stored in tight loops, e.g.:

func a() async { 
  while true { 
    if Task.isCancelled { return }
  }
}

would need to:

  • read the task thread-local (which assumes the ABI will be such that the current task is stored in a thread-local),
  • read the atomic status and check cancellation.

Since we’re always going to be “back” on the same task here, it would make sense to:


func a() async { 
  let task = /* await? */ Task.current()
  while true { 
    if task.isCancelled { return }
  }
}

which only does the atomic op, without having to go through the thread local...

So... Thread-locals can be very fast, but if they’re accessed super frequently, as cancellation checks in tight loops might be, they can become an issue (reference, at least for Intel chips). Thus I think isCancelled should exist as both a static func and an instance func: one for occasional use, and the other for small performance optimizations like “if in a tight loop, get the task once and then reuse it to check cancellation”.

Same as the usual advice:

You may ask if the function “get_value” cannot be inlined for whatever reason, is it possible to reduce the cost of accessing a thread-local variable? The answer is “yes”. Since in this example, the thread-local variable is read-only, you can assign the thread-local variable to a local variable outside the “for” loop, and then use the local variable inside the loop, as shown below. [from above intel docs link]

Though at the same time there’s:

TL;DR (linux writeup)

If you have statically linked code, you can use a thread local variable like another one: there is only one TLS block and its entire data is known at link-time, which allows the pre-calculation of all offsets.

If you are using dynamic modules, there will be a lookup each time you access the thread local variable, but you have some options in order to avoid it in speed-critical code.

So... given Swift and its compilation model, I’m not super sure whether I should be worried or not: does force-inlining solve the concern or not? :slight_smile:

It’s been a while since I read it, but my understanding of the task-local storage pitch was that semantically, task-locals are immutable and each withLocal block is a new nested task. If this is the case, observing the locals of another task would not be semantically racy, and the problems you describe would only arise in observing the state of an executor.

That would be too expensive, and the core team suggested we implement it in a different way. I agree, and it is already implemented more efficiently, by not creating new tasks (which are much heavier than just pushing one pointer onto a stack).

Regardless of whether it could be made to work even at a high performance cost (which imho disqualifies such a solution anyway; task locals MUST be really fast), it is semantically weird and nonsensical to allow other tasks to access another task’s local state. That’s the meaning of task-local state: no one else may access it.

I had a comment and a question about using variables declared with async let.

I understand why try and await are needed when the variables are used in the example code since they are potential suspension points.

func makeDinner() async throws -> Meal {
  async let veggies = chopVegetables()
  async let meat = marinateMeat()
  async let oven = preheatOven(temperature: 350)

  let dish = Dish(ingredients: await [try veggies, meat])
  return try await oven.cook(dish, duration: .hours(3))
}

But to my eye, this has the potential of cluttering the call site. In the provided example there is only one argument, but if multiple arguments were async let values, the readability of the call site would suffer due to the number of try and await keywords.

At least for me, the clarity of what is happening in the call becomes diluted by all the information about what is being waited for and whether those things can throw:

let dish = Dish(ingredients: await [try veggies, meat], cookware: await pan, secretIngredient: try await sauce)

From the Detailed design section about async let it appears that you can explicitly await using code like this:

async let veggies = chopVegetables()

try await veggies

My question is, once you have awaited the value in this way, are the try and await keywords required for subsequent uses, since the value has already been returned without error?

Would the following be valid code?

func makeDinner() async throws -> Meal {
  async let veggies = chopVegetables()
  async let meat = marinateMeat()
  async let oven = preheatOven(temperature: 350)

  await meat
  try await veggies

  let dish = Dish(ingredients: [veggies, meat])

  try await oven
  return oven.cook(dish, duration: .hours(3))
}

If so, then the call sites could be simplified if a developer so desired.

The more general question would be: once an async let variable has been awaited in any fashion, do subsequent uses require the await or try await keywords?

I may be reading something into the proposal that isn't there, since this doesn't seem to be explicitly mentioned.

2 Likes

I disagree. Very often asynchronous work is I/O bound, and what you want to do is initiate the work (start the I/O) and then do something else (perhaps starting additional asynchronous I/O) before waiting for the results. The concurrency comes from the fact that the "work" being performed happens outside of your process (perhaps even outside of the CPU entirely).

The current design means that if you're initiating that kind of work on a single-threaded executor then you are forced to either wait for each result immediately or use more complex syntax (which, as far as I can tell, has not even been proposed yet). That makes this syntax both harder to use and harder to learn.

I believe that I/O-bound asynchronous APIs are and should be more common than CPU-bound asynchronous APIs, which means this pattern of starting work that doesn't actually use any CPU time is going to be common. It should be easy to do that from the main thread. That, in my mind, is one of the huge benefits of async/await. Simplifying the complexity of UI code that starts and waits for asynchronous results is a huge win for async/await, but this design makes that win much smaller.

That's not what I said, though. Here's an example to illustrate:

func fAsync() async {...}
func fSync() {...}

func caller() async {
    async let t1 = fAsync() // okay
    async let t2 = fSync() // should be an error
}

None of that has to do with whether you used await in fAsync or on the same line as async let. The difference is that the expression in the first line could have await instead, whereas the expression in the second line could not. So if the expression itself has no asynchronous call (therefore it could not contain an await) then the assignment to an async let variable should be disallowed.

My argument is that the async let syntax should be reserved for calls to async functions, and that it shouldn't change in any way how that async function gets called (i.e., which executor it uses or which thread context it is initially called on). It should only allow for separation of the initial call from the suspension point. It should not be used to introduce multithreading. If you want to run some synchronous CPU-bound function concurrently then you should have to use an API for that. That API may use an async function to allow you to await it, but it should be an API or at least a syntax that has a clear scope.
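A sketch of the kind of explicit opt-in being argued for here, using the present-day Task.detached API purely as an illustration of "an API with a clear scope" (the pitch-era spellings differ, and the helper functions below are hypothetical):

// Hypothetical work functions for the sketch.
func fetchValue() async -> Int { 42 }                      // asynchronous (e.g. I/O-bound)
func crunchNumbers() -> Int { (1...1_000).reduce(0, +) }   // synchronous, CPU-bound

func caller() async {
  // Waiting for an asynchronous result: no new threading is introduced.
  async let a = fetchValue()

  // Offloading synchronous, CPU-bound work: an explicit API makes the intent
  // (and the executor change) visible at the call site.
  let b = Task.detached { crunchNumbers() }

  let x = await a
  let y = await b.value
  print(x + y)
}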

In my experience in both using async/await with UI code in C# and in helping other people to learn how to use async/await in C#, I believe strongly that blurring the lines between "waiting for asynchronous results" and "introducing multithreading" is going to make the feature both harder to use and harder to learn. It makes it difficult for people to form a clear mental model of what async/await actually does. It's much easier to understand and use when it consistently means "a new syntax for waiting for asynchronous results" and not "a new way to implement multithreaded code". We have a strong need for the former, but I'm not sure we have much need at all for the latter.

7 Likes

Note though that non-async functions are implicitly converted to async. Differentiating the two in this manner would have to be an exception to that rule.

1 Like

If you consider that an exception then we already have an exception in that this would be disallowed:

func f() {...}

func caller() async {
    await f() // error
}

You could wrap it like this to work around the error:

func caller() async {
    let fWrapped: () async -> Void = f
    await fWrapped() // okay
}

If you did that then the call to fWrapped (and thus f) would complete synchronously in the same stack as caller.

Likewise, if you did the same with async let:

func caller() async {
    let fWrapped: () async -> Void = f
    async let t = fWrapped()
    await t
}

Then the behavior should be the same: the call to fWrapped and thus f should complete synchronously in the same stack as caller.

You can place a single try await at the beginning if there are no async closures involved. It's one of the benefits of not having a Future-like type as a building block for asynchronicity:

let dish = try await Dish(ingredients: [veggies, meat], cookware: pan, secretIngredient: sauce)

At first sight it seems a clear and intuitive design. I have a few considerations, though:

  • await meat alone would give an unused-value warning, so you would be forced to write _ = await meat. The same applies to the other isolated try await expressions;
  • you can await a value without realizing it, e.g. by using await print(meat); would that count as awaited too?
  • if you move await meat to a different line, you may need to move the await somewhere else.

That would certainly be less cluttered. The code example in the proposal led me to believe the individual arguments would each need to be annotated. (Why wouldn't the spelling you mention almost always be preferred to interweaving the keywords with the arguments?)

The proposal includes the example:

{
  async let
    ok = "ok",
    (yay, nay) = ("yay", throw Nay())
  
  await ok
  try await yay
  // okay
}

which led me to believe that await ok or try await yay in the code example would not cause a warning. The proposal doesn't explicitly say one way or another.

I would imagine so, since it needs to be present to print it.

I was thinking of await meat sort of like guard let item = optionalItem. If I move a reference to meat or item before the appropriate line, I'd need to move where I await or unwrap.

But for me at the moment, it's less about whether it should or shouldn't work this way, and more about clarifying how these things are currently intended to behave.

If async let is used in a @UIActor context, would the spawned task still be run concurrently? Or does this depend on whether the initialiser is also annotated with @UIActor?

E.g.

func load1() async -> String {
  await loadUrl(textField1.text)
}

func load2() async -> String {
  await loadUrl(textField2.text)
}

@asyncHandler
func buttonPressed() async {
  async let res1 = load1()
  async let res2 = load2()
  textField3.text = await "\(res1) \(res2)"
}

I'm not sure if you're asking me or the pitch's authors. My understanding is that as currently written the answer is that it depends on how the function (or class?) is annotated.

My argument is that it should not make a difference. Calling load1 or load2 directly with await or by using async let should make no difference in terms of which call stack is initially used by the call (before the suspension point) or which executor is responsible for the task.

Your example shows a reason why that's so important: your functions both use UI objects and therefore must run on the UI thread. If changing their caller from await to async let changes the thread that they run on (even if the caller remains on the UI thread) then it's trivial to introduce a bug in your application. That would make this feature really painful to use. The default behavior of async/await should as much as possible be to make code like this Just Work. Refactoring to change where the await happens by using async let should not break things.
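To make the refactoring argument concrete, here is a minimal sketch reusing load1/load2/textField3 from the example above; the claim is that both forms should run the loads on the same (UI) executor, and only the placement of the suspension points should differ:

func buttonPressedSequentially() async {
  // Plain awaits, one after the other.
  let res1 = await load1()
  let res2 = await load2()
  textField3.text = "\(res1) \(res2)"
}

func buttonPressedConcurrently() async {
  // The "refactored" form: per the argument above, this should not change
  // which executor (or thread) load1/load2 initially run on.
  async let res1 = load1()
  async let res2 = load2()
  textField3.text = await "\(res1) \(res2)"
}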

4 Likes

In other words, does async let support a form of definite initialization?

(I've been curious about this as well.)

1 Like

About async let

Let's say we'd like to clear two stores concurrently.

@UIActor func logout() async {
    async let clearCache = App.currentAccount?.clearCacheStore() //line A
    async let clearData = App.currentAccount?.clearDataStore() //line B
    await [clearCache, clearData] //line C
    App.currentAccount = nil
}

async let makes the whole right-hand expression a child task. So App.currentAccount may be accessed on another thread, and the compiler will complain about possible data races here, right?


Solution A:

Original code with currentAccount annotated with @UIActor:

  • [line A, B] Two child tasks are formed.
  • [line A, B] The child tasks cannot complete yet, because they need to access currentAccount on the UI executor, so the child tasks are suspended.
  • [line C] The logout function gives up control and awaits the child tasks' completion. This gives the child tasks a chance to read currentAccount on the UI executor and finish their remaining operations.

Solution B:

Tweak the async let declaration:

@UIActor func logout() async {
    let currentAccount = App.currentAccount //line A
    async let clearCache = currentAccount?.clearCacheStore() //line B
    async let clearData = currentAccount?.clearDataStore() //line C
    await [clearCache, clearData] //line D
    App.currentAccount = nil
}
  • [line A] currentAccount can be accessed directly because we are already on the UI thread, even if currentAccount is annotated with @UIActor.
  • [line B, C] Two child tasks are formed and executed concurrently.
  • [line D] Await the child tasks' completion.

Solution B looks more efficient than Solution A; is that right?

This makes me a little uncomfortable, because a small difference in the async let declaration results in a totally different execution path.

Global functions could be quite confusing with old-school asynchrony. Something as innocent as queue hopping can already be surprising:

Task.unsafeCurrentTask // Task 1

DispatchQueue.main.async {
  Task.unsafeCurrentTask // nil??
}

The same goes for just about any closure that escapes, which would be common here. It can also be hard to spot, given that Swift's trailing-closure syntax is meant to mimic control-flow syntax.


I still think that having a checked Task instance would be the best approach. We need to freeze the task (i.e. suspend it) for the Task instance to be valid, so this likely goes hand-in-hand with task continuations (syntax TBD):

Task.isCancelled/local // 1
withContinuation { continuation, task in
  task.isCancelled/local // same as 1

  DispatchQueue.main.async {
    task.isCancelled/local // same as 1

    task.resume(...)

    task.isCancelled/local // error
  }

  task.isCancelled/local // race
}

I think this could work with the current implementation without performance compromise. I could be wrong, though. :thinking:


That should be a warning.

An await operand may also have no potential suspension points, which will result in a warning from the Swift compiler, following the precedent of try expressions:

An additional point I would like to raise in relation to async let (and presumably also task groups, considering that async let is further described as sugar for task groups) implicitly changing the executor is that this appears contrary to one of the intended use cases outlined in the Task Local Values pitch, namely that the executor of a task be determined by a task-local value. As described there, child tasks are to inherit their parent's task locals unless they are explicitly rebound to different values for that child task; therefore child tasks should also inherit the parent's executor unless it is explicitly rebound. The implicit executor-changing behaviour would violate this; not exactly an unreasonable thing, but more surprising than the alternative, in my opinion.
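For reference, the inheritance rule being relied on can be illustrated with today's @TaskLocal API (the pitch-era spelling differs, and the "preferred executor" value here is purely hypothetical); a child task sees the parent's binding unless it is explicitly rebound:

enum ExecutionContext {
  @TaskLocal static var executorName: String = "default"
}

func child() async -> String {
  ExecutionContext.executorName      // read from the child task
}

func run() async {
  async let inherited = child()
  print(await inherited)             // prints "main": inherited from the parent's binding
}

func parent() async {
  await ExecutionContext.$executorName.withValue("main") {
    await run()
  }
}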

It is unclear to me at what point the concurrency with a new child task actually starts. What is the default behavior, and how can it be customized, for example to start a child task on a specific queue?

My current vision is the following, but I’m not sure it is what the pitch says. Looking especially at the async let statement, the current task would let the child task start and suspend once before actually proceeding to the following statement.

In other words, when an async let statement runs, a child task is created and the current task is suspended until said child task first suspends itself. No parallelism by default.

It works well with URL requests and other I/O, which will suspend quite quickly and do not really benefit from hopping queues. (It might even be counterproductive with regard to run loops.)

I think it also works well for worker tasks, or for wanting a specific queue, with the help of some queue-hopping function (standard or custom).

// some IO that doesn’t need to hop queues
async let updatedContact = webService.getContact(id: id)
// some heavy work that needs to be in a specific queue
async let gatheredData = doInBackground {gatherLocalData(contactId: id)}
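
To probe the "when does the child actually start" question above, here is a minimal self-contained sketch (Task.sleep stands in for I/O); under the model described here, "child: started" would print before "parent: after async let", but the pitch may specify this differently:

func pretendIO(_ name: String) async -> String {
  print("\(name): started")                        // before or after the parent's print below?
  try? await Task.sleep(nanoseconds: 100_000_000)  // stand-in for an I/O suspension
  return name
}

func parentTask() async {
  async let value = pretendIO("child")
  print("parent: after async let")
  _ = await value
}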
1 Like

I see that there is a 3rd draft of the structured concurrency proposal posted about a week ago. Is there a discussion thread for it that I'm not seeing, or would you like to continue this thread, which has quiesced? I'll take a pass through it when I have time and want to send any feedback to the most appropriate place. Thx

-Chris

1 Like

We have more revisions coming based on the discussions in this thread, which should be ready in a proper 3rd revision within a week. It might be better to wait for that.

There’s a revised actors pitch that’s fresh, though.

3 Likes

Awesome, I'll wait then. I'm working my way through the actors proposal and will provide feedback tonight - it's a huge step forward, congrats!

2 Likes