[Concurrency] Structured concurrency

So, if I'm understanding this correctly, this will basically delegate the real implementation of async/await to a 'Task' protocol?

This was all I was asking for on the other threads about concurrency: that it should not be a "black box" and that the implementation could be customized for different realities.

(In some cases this would mean avoiding the actor interface completely.)

So if we have basic building blocks, like a Task protocol, that ultimately define what really happens with async/await, and higher-level constructions like an actor stack are built on top of them without being the only high-level concurrency paradigm forced on everybody, then that is what I was actually asking for.

I'm not sure what you're referring to, but in the proposal there is no Task protocol, and the customization you can do is limited to changing how tasks are scheduled on an executor.

Yes, I didn't see any Task protocol; I just assumed something like it existed, based on the extensions to it in the example.

But what about the 'executor', or the task scheduler? Will it be possible to define the underlying implementation and swap out the standard one?

Edit: Just to give more context about this. I have an actual implementation that already happens to be multi-threaded, with thread pools and the like. Sometimes it also has to stick with the threading paradigm of a C++ runtime that already has its own thing (mostly WebKit); because of that, I need some sort of control over when I dispatch to the "media thread", "compositor thread", or "main thread", though work can also be launched on any thread that is sitting idle in the thread pool.

With what I have here, for instance, you can use the Swift side of threading and sometimes even stick to threads that are controlled on the C++ side.

The C++ and Swift implementations are basically the same. So I would not want to have to tell people developing for it that they cannot use Swift concurrency (and async/await would be awesome to use) because there's no way for me to adapt to whatever is being decided here, since the thing is "burned" into the language. (That's what I'm calling a black box here.)

And it would be unfortunate, because the way mine is already designed gives a lot of control and the possibility of a multi-threaded environment with low-latency scheduling, more optimized for media, 3D and the like (where thread affinity and zero copy matter a lot).

By the way, the whole thread pool is shared. So I imagine that in my case there would be threads launched from more than one thread pool, with a process having, I don't know, 20 threads running because they come from different sources and implementations.

So in my case I would just have to say that the native concurrency model of Swift cannot be used, and I think that would be a pity, given that I think there's a way to do this with a little more control over how the real implementation rolls out.

PS: If this is the case, don't forget to create a compiler directive to disable threading, so this thing won't be sitting there with idle threads eating resources that in my case won't be used.

The implementation is designed to just run on ordinary threads, there's no dark magic that can't work on top of an existing thread pool. We don't want a Swift-specific thread pool in general because we do understand that thread pools work better if you have a single pool making decisions holistically for the process. On Darwin, we'll be sitting on top of Dispatch's thread pool. On other platforms it's less certain what we'll do -- as a project we'll probably sit on top of Dispatch again, but it should be straightforward to switch to a different underlying pool, at worst by modifying a few places in code. In any case, if all you care about is using a different global thread pool, I don't think there's any inherent reason you won't be able to use Swift concurrency.

2 Likes

but it should be straightforward to switch to a different underlying pool, at worst by modifying a few places in code. In any case, if all you care about is using a different global thread pool, I don't think there's any inherent reason you won't be able to use Swift concurrency.

I'm glad to hear this.

It's just that I didn't find anything about the implementation to reason about. That's why I was guessing about 'Task', to understand more about the specifics of its inner implementation.

I don't think there's any inherent reason you won't be able to use Swift concurrency.

So can I read this as: it will be doable to change the underlying implementation if there's a need, and it will still be possible to use the concurrency primitives of the language?

1 Like

If by "implementation" you mean "what threads things will be run on", yes, I think we want that to be manageable.

1 Like

Does that include green threads?

Rather, if I want a Task2 library (or just Task2.runDetach), how should I go about implementing it? I don't see any entry point that I can utilize.

Maybe I could make a custom actor and run on that, but I'm still confused about which one is the more foundational of the two.

The standard Swift implementation generally follows the C function-call ABI with some minor platform-specific adjustments. Swift concurrency will on some level simply be splitting async functions into function fragments which individually still work as C ABI functions, again with some minor platform-specific adjustments. So any userland thread library that follows that ABI should be perfectly capable of hosting Swift async functions. That is the great benefit of using function-splitting rather than some alternative implementation technique like a completely custom thread model.

4 Likes
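Purely as an illustration of that function-splitting idea (the names below are invented, and this is not the actual ABI lowering, which the compiler performs on coroutine fragments rather than hand-written closures): an async function with one suspension point can be pictured as two ordinary functions, with "the rest of the work" carried along as a plain function value.

import Foundation

// A callback-based stand-in for real asynchronous work.
func download(completion: @escaping (Data) -> Void) {
    completion(Data("hello".utf8))
}

func decode(_ data: Data) -> String {
    String(decoding: data, as: UTF8.self)
}

// Imagine `func fetchAndDecode() async -> String` containing
// `let data = await download()` followed by `return decode(data)`.
// Conceptually it splits at the await into:

// Fragment 1: everything up to the suspension point.
func fetchAndDecode_part1(_ resume: @escaping (String) -> Void) {
    download { data in
        fetchAndDecode_part2(data, resume)   // "resuming" after the await
    }
}

// Fragment 2: everything after the suspension point.
func fetchAndDecode_part2(_ data: Data, _ resume: @escaping (String) -> Void) {
    resume(decode(data))
}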

Task.runDetached(fn) is fundamentally just creating a standalone task object and then submitting a partial task that starts running fn to some appropriate executor, depending on what fn does. (If fn obviously wants to start on some actor, the task will start there; if not, or Swift can't statically figure it out, it'll submit the task to a generic global thread pool.) There's nothing interesting there to customize. Customizing executors -- e.g. an actor so that it runs operations using its own serial queue implementation, or a dedicated thread, or something like that -- is much more likely to be what you want.
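As a sketch of what customizing an executor could look like: the hooks for this were still being designed at the time of this thread, so the spellings below follow the custom actor executors design that landed later (Swift 5.9), and the queue/actor names are just placeholders for something like a C++-owned media thread.

import Dispatch

// An executor that funnels all of an actor's work onto a queue you control.
final class QueueExecutor: SerialExecutor {
    private let queue: DispatchQueue

    init(label: String) {
        self.queue = DispatchQueue(label: label)
    }

    func enqueue(_ job: consuming ExecutorJob) {
        // Jobs are noncopyable; convert to an unowned reference to hop queues.
        let unownedJob = UnownedJob(job)
        let executor = asUnownedSerialExecutor()
        queue.async {
            unownedJob.runSynchronously(on: executor)
        }
    }

    func asUnownedSerialExecutor() -> UnownedSerialExecutor {
        UnownedSerialExecutor(ordinary: self)
    }
}

// A hypothetical actor pinned to that executor: its isolated methods run on
// the executor's queue instead of the default global pool.
actor MediaCoordinator {
    private nonisolated let executor = QueueExecutor(label: "media")

    nonisolated var unownedExecutor: UnownedSerialExecutor {
        executor.asUnownedSerialExecutor()
    }

    func schedule(frameCount: Int) -> Int {
        frameCount * 2   // runs on the "media" queue
    }
}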

I think our example in that section actually does not explain this very well.

Quoting that bit:

One of the variables for a given async let must be awaited at least once along all execution paths (that don't throw an error) before it goes out of scope.

Okay, I cleared up my confusion and also confirmed with @Douglas_Gregor that the implementation reflects it... So here's what it means:

  async let (one, two) = (1, 2)

So we're saying that this is effectively:

  async let (one, two) = { /*this is async "together"*/ (1, 2) }

The one and two are executed as one wrapping closure, and as such they complete atomically, together:

  async let (one, two) = (1, 2)
  await one
  return // two is considered awaited on

This has one special implication: when one of the initializers is throwing, every variable becomes throwing:

  async let (yay, nay) = ("yay", throw Boom())
  await try yay // yay itself does not throw, but since the initializer of the async let did, this does
  // nay is considered awaited on; the Boom would be thrown here
  return

The "at least once on non throwing path" is about this:

  if something {
    _ = await try yay
  } else {
    throw Boom()
    // ok, we're throwing and everything will be cancelled and discarded.
  }

So the throw Boom() implicitly cancels and discards all async let tasks.

// edit: simplified examples.
// edit 2: realized it's much simpler than I thought; Thanks @Douglas_Gregor for sanity checking with me
// edit 3: Amended the proposal: clarifications for the throwing async lets by ktoso · Pull Request #28 · DougGregor/swift-evolution · GitHub

2 Likes

Ok, I'm late to the party - way later than I wanted to be.

Brief comment on the above conversation: I'd be more in favor of let async than async let, because my mental model of async is more like syntactic sugar around a type.

Also regarding types: which type would the compiler infer for veggies, meat and oven? Something like Task<Vegetable> and so on? Well, ok. Now, what is the type of [veggies, meat]? Probably [Task<Ingredient>] (assuming that you once again apply some magic, which we don't have access to, to tell the compiler that tasks are covariant in their generic argument).

Finally, the compiler then somehow knows that await, when applied to arrays of tasks, should execute the tasks concurrently (I see no other point where the compiler could possibly infer that it's safe to concurrently execute tasks). Well, that's something! Implicitly, a conversion must have taken place between [Task<X>] and Task<[X]>.

Conclusion: if await should be applicable to any expression containing Tasks (which I guess is intended), the user would have to know that specifically for arrays this entails concurrent execution. Honestly, I'd be more in favour of an explicit conversion with, e.g., a zip function. There are reasons why Swift is very restrictive with implicit type conversions, so we should be here as well.

Now, let's have a look at the return type of Task.withNursery. Since it is a static method of Task, it is obviously free to return anything it likes, but I'd usually assume this would be a named initializer somehow. The nursery in the example consumes tasks that produce (Int, Vegetable) and the closure returns [Vegetable], so the whole thing - if it should be understood as a named initializer - would return a Task<[Vegetable]>.

But no! If we look at the return type of func chopVegetables(), we actually get async throws -> [Vegetable], i.e., Task.withNursery has to be async throws -> <ClosureReturnType>. Curiously, no await in front of the method call.

That leads to the following question: is it safe then to think of async (applied to funcs or let) simply as an alternative spelling for Task??? And is it possible that this whole await thing mostly serves as a means to "unwrap" the task (without passing visually nasty continuations with their own scope) so we can chain them just the way we chain optionals or throwing functions? If that should be our mental model, I would argue that this should be pointed out somewhere so we better understand what is going on. It might also help with the design and implementation of future primitives/combinators.

Edit: Regarding naming: Task may be a bit unfortunate because it can easily be confused with Process.

It modifies the entire declaration (the let), really -- the entire right-hand side becomes wrapped in an implicit async function, if you will. The transformation really is about the let declaration -- see also my previous post and the adjustment to the proposal, which goes deeper into this.

No. It is "plain old" types, yet they happen to get tainted with "has to be awaited before use".

This is weird at first, but once you get used to it you "get it" -- the entire reason this is so is to enforce structured concurrency. You cannot, without a Task.Group (the new name for nurseries, adjusted in the proposal already), spawn dynamic numbers of things. And you cannot, without a Task.Handle, just pass around not-yet-completed values -- you must await on them. As such, async let does not introduce any type changes, because then you could pass the value around into some other function, which is the precise thing structured concurrency wants to prevent. If you want to escape structured concurrency's limitations, you'd reach for a Task.Handle<Carrot>.
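To make that contrast concrete, here is a small sketch using the spellings from the draft (async let for the structured child task, Task.runDetached / Task.Handle for the escape hatch); the helper types are invented, the exact API was still in flux, and get() as the way to await the handle is an assumption about the draft spelling:

struct Carrot {}
struct Dressing {}
struct Salad { let carrot: Carrot; let dressing: Dressing }

func chop(_ carrot: Carrot) async -> Carrot { carrot }
func whisk(_ dressing: Dressing) async -> Dressing { dressing }

func makeSalad() async -> Salad {
    // Structured: a child task bound to this scope; the compiler forces it to
    // be awaited (or cancelled on a throwing path) before the scope exits.
    async let carrot = chop(Carrot())

    // Unstructured escape hatch: a detached task whose handle is an ordinary
    // value you can store or pass around; nothing forces you to await it.
    let dressingHandle = Task.runDetached { await whisk(Dressing()) }

    return Salad(carrot: await carrot, dressing: await dressingHandle.get())
}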

There is an await there; Task.withGroup (previously known as Task.withNursery) is indeed async. Please use the full document linked above as the source of truth, as we have been fixing multiple such typos in the proposal.

// @John_McCall @Douglas_Gregor do you think it would make sense to collapse (maybe "fold" or remove?) the proposal texts from the "first" post in all the proposal threads so that people are directed to the proper source of truth - the ones on github?

Yes and no. Yes, they're very related to tasks; i.e. an async function runs within a task. However, an async let is a very specific construct: it is the spelling to create a child task that is also forced to be awaited on before the current scope exits (or is cancelled when the scope throws). It is not the same as just declaring some Task.Handle (which is what Task.runDetached does), as those do not get this structured-concurrency treatment (no one will force you to await on a Task.Handle, while on an async let the compiler will force you to await it). This is explained in the writeup, I believe.

Please refer to the async/await proposal which explains precisely those parts in more detail: https://github.com/DougGregor/swift-evolution/blob/async-await/proposals/nnnn-async-await.md#motivation-completion-handlers-are-suboptimal

// Note to self, add more cross links to the proposals

2 Likes

Well, that sounds very much like a new type to me ;)

Ok, then this would be another type where the semantic appears to be "execute right away, I will need you later" and "not using the return value is an error" - a relevant type (Substructural type system - Wikipedia). The relevant type part sounds ok to me, but having tasks start simply by declaring "I need the return value"? Not sure if I like that.

But when I do a return await foo(), I kind of do pass an async value around, don't I? Because the function I'm writing will have to be async again. So I kind of do escape something async here, since whenever await unwraps stuff, it immediately turns the whole scope into async.
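(A minimal illustration of that propagation, with a stub foo:)

func foo() async -> Int { 42 }   // stub

func caller() async -> Int {     // has to be async itself...
    await foo()                  // ...because it awaits foo()
}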

I suppose the obvious follow-up question is: if there is a desired executor, which is not obvious to the compiler, how may one go about making it so?

If I have

async let (foo, bar) = (takesALongTime(), failsQuickly())
await try foo

will the await need to wait for takesALongTime() to finish, or will the exception from failsQuickly() make it throw and cancel takesALongTime()?

2 Likes

That's how some languages, e.g. JavaScript, handle async functions. In JavaScript every async function is explicitly converted into a Promise instance (and the opposite is also true, you can mark any Promise instance with await). That mental model doesn't apply here. The way these proposals want you to think of async functions is in terms of how throwing functions work. That's why we have async near throws in the function type signature and not near the result type:

// proposed concurrency
func foo() async throws -> Int { ... }

// with async types
func foo() throws -> async Int { ... }

With JavaScript's model in place, you would have explicit async types, i.e.

func asyncFuncReturningInt() -> async Int { ... }

let x: async Int = asyncFuncReturningInt()
let y: Int = await x

with async Int and Int being two different types. As you already noted, this approach has scalability downsides. You have to explicitly add covariance to collections of async types, to functions having async types in their type signature, to tuples having at least one async type and so on, if you want to be able to place a single await in front of an expression that has some async types involved in it.

Instead await mirrors try's behavior. You don't need to mark every possible throwing function involved in an expression with try (even though it's possible), you can just place a single try at the beginning of that expression. Similarly, throwing functions do not return Result<ResultType, Error> or similar wrapper types, they just return ResultType like their non-throwing counterparts.

// throwing functions
func t1() throws -> Int { 7 }
func t2(_ x: Int) throws -> Int { x + 1 }

// these are all equivalent
print(try t2(try t1()))
try print(t2(try t1()))
print(try t2(t1()))
try print(t2(t1()))

// asynchronous functions
func a1() async -> Int { 5 }
func a2(_ x: Int) async -> Int { x + 1 }

// these are all equivalent
print(await a2(await a1()))
await print(a2(await a1()))
print(await a2(a1()))
await print(a2(a1()))

I.e. a single await at the beginning and you're done.


@ktoso, will it be possible to explicitly pick the right overload by specifying its type? I'm running the November 4 snapshot and

func foo() -> Int { 2 }
func foo() async -> Int { 3 }

let bar = foo as () async -> Int  // no compiler error
let baz: () async -> Int = foo    // no compiler error

print(type(of: foo))  // prints () -> Int
print(type(of: bar))  // segmentation fault if run
print(type(of: baz))  // segmentation fault if run

func run() async {
    print(type(of: foo))  // segmentation fault if run

    await print([foo(), (foo as () -> Int)()])  // works, prints [3, 2]
    await print([foo(), (foo as () async -> Int)()])  // compiler error
    await print([foo(), bar()])                       // compiler error
    await print([foo(), baz()])                       // compiler error
}

await run()

1 Like

Overriding Actor.execute should be enough to achieve that.

Agreed. Though the point I was getting at is that I'm not 100% sure we'll nail the correct structure on the first try during the pitch+review period. Maybe the nursery needs to separate consumer/producer, or withNursery should return a Task.Handle, etc.

That's why I think we should incubate structured concurrency as a preview library. Which made me realize that we're missing a facility to convert an async function into a callback (for Task.Handle). Other than that, most of the Task APIs can be done on top of the actor APIs.
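One possible shape for such an adapter, sketched against the draft's Task.runDetached (the names here are hypothetical, not proposed API):

// Run an async function and deliver its result to a plain callback.
func withCallback<T>(
    _ operation: @escaping () async -> T,
    completion: @escaping (T) -> Void
) {
    Task.runDetached {
        completion(await operation())
    }
}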

Calling actor.run { ... } as the first thing in the task should make it obvious enough.

Sorry, what is a callback but a function value? What are you imagining this would look like?

We really want to discourage people from doing higher-order manipulations on task handles as is common with a lot of functionally-oriented futures libraries. If you're fetching something and then doing something with the result, you should be structuring those into the same task.

2 Likes
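To illustrate the difference (fetchImage and applyFilter are hypothetical helpers, not API from the proposal): rather than composing on a handle the way futures libraries do (handle.map { applyFilter($0) } and the like), the dependent step just lives in the same task.

import Foundation

struct Image {}

func fetchImage(_ url: URL) async -> Image { Image() }   // hypothetical helper
func applyFilter(_ image: Image) -> Image { image }      // hypothetical helper

// Fetch and transform as two steps of one task, instead of chaining
// combinators on a detached handle.
func filteredImage(from url: URL) async -> Image {
    let image = await fetchImage(url)
    return applyFilter(image)
}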