SE-0304 (2nd review): Structured Concurrency

I'm also in favor of single verbs, specifically spawn (always a child task) and detach (never a child task), because that split is simpler to reason about and to teach:

  • A: okey how do I make some tasks?
  • B: you use spawn (covers group.spawn, the upcoming spawn let)
  • A: okey I know about spawn; what is detach?
  • B: Oh yeah don't detach, that's terrible, unless [...]

Whereas if both were "spawn" and "spawn, but with some caveats, try not to use it", the two become conflated in discussions, and I can see this happening:

  • A: so I spawned a task but it didn't get cancelled, why?
  • B: how do you spawn?
  • A: spawnDetached, only thing the compiler would let me use.
  • B: Oh yeah never spawn detached, [explanation follows ...]

So IMHO by using a completely different verb we make it clear that it's not "some spawn that the compiler will accept here" but a different thing I need to separately learn about, even if I then in my head learn "okey, so like spawn but not a child-task".
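To make the difference in that second dialogue concrete, here is a minimal sketch written against the API names that eventually shipped (group.addTask for what this thread calls group.spawn, Task.detached for detach). The function name and the sleep durations are made up for the illustration. Cancelling the parent reaches the child task but not the detached one:

```swift
// Returns (childSawCancellation, detachedSawCancellation).
func cancellationDemo() async -> (Bool, Bool) {
    let parent = Task { () -> (Bool, Bool) in
        // Not a child of `parent`: cancelling `parent` does not reach it.
        let detached = Task.detached { () -> Bool in
            try? await Task.sleep(nanoseconds: 200_000_000)
            return Task.isCancelled
        }
        // A child task: it is cancelled together with its parent.
        let childCancelled = await withTaskGroup(of: Bool.self) { group -> Bool in
            group.addTask {
                try? await Task.sleep(nanoseconds: 200_000_000)
                return Task.isCancelled
            }
            return await group.next() ?? false
        }
        return await (childCancelled, detached.value)
    }
    // Give the parent a moment to start, then cancel it.
    try? await Task.sleep(nanoseconds: 50_000_000)
    parent.cancel()
    return await parent.value
}
```

The child observes the cancellation; the detached task runs to completion unaware of it, which is exactly the "it didn't get cancelled, why?" situation.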

--

I also still think there might be the need for one other verb, similar to send or something... where we don't really want to detach, but we're in a synchronous function and must call an async function; so we make a new task, but it inherits execution context, priority, task-locals, and everything else it can. With those three I think we'd have covered all use cases... I don't know if everyone agrees with send, but I'm pretty sure we'll need something for "sync calling async" that is better than detach.
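For readers coming to this later: the unstructured Task { } initializer that eventually shipped appears to fill this "third verb" role, since it copies priority and task-local values from the context that creates it, while Task.detached does not. A small sketch, where the Trace type and the "request-42" value are hypothetical names used only for illustration:

```swift
enum Trace {
    // Hypothetical task-local, used only for this illustration.
    @TaskLocal static var id: String = "none"
}

// A synchronous function that needs to start asynchronous work.
func startWork() -> [Task<String, Never>] {
    Trace.$id.withValue("request-42") {
        [
            Task { Trace.id },           // copies the task-locals in effect here
            Task.detached { Trace.id },  // starts without them
        ]
    }
}
```

Awaiting the first task's value gives "request-42"; the detached one sees only the default.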

6 Likes

I, personally, really dislike that await x.get() pattern and secretly wish for a re-throwing protocol like this:

protocol Awaitable { 
  associatedtype T: Sendable
  func get() async rethrows -> T (?) 
}

and writing:

await x // equivalent to await x.get()

but so far it hasn't been critical to introduce this sugar. Partially it was argued that "detach (and therefore Task.Handle) is bad and should be ugly" but maybe this is a thing to consider still?

To be honest this feels like one of those things we can add later once we gained more experience and know how often handles really do show up.

// This is of course inspired by how .NET has it (Asynchronous Operations: Awaitables), though for us the exact semantics would be just slight sugar for the get.

2 Likes

The connotation is not to "structured" but to "children". Presumably because of the dictionary definition of the word, spawn as a term of art is tied to creating child processes specifically. So adding it to detached tasks, which are not child tasks in the Swift sense, would not just be unnecessarily verbose, but also incorrect.

It's maybe library-specific rather than an industry term of art, but detached does have an existing meaning in Grand Central Dispatch which is similar to the meaning proposed for Swift, which is to dissociate the work item's attributes from the current execution context.

7 Likes

I apologize for not having gotten to this review until way after the review period is up; I have other thoughts on it which, if it goes to another review, I would love to share.

However, on this point specifically, I have been incubating a thought for a while: The word that comes to mind which goes with spawn to indicate a "bunch" of things spawned is: brood. I wonder if it would be feasible even to use a result builder here so that the use site reads:

brood(CookingStep.self) {
  spawn { try await .vegetables(chopVegetables()) }
  spawn { try await .meat(marinateMeat()) }
  spawn { try await .oven(preheatOven(temperature: 350)) }
}

I agree with the authors' choice in preferring detached to a compound word that includes spawn: I think it's nice to separate the word we use for the "attached" (structured) operation from that for the detached operation, whatever the strength of precedent for a term of art.

2 Likes

It's maybe library-specific rather than an industry term of art, but detached does have an existing meaning in Grand Central Dispatch which is similar to the meaning proposed for Swift, which is to dissociate the work item's attributes from the current execution context.

Yes, detach has a long history that predates GCD by more than a decade: it was used in, for example, Solaris threads with THR_DETACHED (1992), and in pthreads with pthread_detach and the PTHREAD_CREATE_DETACHED thread creation attribute (1995).

So I would say it's an established concurrency term/concept since quite a while back.

4 Likes

Hello everyone,
I'd like to share some performance findings and implications for this proposal.

We are trying very hard to optimize tasks such that for async let (soon to be known as spawn let) we'd be able to allocate the entire task object using task-local allocation (!). This will be a very noticeable performance gain for small child tasks created by async let declarations. These will potentially not have to malloc at all, and won't need to be ref-counted either, both of which should noticeably improve performance.

This implies that we must not allow arbitrary code to keep references to tasks around, which results in the removal of Task.current from this proposal and its APIs.

The API to "get" the task is not really necessary due to the existence of the static isCancelled, currentPriority properties, and task-locals being implemented as <taskLocal>.get().

This also reduces the duplication of APIs: there are no longer three ways to ask about isCancelled, only one primary way, Task.isCancelled, and one unsafe way, currentUnsafeTask.isCancelled.

The withUnsafeCurrentTask { (task: UnsafeCurrentTask?) in } API remains, and if used carefully is perfectly fine, but one must be very careful not to store the task, as usual with all those withUnsafe... APIs.
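As a rough sketch of the two access paths described above, using the API as it later shipped (the helper names shouldStop and describeCurrentTask are made up for this example): the static property works from plain synchronous code, and the unsafe task is only inspected inside the closure, never stored.

```swift
// Safe from synchronous code: uses only the static property.
func shouldStop() -> Bool {
    Task.isCancelled
}

// Inspect the current task, if any, without letting it escape the closure.
func describeCurrentTask() -> String {
    withUnsafeCurrentTask { task in
        guard let task = task else { return "no task" }
        return task.isCancelled ? "cancelled" : "running"
    }
}
```

Only the derived String leaves the closure, so the task object itself never outlives the call.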

27 Likes

I notice in the AsyncIteratorProtocol implementations that there is a public func cancel() method that stops iteration and cancels the group. Is this part of the proposal? I'm guessing it's there because an earlier version of AsyncIteratorProtocol required it. Should the Iterator implementations now be classes so they can cancel the group in deinit?

However, it does make me wonder what the behaviour should be on early exit when iterating over a group?

for await result in group {
    if condition {
        break
    }
}

or, more subtly await group.contains("foo"), which exits the loop early on finding a match. Should this cancel all the tasks in the group? I don't think this behaviour is discussed at all in the proposal.

If the Task.Handle is only for detached tasks, then would Task.Detached be a better type name?

3 Likes

It should not, and does not cancel the group.

The only ways in which a group is cancelled are the following three, as documented in the proposal:

The three ways in which a task group can be cancelled are:

  1. When an error is thrown out of the body of withTaskGroup,
  2. When the task in which the task group itself was created is cancelled, or
  3. When the cancelAll() operation is invoked.

It works like this:

  let sum = await withTaskGroup(of: Int.self, returning: Int.self) { group in
    for n in 1...4 {
      group.spawn {
        return n
      }
    }

    let three = await group.contains(3)
    print("three = \(three)")

    for n in 5...7 {
      group.spawn {
        return n
      }
    }

    let six = await group.contains(6)
    print("six = \(six)")

    for n in 8...10 {
      group.spawn {
        return n
      }
    }

    let nine = await group.contains(9)
    print("nine = \(nine)")

    return 0
  }

// three = true
// six = true
// nine = true
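The explicit break case from the question can be checked the same way. This is a minimal sketch using the later shipped spelling group.addTask in place of group.spawn; the function name is made up for the example. Leaving the loop early consumes one result and leaves the group uncancelled, so the remaining children can still be collected:

```swift
// Returns (groupCancelledAfterBreak, resultsStillDelivered).
func earlyExitDemo() async -> (Bool, Int) {
    await withTaskGroup(of: Int.self) { group -> (Bool, Int) in
        for n in 1...4 {
            group.addTask { n }
        }
        // Leave the loop after the first result.
        for await _ in group {
            break
        }
        let cancelledByBreak = group.isCancelled
        // The other three children were left alone and still deliver results.
        var remaining = 0
        for await _ in group {
            remaining += 1
        }
        return (cancelledByBreak, remaining)
    }
}
```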

No substantive concerns, but there appears to be an unfinished sentence, leaving a bit of a cliffhanger:

This static isCancelled property is always safe to invoke, i.e. it may be invoked from synchronous or asynchronous functions and will always return the expected result. Do note however that checking cancellation while concurrently setting cancellation may be slightly racy, i.e. if the cancel is performed from another thread, the isCancelled

This makes a lot of sense to me! I don't think there is any utility in directly working with this, and in the future the ownership semantic work will give us better modeling power for "thing that cannot escape" through borrow semantics.

Responding to several points about naming of detach above - I still don't get the objections :-). The proposal is going in the direction of taking these operations out of the Task namespace, which I think people all agree is the right way to go. However, the cost of doing this is that the names are no longer associated with the word Task at the call site - this means the names need to stand on their own in a way they don't when they are in the Task namespace.

For spawn, this works. This word is strongly associated with concurrency, and it is rarely used for anything else (it's a relatively obscure word). This is why spawn as a term of art works, just like sin for math works: while there are theoretically other interpretations of spawn and sin as words in the English language, there are not strong associations with them in programming languages.

For detach, this isn't the case. detach is a general verb used for a wide variety of purposes in APIs that have nothing to do with concurrency: detach is an active verb in StackView APIs, you detach a tab from a web browser, etc. The fact that it happens to be used in GCD (somewhere? not really sure where, doesn't appear to be widely used) doesn't affect this - it isn't a term of art strongly correlated with concurrency.

Coming back to why this matters, we're talking about making it a top level global function, not something buried under the task namespace. Not having this strong term of art connotation is problematic for two reasons:

  1. people seeing it floating in code (e.g. in some UI code working with detachable things) could find it confusing.
  2. we are likely to have name shadowing problems, e.g. you subclass NSPopover and find out that it has a hidden detach method already.

Beyond that, I don't see any value in keeping this word this short. This will most often be used with closures, so there aren't likely to be long line wrapping problems or anything else that would argue strongly for a very terse name. It seems strictly better to go with spawnDetached, aligning "things that create tasks" with the spawn verb, clarifying the problems above, etc.

Some folks upthread mention that the word spawn is closely aligned with the notion of "child" process, which is true. I'd respond that a detached task is a child process, it just isn't attached to its parent. This is further reason that the name spawnDetached works well.

-Chris

14 Likes

I never saw it used anywhere in GCD, but it is a widely used term in threading libraries: pthreads has pthread_detach(), C++ has thread::detach(), Foundation has -detachNewThreadSelector:toTarget:withObject:, etc.

That does not make it strongly correlated with concurrency, but it at least makes its meaning in the context of concurrency pretty clear.

There it is: DISPATCH_BLOCK_DETACHED | Apple Developer Documentation

Yes. Also mentioned here:

As to the naming of detach() then IMHO it works great if it's something that we want people to use very often.

If we want people to instead reach for the "with*** ... spawn" machinery, perhaps it would make sense to choose a longer and slightly more cumbersome word to signal that? :slight_smile:

This is the crux of what Chris is pointing out. When detach was namespaced to Task, you have the context. Now that it's a global function (a move I'm very much supportive of) you don't have the context. So it makes sense to add the context back in so that people can pick it out when scanning code. spawnDetached does that nicely IMHO.

I also think that the conceptual model is simpler if we consider both withTaskGroup task creation and detached task creation as creating "child tasks". Then we make the distinction between attached child tasks and detached child tasks.

I can't help myself, but perhaps the natural choice, if we continue to discuss child tasks, would be orphan() for the detached case. :joy:

3 Likes

Actor systems and runtime internals are no stranger to macabre puns about these things but I'll spare them here :wink: (Our Swift runtime is clear of such jokes/puns though).

--

No, let's please not muddy the waters of what child tasks are. We can debate verbs to spawn things all day long, but a child task is very well defined: it is a task with a parent task, and as such (by construction, and in order to uphold required runtime safety guarantees) it must not "outlive" the parent; like nodes in a tree structure.

A detached task cannot be a child task, because it has no way to enforce the lifetime relationships (and we do not want it to be able to). It is like a "root" in a tree, it does not make sense to call a root node a "child node"*. Regardless what words we'd end up using for spawn, detach etc.

In other words:

  • Only tasks which keep the structured concurrency guarantees of parent/child relationships can be called child tasks.
  • Swift Concurrency, currently, has exactly two APIs allowing for the creation of child tasks:
    • task group group.spawn, and
    • spawn let which is the revived proposal previously known as async let many months ago, and we'll be posting it shortly.
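A minimal sketch of those two child-task APIs, written with the spellings that later shipped (async let rather than spawn let, and group.addTask rather than group.spawn). The cooking helpers echo the proposal's running example, but their bodies here are placeholders invented for the illustration:

```swift
// Placeholder work items, made up for this example.
func chopVegetables() async -> String { "vegetables" }
func marinateMeat() async -> String { "meat" }

// Child tasks created with `async let`: both must finish before the
// function returns, so they cannot outlive their parent.
func prepareWithAsyncLet() async -> [String] {
    async let veg = chopVegetables()
    async let meat = marinateMeat()
    return await [veg, meat]
}

// Child tasks created in a task group: the group's scope bounds their lifetime.
func prepareWithGroup() async -> [String] {
    await withTaskGroup(of: String.self) { group -> [String] in
        group.addTask { await chopVegetables() }
        group.addTask { await marinateMeat() }
        var done: [String] = []
        for await step in group {
            done.append(step)
        }
        // Results arrive in completion order, so sort for a stable answer.
        return done.sorted()
    }
}
```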

// * funny side story; Akka actors always have a parent. But who is the parent of the root actor...? We said all actors have a parent, so there must be one! There we called it theOneWhoWalksTheBubblesOfSpaceTime, as it is outside of the bubbles of space and time, and does not respect any of reality's rules :wink:

5 Likes

There are two separate issues here:

  1. Whether detach is too terse/ambiguous
  2. If it is, whether spawnDetached is the best alternative.

Re 2: spawnDetached is the wrong term. The word spawn has meaning, and that meaning contradicts the meaning of the proposed behavior of detach. Saying "a detached task is a child process, it just isn't attached to its parent" is swimming hard upstream to justify using the wrong name, and implies that it will actually damage the understanding of what spawn means, too. If we must choose a verb phrase instead of just a verb, there are alternatives that don't have this problem (like detachTask).

But it should not be a verb phrase. Omitting needless words is not about line wrapping. More words take longer to read, adding unnecessary weight to the code and making it harder to read and quickly understand. It's also a simple matter of aesthetics. detach { model.add(thing) } just looks much nicer than detachTask { model.add(thing) }, just like case let x?: looks nicer than case let .some(x): (or indeed case let Optional.some(x):). And needing to quickly detach from a synchronous function is common enough that the improved aesthetic matters.

The need to detach a task is very common, especially when operating with existing code bases, but even in brand new codebases. Experience porting existing code bases to the model suggests its use vastly outweighs the need for calling spawn, because of the frequent need to get into an async context from a synchronous function (which spawn will not help with). Giving it a heavier-weight two-word name would have a significant negative impact on this use (and this is actually a large part of the motivation for hoisting it out of the Task namespace).
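For illustration, the "get into an async context from a synchronous function" pattern might look like the sketch below. It uses Task.detached, the spelling that later shipped for the detach discussed here; the Model actor and the didReceive function are hypothetical stand-ins for the model.add(thing) example above.

```swift
// Hypothetical model type for this example.
actor Model {
    private(set) var things: [String] = []
    func add(_ thing: String) {
        things.append(thing)
    }
}

// A synchronous entry point, e.g. a callback in existing code.
// It cannot await, so it starts a task and hands back the handle.
func didReceive(_ thing: String, model: Model) -> Task<Void, Never> {
    Task.detached {
        await model.add(thing)
    }
}
```

Callers that do not care about completion can simply ignore the returned handle; it is returned here only so the effect can be observed.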

Put another way: what is the benefit of keeping spawn succinct, but detach not? Because "we prefer structured concurrency" still seems the motivation here and I think a misguided one.

The fact that it happens to be used in GCD

There is also pthread_detach, again in a context that aligns with the proposed usage in Swift.

I don't find the notion that detach might be found confusing when used e.g. in an NSPopover a compelling argument. This potential confusion seems unlikely to me.

12 Likes

disinherit

1 Like