SE-0304: Structured Concurrency

frameworklabs · March 18, 2021, 2:30am

I really like the general direction of this proposal.

The only thing I would really like to see being added though is the possibility to state on Task.Group.add/spawn whether the child task is required to have stopped for the group to be able to stop. This would be the default. But some tasks could opt out of this strong behavior and will get cancelled instead when the strong tasks have all stopped.

This makes it really easy to have child tasks which run concurrently to the others but don’t determine the temporal dimension of the group. Imagine a child task in the running Demo to play music while cooking or control the lights while doing so etc. These tasks should not prevent the group from finishing but just get cancelled on group end.

pierremb · March 18, 2021, 10:43am

That is an interesting thought. Quick thought : it might also be a way to handle timeouts.

xwu · March 18, 2021, 10:52am

hisekaldma:

ktoso:

I hear you loud and clear on this one. It's really too bad that async let and anything with "nicer to look at syntax" was decided to be punted out of this proposal and into a future one; as really the groups API here is really not what most people will be interacting with at all on a daily basis, unless building your own concurrency "operator" like a scatter-gather or "race" or similar.

I can’t speak for @xwu, but for me this is not a question about "nicer to look at syntax". This is a question of the API not expressing a clear conceptual model that is easy to understand. async let will not make the conceptual model clearer – at best it will hide it so that I don’t have to think about it. (That might also be a worthwhile goal, but it’s a different goal.)

Well said, and I agree with @hisekaldma's point here completely. A clear conceptual model can only increase in clarity the more we remove non-essential "sugar" from the mix. That taking away async let has made the proposal potentially harder for users to understand is a symptom and not a cause: it points to an underlying problem with the design of what is included.

The bottom two responses, from @ktoso and @hisekaldma, both vividly demonstrate exactly why the naming does matter and why it's absolutely not a matter of "style" or "preference."

To reiterate: nesting implies a hierarchy of is-a or has-a relationships. Is an Encoding.UTF8 a kind of Encoding? Yes. Is a Task.Group a kind of Task? No. That Joe feels compelled to clarify this confusion for a specific argument label is a sign that something is not right here. This has absolutely nothing to do with one's opinion about namespaces as an API naming stylistic choice.

Consider how this confusion is instantly clarified by (a) not namespacing; and (b) using a more evocative name: Is an Agenda a kind of Task? Self-evidently not: if it were, it'd be a nested type, a subclass, or conform to some common protocol.

ktoso:

xwu:

it does not make sense to me to try to pessimize the naming of a type.

To preserve the intention of futures taking a backseat in this design, there are other tools in the API design toolbox; we can look, for example, to the precedent of Result deliberately vending a limited API for the purposes that it's meant to serve while making clear that it's not meant to be an alternative to supplant throwing functions.

I'm not sure I agree with this; It is not only "pessimizing" the name, but also setting expectations correctly: Looking at this as a "language immigrant", coming from another language (e.g. Scala, Ruby, JS, Java, Kotlin...), or even just Swift with pervasive PromiseKit, NIO or Combine's Future types seeing a Future in Swift would trigger the "I know how to use this, map(), flatMap(), right?" and then disappointment (and maybe finally enlightenment?)

This doesn't "set expectations"; it's really a misdirection. If a user expects a "future-based" concurrency design, this choice merely divorces those expectations from the Task.Handle type, but they'll still expect or want a Future type just as much. It cuts them off from applying existing understanding or experience to Swift (including their knowledge of the shortcomings of a future-based concurrency design).

If you came over to my house expecting a dinner feast and I had only potato chips, I could "set expectations" by pretending that there's no food in my house, but I bet you'd be disappointed if you saw me eating potato chips afterwards. And by telling you there's no food at all instead of just telling you about the snacks, I've deprived you of the opportunity to decide if all you want to have are snacks anyway, or if you'd like to order your own food.

I also feel very strongly about it: As a Canadian, I'd spell it NSColour too, but that's simply not how we spell things in Swift. All APIs in the Swift standard library adhere to standard U.S. English conventions, and the rules regarding the doubling of consonants before "-ed" and "-ing" (which differ from international English rules) are no exception. The burden is on those who wish to diverge from these rules to explain why, in this setting, they should not apply. A good reason might be that as a term of art something is always spelled in a nonstandard way (e.g., "HTTP referer"), but I'm aware of no such case for "canceled," which meanwhile is a term gaining cultural currency.

ktoso · March 18, 2021, 11:06am

xwu:

ktoso:

Also some people expressed that they feel super strongly about it, so indeed the current and expected spelling is cancel, cancelled, cancellation - as weird as it it.

I also feel very strongly about it: As a Canadian, I'd spell it NSColour too, but that's simply not how we spell things in Swift. All APIs in the Swift standard library adhere to standard U.S. English conventions, and the rules regarding the doubling of consonants before "-ed" and "-ing" (which differ from international English rules) are no exception. The burden is on those who wish to diverge from these rules to explain why, in this setting, they should not apply. A good reason might be that as a term of art something is always spelled in a nonstandard way (e.g., "HTTP referer"), but I'm aware of no such case for "canceled," which meanwhile is a term gaining cultural currency.

I think I was pretty clear, but to clarify who the convincing should be directed at: the core team specifically requested the current spelling, and we reverted my change that aligned the APIs with with proper US grammar. I very much agree that proper grammar spellings would be the right thing to do, but this was met with strong opinions and explicit request to revert.

So yeah, for what it is worth: very much agreed on the spellings on a personal level at least.

pierremb · March 18, 2021, 11:27am

+1 on the general direction, it might still need a bit of bikeshedding

Definitely: callback hell is real and it burns.

Mostly yes, it looks like is a good ground for awesome features and code.
More below.

No, only played with the concurrency preview in swift.

A quick read of the last version of the proposal. But I thoroughly read previous ones, and I have followed concurrency proposal and threads (pun not intended) for some time.

My own preferences are for spawnDetached, wether namespaces or not, and group.spawn.

I agree that having both static and instance methods is pretty confusing. And I believe instance methods are better. Maybe we could have the following extension, removing the need for static methods, or would it just be a different kind of confusing?

extension Optional where Wrapped == Task {
    var isCancelled: Bool { wrapped?.isCancelled ?? false }
}

jeremy · March 18, 2021, 11:29am

https://developer.apple.com/documentation/foundation/urlerror/2293052-cancelled

(where the documentation for the URLError.cancelled property says "An asynchronous load has been canceled")

JJJ · March 18, 2021, 2:48pm

Nice. When will we be able to bikeshed async let? I think the "async" in "async let" is a different kind of async... And perhaps the thread about "async let" could inspire the naming of spawnDetached()?

I think as you can "await" (the result of) a function:

let dinner = await makeDinner()

it could make sense to be able to "defer" getting the result until the end of the current scope at the latest:

let dinnerHandle = defer makeDinner()

So I would propose changing async let x = y to let x = defer y.

Alternatively

let dinnerHandle = makeDinner() on the side

Chris_Lattner3 · March 18, 2021, 4:45pm

I don't recall this case personally, but I want to make a general observation:

The core team is made up of people, and people make mistakes (at least, I certainly do!). The core team highly values arguments based on agreed-on principles that keep the language and APIs more consistent. If a call was made that detracts from that based on a principled argument, then we should be open to discussing it and potentially revising a decision - we shouldn't just go with "the core team says so".

That said, we shouldn't open or reopen every debate - there are many things where we "just need to make a call", and there isn't a strong principle that guides one answer vs the other. In those cases, it is really important that the core team exists to just pick a winner: arguing endlessly about would just generate heat but not light.

One other thing that is important to distinguish: there is a difference between the individual humans that sit on the core team (e.g. me) who make suggestions or have requests during the design and iteration of a proposal, and the final official "judgement" made at the end of a proposal review. Guidance from the individuals can be useful to help shape things in ways that make it more likely to work based on their experience, but it is really the core team as a whole that makes the final judgement.

-Chris

frameworklabs · March 18, 2021, 7:20pm

Yes, a strong task would define the lifetime of the group and the weak tasks would follow it.

This distinction between strong and weak tasks will make structured concurrency so much more versatile.

Alex_Martini · March 18, 2021, 8:29pm

Regarding the spelling of "cancel", the reason folks have advocated for a different spelling is to follow Apple Style Guide:

canceled (v.), canceling (v.), cancellation (n.)
Use one l for the verb cancel —for example canceled , canceling . Use two l ’s for the noun cancellation .

There are some APIs in Apple's frameworks that don't follow this spelling, but that's like HTTP Referer — it was too late to change by the time it was noticed, not a precedent to be followed.

frameworklabs · March 19, 2021, 2:18am

Just to expand a bit on it. Other structured concurrency implementations have constructs like par/and and par/or to specify the termination behavior of a group of tasks.

Let's say you have three tasks (A, B and C) and want B and C to be stopped when A has stopped, but not the other way round:

par/or {
  spawn {
    // Task A
  }
  par/and {
    spawn {
      // Task B
    }
    spawn {
      // Task C
    } 
    spawn {
      // Dummy Task which does not end
    }
  }
}

The alternative I am advocating here (and which I saw implemented in this structured concurrency implementation) is to specify the group termination behavior like this instead:

par {
  spawn {
    // Task A
  }
  spawn weak {
    // Task B
  }
  spawn weak {
    // Task C
  }
}

Tasks would be strong by default so you have to mark only weak tasks.

Lantua · March 20, 2021, 5:51pm

Getting the handle's task allows us to check if the work we're about to wait on perhaps was already cancelled (by calling handle.task.isCancelled )

This doesn't really apply now, since we got handle.isCancelled (handle.canceled?) as well.

In fact, it's really nagging me that we have two isCancelled this close to each other.

One minor thing, this sentence:

Task priorities are set on task creation (e.g., Task.runDetached or Task.Group.add ) and can be escalated later, e.g., if a higher-priority task waits on the task handle of a lower-priority task.

is used twice. Might be a typo.

Matt_McLaughlin · March 23, 2021, 1:43pm

I’ve reread the proposal a few times and this comment keeps rolling around in my head. I agree with comments below that the name spacing is hindering more than it’s helping. But I think it’s more than that.

The name “detached” defines that task in relation to where it was spawned, not what it is. A detached task is the top level of a new hierarchy. It can have children. I think the model would be clearer if the term of art for a detached task described its “top of the hierarchyness” rather than it’s relationship to where it was spawned. The detached information could be conveyed in the keyword for the spawn, I.e, spawnDetached PrimeTask or spawnDetached TopTask

We’re also missing a term of art for the type of task that can create a Group. It’s just a generic task that happens to have a .withGroup on it. I think something was lost when we dismissed using the term nursery. The feeling at the time seemed to be that the name was too whimsical for Swift, but listening to the repeated requests for clarity I’m wondering if that was a mistake. What Nursery gives you is a clear naming distinction between the object that creates groups of tasks and the groups themselves. That distinction is important for communicating the model even if you may not agree nursery is the right name. (If we’re going with spawn then SpawningGrounds... J/K). The other nice thing about Nursery is that it reinforces the child task concept since nurseries are places where children are raised.

Lastly, I think that if we prioritized communicating the model over conciseness we’d do something like .addChild rather than just .add in our ~~Nursery~~ Task.withGroup. I get that that would add noise but it would help immensely with teaching the model.

Paulo_Faria · March 23, 2021, 7:14pm

This made me think that the invocation should not be a function. IMO, it should be a proper initializer to reinforce the importance of the type. I honestly haven't thought this through as much as I would like, but I don't like the idea of Task.Handle. First of all it seems like an archaic concept that reminds of C and doesn't feel like home in Swift. Maybe instead of a handle we need a proper type called DetachedTask.

let dinnerTask = DetachedTask {
  try await makeDinner()
}

Having its creation be a proper initializer enforces the idea that it is a proper type in its own rights, not a handle.

--- Edit

This might seen like bikeshedding, but I feel that thinking of a detached task in terms of a proper type can help us understand the whole concept of structured concurrency a bit better.

Paulo_Faria · March 23, 2021, 8:24pm

Another advantage of using an initializer is that instead of this:

let dinnerHandle: Task.Handle<Meal, Error> = Task.runDetached {
  try await makeDinner()
}

We can spell it more succinctly, like this:

let dinnerTask = DetachedTask<Meal, Error> {
  try await makeDinner()
}

xwu · March 23, 2021, 11:06pm

It’s already confusing enough, I think, to name a future type Task.Handle. I’m not sure why we would want to call a future a DetachedTask when it would be neither a task nor even necessarily detached.

Paulo_Faria · March 23, 2021, 11:08pm

OK, I had some time to properly think about the issue and I have some suggestions.

1 - Remove the Task.Handle type

We don't need to introduce the concept of a handle to achieve the functionality we want. I mentioned before the term "handle" reminds of C, but that's not the main issue. The main issue, is that its concept is simply not required. I propose we move all the API that now resides in Task.Handle to the Task type. To properly accommodate this change we'll also need 2. The benefits to this are that we do not need a new concept to explain. The type itself is just renamed/moved around, but we don't need to hold a new concept in our heads since the Task type and the task concept is enough. It might be pure nitpicking, but canceling a task makes more sense then canceling a handle. Might be a stretch, but one could think of handle.cancel as the act of disposing the handle itself and not the task. task.cancel does exactly what it says it does. It cancels the actual task. Also, having the task be self-contained also does away with the handle.task.isCancelled dance and gives you task.isCancelled directly.

-- Edit

Just saw that the proposal mentions that Task has an isCancelled property (even though it still mentions handle.task.isCancelled) which is nice, but also still weird, because now you have two ways of doing the same thing. Checking for cancelation with handle.task.isCancelled and with handle.isCancelled. Getting rid of Handle completely solves this. Same could be said about priority, if it was ever intended to add that property to Task.Handle as well.

2 - Introduce the CurrentTask type

I propose we move the non static API of Task to a new type called RunningTask. I specially like this one since it brings parity to the current and unsafeCurrent function/property and the CurrentTask and UnsafeCurrentTask types.

extension Task { 
  static var unsafeCurrent: UnsafeCurrentTask? { get }
  static func current() async -> CurrentTask { get }
}

3 - Switch from Task.runDetached to Task.init

Like I mentioned before. I think an initializer emphasizes the type that we're dealing with. Also helps with API like I mentioned above. Reiterating:

Instead of:

let dinnerHandle: Task.Handle<Meal, Error> = Task.runDetached {
  try await makeDinner()
}

We can spell it more succinctly, like this:

let dinnerTask = Task<Meal, Error> {
  try await makeDinner()
}

I know that sometimes you don't care about canceling the task or getting its result. This would make the following code a bit weird:

Task {
  try await doSomethingNotCaringAboutCancelationOrResult()
}

I agree that @discardableResult inits are weird, but having Task { ... } reminds of SwiftUI and result builders which also look like @discardableResult inits. I know this will probably be a contentious suggestion, but I think it still has its merits. The fact that it resembles SwiftUI also aludes to its declarative nature, which, personally, I think it's nice.

4 - Create a completely new type called TaskGroup

I agree that task groups are hierarchically above tasks and Task.Group doesn't really reflect that. I think the concept of task groups deserve to have their own type, TaskGroup. This will also be important for 5 and 6, which, alongside 4, are basically the same suggestion. I decided to split them to make them easier to follow, though.

5 - Switch from Task.withGroup to TaskGroup.init

The reasoning is similar to 3, but this is specially important for 7. Task.withGroup used to return the TaskResult generic type. I propose it actually returns a new type called TaskGroup. The reasoning will become more clear when I introduce 7.

6 - Rename what used to be Task.Group to TaskGroup.Builder

I'm not married to TaskGroup.Builder, but we need to differentiate it from the actual TaskGroup type. I'm used to use this naming scheme when using the builder pattern that is currently used in this API. I'm open to suggestions, though. 4 is important for this, otherwise we would have Task.Group.Builder, which has too many dots, IMO.

7 - Add cancelation to TaskGroup

This is the most important one, IMO. Before I move on, this needs a bit more context. I'm one of the main developers of Venice. Even though Zewo itself, the organization behind Venice, hasn't been maintained for a long time. A proprietary version of what was once Zewo is still being used to this day on a healthcare application used in New Zealand hospitals. All of this is to say that we have been using structured concurrency in Swift, through libdill (which is all the magic really happens), for about 5 years now. Libdill has the concept of bundle which, in Venice, is translated to Coroutine.Group (yeah, I know the irony in proposing 4, since Coroutine.Group is basically the same as Task.Group). The main advantage of having access to a group outside of its scope is to allow cancelation from elsewhere. One example of Coroutine.Group usage can be seen in the very rudimentary Zewo HTTP server.

github.com

Zewo/Zewo/blob/master/Sources/HTTP/Server/Server.swift#L101


      
                  } catch {
                      Logger.error("Error while accepting connections.", error: error)
                      throw error
                  }
              }
          }
          
          
/// Stop server
          public func stop() throws {
              Logger.info("Stopping HTTP server.")
              group.cancel()
          }
          
          
public static var defaultHeader: String {
              var header = "\n"
              header += "                        _____                         \n"
              header += "     _.-ˆˆ-._.-ˆˆ-._   /__  /   ____ _      __ ____   \n"
              header += "    |ˆ-._.-ˆˆˆ-._.-ˆ|    / /   / __/| | /| / // __ \\ \n"
              header += "    |   |ˆ-._.-ˆ|   |   / /__ / __/ | |/ |/ // /_/ /  \n"
              header += "    ˆ-._|   |   |_.-ˆ  /____//____/ |__/|__/ \\____/  \n"
              header += "        ˆ-._|_.-ˆ                                       \n"

The Server.stop function allows the group that accepts incoming connections to be canceled from outside of its scope.

I propose we basically add similar Task APIs to TaskGroup. Basically, TaskGroup.cancel and TaskGroup.get. The examples in the SE proposal would become.

 func chopVegetables() async throws -> [Vegetable] {
  let group = TaskGroup(resultType: Vegetable.self) { group in
    print(group.isCancelled) // prints false

    await group.add {
      group.cancelAll() // Cancel all work in the group
      throw UnfortunateAccidentWithKnifeError()
    }
    await group.add {
      return try await chop(Onion())
    }

    do {
      while let veggie = try await group.next() {
        veggies.append(veggie)
      }
    } catch {
      print(group.isCancelled) // prints true now
      let added = await group.add {
        return try await chop(SweetPotato())
      }
      print(added) // prints false, no child was added to the cancelled group
    }
  }
  
  return try await veggies.get()
}

TaskGroup.init wouldn't need to be async since it wouldn't actually await until TaskGroup.get is called. If one wants to cancel a group, it could be stored as a property and cancel could be called somewhere else.

I imagine there is a reason the current API requires developers to wait for the group's result in the same spot as the groups is actually created. Maybe passing the group around and canceling it elsewhere can cause inconsistent behavior? I would love to get this clarification from the OPs. @ktoso maybe can help?

Now, onto some questions.

8 - Consecutive calls to Task.get

What currently happens when consecutive calls to Task.get are made? Is the result stored and simply returned again? If so, I imagine TaskGroup.get should get the same behaviour.

There are more suggestions that follow from these, but this is enough for now.

Paulo_Faria · March 23, 2021, 11:13pm

Please, take a look at the suggestions above. Like I said, that initializer idea naturally flowed in a direction that I think fits better with the whole concept of structured concurrency. I agree that DetachedTask wasn't good enough to clean the API.

xwu · March 23, 2021, 11:25pm

Paulo_Faria:

Instead of:

let dinnerHandle: Task.Handle<Meal, Error> = Task.runDetached {
  try await makeDinner()
}

We can spell it more succinctly, like this:

let dinnerTask = Task<Meal, Error> {
  try await makeDinner()
}

I think you are missing the point of this proposal, which I would emphasize is a reflection of the difficulty to users of understanding the relevant concepts with the design as it is expressed.

The purpose of introducing tasks and task groups is to make possible structured concurrency.

Not only is Task.runDetached not structured, it is precisely the unstructured concurrency that this proposal is aiming to solve. Strictly speaking, it should not even be a part of this proposal, and that’s why I have urged the authors not to place it under the Task namespace.

The entire point of Task is that it’s what you’re best off using instead of Task.runDetached.

Paulo_Faria · March 23, 2021, 11:51pm

I don't think so, I'm confident enough to say that I do understand what the proposal of structured concurrency is about. I do, however, disagree with your POV on the Task.runDetached subject. You can't say that I do not understand proposal. I just disagree with you. I'm quite glad to say that, even though I did not implement libdill (all merits go to Sustrik), I've been working with structured concurrency with Swift for quite a long time, now.

Now back to the point. Wether you like it or not, the structured aspect of structured concurrency needs an entry point. I don't agree it should be part of another discussion, at all. I also don't agree it should live in another namespace other than Task. To me what defines a task is a piece of work that can spawn more work (in a structured fashion) and can be canceled and, in turn, cancel the work it had spawned (the structured part). I don't see how it is not part of the structured aspect.

--- Edit

I think I see your point, now. It's late here so I'll have a think tomorrow and see if it I can propose better suggestions based on this insight. I think my experience with libdill ~~might~~ definitely have affected how I understand this proposal, indeed!