SE-0304: Structured Concurrency

Chris_Lattner3 · March 24, 2021, 12:02am

I agree with Xiaodi on this. To largely restate what he said above: while the proposal's name is "structured concurrency", the Task.runDetached function (whatever it is ultimately named) is the unstructured part of the model, as is the thing it returns (currently named Task.Handle).

The thing it returns is fundamentally "future-like". The significant thing that makes it unusual is that it allows best-effort cancellation of the task that is producing the future value, so it really does seem to be something like a "CancelableFuture".

The thing it returns is not itself a task; rather, it has a pointer/handle to the task that produces it. All of this argues for unnesting this stuff from the Task namespace - which doesn't seem to be super controversial at least.

-Chris

Paulo_Faria · March 24, 2021, 12:12am

Yeah! I think I understand the issue! I conflated Venice's Coroutines with Task.runDetached's Task.Handle because both of them are cancelable. I understand now that Task.Group is actually what parallels Venice's Coroutines + Coroutine.Group, conceptually. I still think it's weird that you can't cancel task groups from outside, though. I'll ponder this some more tomorrow and bring back what I find. @xwu sorry for misunderstanding your point.

ktoso · March 24, 2021, 1:31am

I’ll get through the details here shortly, though a quick one:

You can.

By cancelling the task that is running the group.

The design of a task group and it enforcing that “once it returns no more child tasks are in flight” allows for very specific and sneaky allocation optimizations — the group itself does not even have to heap allocate, nor do many other small objects that it creates (child records in the current task). Making the group scope return before everything is completed defeats these optimizations.

I totally think there is space for some “group-like thing that we can add to and cancel from the outside” but it would be something slightly different. Most notably those are useful for “bind the lifecycle of this task-something-container with the lifecycle of this class”, i.e. when the class gets deinit, the related task-collection should as well. I think there is space for such abstraction, but it is not a core primitive like the TaskGroup under discussion — which is specifically for structured concurrency; while the latter one is not.

—

```
let handle = Task.runDetached { await Task.withGroup { ... } }
handle.cancel()

// or by simply the parent/child chain of tasks resulting
// in cancelling the task that is running the group
```
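For readers following along today, here is a minimal sketch of the same idea. It assumes the names that later shipped in Swift (`Task { }` and `withTaskGroup`) in place of the draft `Task.runDetached` and `Task.withGroup` used in this thread; `sleepUntilCancelled` and `cancelGroupViaEnclosingTask` are hypothetical helpers invented for the example.

```swift
// Sketch only. Uses the shipping names `Task { }` and `withTaskGroup`
// in place of the draft `Task.runDetached` and `Task.withGroup`.

/// Hypothetical helper: sleeps for a minute and reports whether the
/// sleep was cut short by cancellation (`Task.sleep` throws in that case).
func sleepUntilCancelled() async -> Bool {
    do {
        try await Task.sleep(nanoseconds: 60_000_000_000)
        return false
    } catch {
        return true
    }
}

/// Hypothetical demo: starts `childCount` children in a group, cancels the
/// task that runs the group, and returns how many children saw the cancellation.
func cancelGroupViaEnclosingTask(childCount: Int) async -> Int {
    let handle = Task<Int, Never> {
        await withTaskGroup(of: Bool.self, returning: Int.self) { group in
            for _ in 0..<childCount {
                group.addTask { await sleepUntilCancelled() }
            }
            var cancelledChildren = 0
            while let wasCancelled = await group.next() {
                if wasCancelled { cancelledChildren += 1 }
            }
            return cancelledChildren
        }
    }
    handle.cancel()            // cancellation flows from the task into the group's children
    return await handle.value  // the group only returns once every child has finished
}
```

Calling `await cancelGroupViaEnclosingTask(childCount: 3)` should report that all three children observed the cancellation, without the group ever being handed to code outside its scope.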

hooman · March 24, 2021, 2:36pm

As I noted in my response to the Actors proposal, I am very excited about how things are shaping up with the Swift concurrency story.

To the whole team: You are doing monumental work, thank you guys. You are making history by what you are doing here.

Again, though, I think we are not there yet with this iteration of the proposal. I understand that this one is really fundamental at the same level of async/await, and we have to push it through rather quickly, but please be patient everybody. We have already waited for so long for Swift native concurrency.

I agree with the teachability issues that have come up. We see too much attention paid to runDetached and migration/interoperability stuff. We should instead focus on driving home the whole point of structured concurrency and how to design for it. I think we should refocus the base structured concurrency proposal on the pure concept and defer the question of "how does it fit with what I already have?" to a follow-up proposal.

I also think you should focus on a distilled minimal core set of operations that capture the essence of the concept and hint at how they can be used to build the day-to-day stuff out of them and defer it to the follow up proposal. This will aid in teachability of the feature and help people focus on the correct stuff.

We could pass async/await quickly because many people are already familiar with a very similar concept widely used in other languages. Currently, structured concurrency is not so widely used, and it is a very important fundamental cornerstone of the Swift concurrency story. It deserves more focus on the core concept in the initial proposal. To be able to properly focus on the concept we need to shed the other stuff for now. "How to opt out of the structured concurrency" should not become the focus of the thread introducing structured concurrency!

breathe · March 24, 2021, 3:58pm

I'd really love if there was a way in the type system to express that a given function call cannot create any detached tasks (transitive through its implementation)...

This would allow for powerful expressions of problem-specific or architectural constraint patterns - "insert code into this async function body to do some business logic with type system enforced constraints to ensure that the outer system can cancel you and otherwise maintain ability to recover from implementation mistakes" without having to maintain constant vigil on the codebase to ensure that developers aren't "hiding work" in new detached tasks somewhere in the implementation of the specific domain logic ...

I'm generally concerned with the cancellation story offered by this proposal -- I expect vast seas of code that never check the cancellation status -- making cancellation effectively low reliability/flaky at best ... this might be more of an async/await comment but I'd almost prefer if every suspension point was also always automatically a cancellation point -- or at least there was pressure to make the majority of suspension points act this way, with non-cancellable async calls not the default ... async function calls that can't fail generally seem like a very rare thing to me -- I expect folks will paper over failure scenarios to write async function calls that can't fail, which will make them generally feel like leaky abstractions in most cases ...
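To make the concern concrete, here is a small sketch of cooperative cancellation, assuming the `Task.checkCancellation()` and `Task.sleep(nanoseconds:)` APIs as they later shipped. The function names are hypothetical. The same work is run twice inside an already-cancelled task: the version that checks stops, the version that never checks runs to completion as if nothing happened.

```swift
/// Hypothetical example: sums 0..<n and checks for cancellation on every step.
func cooperativeSum(upTo n: Int) throws -> Int {
    var total = 0
    for i in 0..<n {
        try Task.checkCancellation()  // throws CancellationError in a cancelled task
        total += i
    }
    return total
}

/// Same work, but it never checks, so cancellation has no effect on it.
func obliviousSum(upTo n: Int) -> Int {
    (0..<n).reduce(0, +)
}

/// Hypothetical helper: runs `work` inside a task that is guaranteed to be
/// cancelled by the time `work` starts.
func runAfterCancellation<T: Sendable>(_ work: @escaping @Sendable () -> T) async -> T {
    let task = Task<T, Never> {
        // Park here until the cancellation arrives; the sleep is cut short by it.
        try? await Task.sleep(nanoseconds: 60_000_000_000)
        return work()
    }
    task.cancel()
    return await task.value
}
```

With these definitions, `await runAfterCancellation { try? cooperativeSum(upTo: 1_000) }` yields `nil`, while `await runAfterCancellation { obliviousSum(upTo: 1_000) }` still returns the full sum.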

Paulo_Faria · March 24, 2021, 4:02pm

Now that I understand my previous confusion, I agree with you and @xwu. I think my lack of understanding is really a sign that Task.runDetached (what I think is what you mean by "How to opt out of the structured concurrency") needs more thought and maybe merits a separate discussion. I still don't fully understand its main motivation, though. At one point I thought it was meant to allow something like a pre-fork worker model, on a server, for example, but I'm not even sure if that makes sense. I need to re-read the full proposal and try to understand it better. I think this shows that we need a better explanation of the "why"s, not just "what"s. I also agree that we should not rush structured concurrency.

Paulo_Faria · March 24, 2021, 4:15pm

This is how libdill and Venice work. Venice can throw cancelation errors on most of its APIs, including while creating new Coroutines and when yielding, but more importantly it is guaranteed to throw on any IO operation through its polling mechanisms, which I know are even lower level in Swift's concurrency proposal.

github.com/Zewo/Venice/blob/master/Sources/Venice/FileDescriptor.swift#L211

```
///     Use `.read` to wait for the file descriptor to become readable.
///     Use `.write` to wait for the file descriptor to become writable.
///   - deadline: `deadline` is a point in time when the operation should timeout.
///     Use the `.fromNow()` function to get the current point in time.
///     Use `.immediate` if the operation needs to be performed without blocking.
///     Use `.never` to allow the operation to block forever if needed.
///
/// - Throws: The following errors might be thrown:
///   #### VeniceError.invalidFileDescriptor
///   Thrown when the operation is performed on an invalid file descriptor.
///   #### VeniceError.canceledCoroutine
///   Thrown when the operation is performed within a canceled coroutine.
///   #### VeniceError.fileDescriptorBlockedInAnotherCoroutine
///   Thrown when another coroutine is already blocked on `poll` with this file descriptor.
///   #### VeniceError.deadlineReached
///   Thrown when the operation reaches the deadline.
public static func poll(_ handle: Handle, event: PollEvent, deadline: Deadline) throws {
    let result: Int32

    switch event {
    case .read:
```
I'm curious, though, how that would translate into what's being designed for Swift. Where would the low level IO implementation live? In the executor? Would this IO level be aware of tasks and whether they are canceled or not? Would this IO layer be able to throw a cancelation error?

The proposal is really not clear where else cancelation errors could be thrown besides manual checking. It would be nice to clarify all the places where cancelation errors would occur, even if they are to occur in lower level features out of the scope of structured concurrency.
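One way such a layer could surface cancelation, sketched here under assumptions: the code uses `withTaskCancellationHandler` and `withCheckedThrowingContinuation` as they later shipped, and `PendingRead` is a hypothetical stand-in for a pending IO operation, not how the runtime or any real IO library does it. Delivering actual data is left out; the sketch only shows how a suspended "read" can be woken with a `CancellationError` without the caller checking anything manually.

```swift
import Foundation  // for NSLock

/// Hypothetical stand-in for one pending read in an IO layer.
final class PendingRead: @unchecked Sendable {
    private let lock = NSLock()
    private var continuation: CheckedContinuation<String, Error>?
    private var isCancelled = false

    /// Suspends until the read is cancelled (data delivery is omitted here).
    func wait() async throws -> String {
        try await withTaskCancellationHandler {
            try await withCheckedThrowingContinuation { (cont: CheckedContinuation<String, Error>) in
                lock.lock()
                let alreadyCancelled = isCancelled
                if !alreadyCancelled { continuation = cont }
                lock.unlock()
                // Cancelled before we could park: fail right away.
                if alreadyCancelled { cont.resume(throwing: CancellationError()) }
            }
        } onCancel: {
            // Runs when the surrounding task is cancelled, possibly on another thread.
            self.cancel()
        }
    }

    /// Aborts the pending read, waking the waiter with a cancellation error.
    func cancel() {
        lock.lock()
        isCancelled = true
        let pending = continuation
        continuation = nil
        lock.unlock()
        pending?.resume(throwing: CancellationError())
    }
}
```

A caller just writes `try await read.wait()`; if its task is cancelled while suspended, the call throws, which is roughly the behaviour Venice documents for `poll`.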

Paulo_Faria · March 24, 2021, 4:43pm

While this works, I might not want to cancel the task that is running the group, just the group. I know the example below creates a detached task to do just that, but then I lose structured concurrency. What if I also want to cancel the group if the task which is running the group is canceled? How would I propagate that?

ktoso:

```
let handle = Task.runDetached { await Task.withGroup { ... } }
handle.cancel()

// or by simply the parent/child chain of tasks resulting
// in cancelling the task that is running the group
```

Like I said in that big post, I still have some questions/suggestions. Specific to this API:

```
extension Task.Group {
  /// Add a child task to the group.
  ///
  /// Returns true if the task was successfully added, false otherwise.
  mutating func add(
      overridingPriority: Priority? = nil,
      operation: @concurrent @escaping () async throws -> TaskResult
  ) async -> Bool
}
```

Instead of returning a Bool it could return an optional wrapping a type that can be used to cancel the child task from outside.

```
extension Task.Group {
  /// Add a child task to the group.
  ///
  /// Returns a cancelable if the task was successfully added, nil otherwise.
  mutating func add(
      overridingPriority: Priority? = nil,
      operation: @concurrent @escaping () async throws -> TaskResult
  ) async -> Cancelable?
}
```

Cancelable is a placeholder here, but it could be that creating a protocol to serve as an opaque cancelable is useful for other things. I think this would solve the problem by allowing cancelation from outside without needing to resort to detached tasks, while still maintaining structured concurrency.
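As a rough illustration of the "opaque cancelable" part of this idea only, here is a sketch using the `Task` type as it later shipped. `Cancelable` is the hypothetical protocol from the post above, and `parkUntilCancelled` and `startCancelableWork` are invented helpers. Note the limitation: this wraps an unstructured task, so it does not show a per-child handle inside a group, which is what the suggestion is really asking for.

```swift
/// Hypothetical protocol from the suggestion above; not part of the proposal.
protocol Cancelable {
    func cancel()
}

// `Task` already has a `cancel()` method, so it can adopt the protocol as-is.
extension Task: Cancelable {}

/// Hypothetical helper: true if the sleep was interrupted by cancellation.
func parkUntilCancelled() async -> Bool {
    do {
        try await Task.sleep(nanoseconds: 60_000_000_000)
        return false
    } catch {
        return true
    }
}

/// Starts some work and hands back both the full task and an opaque
/// view of it that can only be cancelled.
func startCancelableWork() -> (task: Task<Bool, Never>, cancelable: any Cancelable) {
    let task = Task<Bool, Never> { await parkUntilCancelled() }
    return (task, task)
}
```

Code that receives only the `any Cancelable` value can stop the work but cannot await or inspect its result, which is the kind of narrow capability the placeholder type is meant to express.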

Also about the following quote.

Why do you say this is out of the scope for structured concurrency? Is it because the solution you imagine can be built on top of it or is it because it has to, conceptually, in your view? I personally think that "allowing cancelation from outside" does fit structured concurrency, conceptually.

Chris_Lattner3 · March 24, 2021, 7:46pm

While such a thing is appealing, I'm personally skeptical that it will ever be practically possible/useful in a Swift-like language. Such "effect" markers have been proposed for other things, e.g. side-effect-free or doesn't-touch-global-variables. The problem is that these markers become pervasive through the codebase, and such things have large-scale impact on software engineering - e.g. what evolution is possible of an API, so we have to be very careful about them.

We now have two effects in Swift (throws and async), so it isn't /impossible/. The payoff just needs to be carefully considered and the tradeoffs balanced, and I'm personally skeptical that it will make the cut.

-Chris

jayton · March 25, 2021, 12:30pm

I’ve been hesitating to bring this up, but I worry that not having a “detached group”, with a carefully considered relationship to structured concurrency, undermines the whole model.

If we approach SC from the “go statement considered harmful” perspective, runDetached is a go statement which ideally shouldn’t be used, or even exposed. Nevertheless, there are many obvious cases where lexically scoped asynchrony is insufficient, especially in interactive applications: tasks need to be associated with the lifetimes of documents, windows, and pages in browser-like apps, and potentially with service-like objects such as hardware integrations and network services.

As far as I can see, such tasks will need to be detached in the proposed design, unless event loop frameworks are updated to somehow associate a lexical scope with each “lifetimed object”, whatever that is. Retrofitting this kind of design seems difficult given the basic uncomposability of event loop systems.

A “detached group”, or the ability to associate a task with a “lifecycle” object, seems like a natural solution to these kinds of situations, and it isn’t obvious to me that “detached tasks” and “detached groups” both need to exist.
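A minimal sketch of what "associating tasks with a lifecycle object" could look like, assuming the `Task` API as it later shipped. `TaskBag`, `waitForCancellation` and `ownerDeinitCancelsWork` are hypothetical names for illustration, not something from the proposal, and the bag is not thread-safe.

```swift
/// Hypothetical "lifecycle" container: every task started through it is
/// cancelled when the container itself is deinitialized.
/// Not thread-safe; assumed to be used from a single owner.
final class TaskBag {
    private var cancelActions: [() -> Void] = []

    @discardableResult
    func add<T: Sendable>(_ operation: @escaping @Sendable () async -> T) -> Task<T, Never> {
        let task = Task { await operation() }
        cancelActions.append { task.cancel() }
        return task
    }

    deinit {
        cancelActions.forEach { $0() }
    }
}

/// Hypothetical helper: true if the sleep was interrupted by cancellation.
func waitForCancellation() async -> Bool {
    do {
        try await Task.sleep(nanoseconds: 60_000_000_000)
        return false
    } catch {
        return true
    }
}

/// Demo: the work is cancelled as a side effect of its owner going away.
func ownerDeinitCancelsWork() async -> Bool {
    var bag: TaskBag? = TaskBag()
    let work = bag!.add { await waitForCancellation() }
    bag = nil  // the last reference is dropped, so `deinit` cancels the task
    return await work.value
}
```

The tasks inside are still unstructured: nothing forces anyone to await them, which is exactly the trade-off being debated in this thread.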

Paulo_Faria · March 29, 2021, 9:28pm

About cancelation errors for IO operations:

About canceling a child task:

@ktoso can you, please, address the two topics I mentioned above? I know you must be very busy, and I don't want to be annoying. Just checking in case you meant to reply later and forgot or something. I appreciate your hard, amazing, work. I just would love to get some clarification on those issues. Thanks!

ktoso · March 30, 2021, 4:02am

That's a task handle.

Yeah, returning some struct TaskGroup.Spawned { var successfully: Bool {...}; let handle: Task.Handle<> } could be doable. I'd need some input from the core team about that though.

We had iterations of this API that included group.spawnWithHandle; we can add it again very easily.

I played around with it and quite like it, introduced TaskGroup.Spawned and gave a shoutout to Paulo: https://github.com/apple/swift-evolution/pull/1311#issuecomment-809891145

Lantua · March 30, 2021, 5:25am

If Spawned.successfully matches the nil-ness of Spawned.handle, maybe we can just return Task.Handle? and have nil mean failure.

ktoso · March 30, 2021, 5:44am

I don't think imbuing a lot of meanings onto nil-ness without spelling it out is a good pattern, it's almost as bad as magic values sprinkled around in the code.

Returning a Spawned has no more runtime cost than the optional, allows us to spell out if/while spawned.successfully, and also offers space for future API evolution. Locking the return type into just an optional does not.
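To compare the two shapes side by side, here is a sketch with stand-in types. `Handle`, `spawnReturningOptional` and `spawnReturningSpawned` are hypothetical, and the layout of `Spawned` (a single optional stored property) is an assumption for illustration; the draft `TaskGroup.Spawned` may be laid out differently.

```swift
/// Hypothetical stand-in for the draft `Task.Handle`; not the real type.
final class Handle {
    func cancel() {}
}

/// Option A: a bare optional, where nil means "the child was not added".
func spawnReturningOptional(groupCancelled: Bool) -> Handle? {
    groupCancelled ? nil : Handle()
}

/// Option B: a named wrapper, loosely following the `Spawned` idea above.
struct Spawned {
    let handle: Handle?
    /// Spells out what nil-ness means at the call site.
    var successfully: Bool { handle != nil }
}

func spawnReturningSpawned(groupCancelled: Bool) -> Spawned {
    Spawned(handle: groupCancelled ? nil : Handle())
}
```

Under this assumed layout the wrapper occupies the same storage as the optional it wraps (`MemoryLayout<Spawned>.size == MemoryLayout<Handle?>.size`), and call sites can read `if spawned.successfully { ... }` instead of testing for nil.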

ktoso · March 30, 2021, 6:12am

Paulo_Faria:

ktoso:

I totally think there is space for some “group-like thing that we can add to and cancel from the outside” but it would be something slightly different. Most notably those are useful for “bind the lifecycle of this task-something-container with the lifecycle of this class”, i.e. when the class gets deinit, the related task-collection should as well. I think there is space for such abstraction, but it is not a core primitive like the TaskGroup under discussion — which is specifically for structured concurrency; while the latter one is not.

Why do you say this is out of the scope for structured concurrency? Is it because the solution you imagine can be built on top of it or is it because it has to, conceptually, in your view? I personally think that "allowing cancelation from outside" does fit structured concurrency, conceptually.

I have spent some time with Philippe of Combine designing such an abstraction, and as I said here: we agree it is useful. It is not structured though. Adding tasks "from the side" without any bound on having to await them, as that is the purpose of such a thing, is not really the "task group / nursery"-style structured concurrency that this proposal is focused on.

And do keep in mind that you can cancel a group from the outside - by cancelling the task it runs in. So this is about adding tasks to some "container", not just cancelling from the outside.

What I'm saying is that this is not something small enough that we'll be able to just tack onto this in-flight proposal; it will likely be another new proposal, adding to the set of existing concurrency primitives. That's at least my personal read on it.

Lantua · March 30, 2021, 11:56am

But returning nil upon failure is exactly a Swift pattern... And if we extend any of its capabilities, it's more likely to be added to Task.Handle.

EDIT: correct links

Paulo_Faria · March 30, 2021, 12:46pm

Awesome! Thank you. About throwing cancelation errors when doing IO: I know that it sits in a level below structured concurrency, but it is nevertheless related to the concept of structured concurrency. As far as I understand, most iOS or macOS apps are IO bound, not CPU bound. The proposal only mentions mechanisms for cancelation checking and throwing for compute intensive operations with yield(), which only a few people will ever use. On the other hand, IO operations will be where most cancelation errors are thrown, and this is not mentioned at all in the proposal. I think it makes the whole concept of structured concurrency a bit more difficult to understand as, IMO, that's where it really shines.

This is all related to deadlines. It's not clear to me where deadlines fit. As far as I remember, deadline APIs are not mentioned in the proposal, but I think I've seen something like task.deadline or Task.deadline somewhere. Even if deadlines and IO are out of scope for structured concurrency, I think they should be properly mentioned in the proposal. I also think that the proposal should mention in which proposal deadlines and IO will be properly presented and discussed.

Lantua · March 30, 2021, 12:48pm

group as AsyncSequence doesn't feel exactly useful. There's a recent pitch about end-of-iteration behaviour of AsyncSequence. If we end up adopting it, we can do for ... in group only once, at the end. If we want to iterate through the group multiple times, we need to fall back to group.nextResult.

~~What if we use group.currentResults() for AsyncSequence instead?~~ See below.

On a more holistic look on spawn vs add, looking at the PR:

I think it's a shame that we remove back-pressure, it could be useful when scheduling a lot of subtasks. Though it's not necessary since we can achieve it manually:
```
for ... in 0..<10 {
  group.spawn(...)
}
for result in group {
  // handle `result`
  group.spawn(...)
}
```
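The same manual back-pressure pattern, as a self-contained sketch. It assumes the group API as it later shipped (`withTaskGroup`, `addTask`, `next()`) in place of the draft `spawn` and `nextResult`, and `processWithBackPressure` plus the squaring "work" are made up for the example.

```swift
/// Hypothetical example: squares every input while keeping at most
/// `maxInFlight` child tasks running at any one time.
/// Results arrive in completion order, not input order.
func processWithBackPressure(_ inputs: [Int], maxInFlight: Int) async -> [Int] {
    await withTaskGroup(of: Int.self, returning: [Int].self) { group in
        var pending = inputs.makeIterator()
        var results: [Int] = []

        // Seed the group with the first batch of children.
        for _ in 0..<max(maxInFlight, 1) {
            guard let input = pending.next() else { break }
            group.addTask { input * input }
        }

        // Each time one child finishes, start the next one (if any).
        while let result = await group.next() {
            results.append(result)
            if let input = pending.next() {
                group.addTask { input * input }
            }
        }
        return results
    }
}
```

For example, `await processWithBackPressure([1, 2, 3, 4, 5], maxInFlight: 2)` produces the five squares in some order while never having more than two children in flight.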
group.spawn and group.nextResult feel like they're at odds with each other. Task.Spawned identifies exactly one handle for one specific spawn, while nextResult anonymizes the origin. If, for example, I can't cancel one particular Spawned because it's already completed, I have no idea which result from group.nextResult is the corresponding one.

ktoso · March 30, 2021, 12:49pm

The mesh of proposals is interconnected enough already...

We are very interested in deadlines and have done some initial work towards them; however, they are out of scope in the near term. I don’t think the fact that future work includes deadlines changes much in terms of evaluation of this proposal. I can add a mention in Future Work, but it shouldn’t impact the review of the core primitives here IMHO.

Deadlines are difficult because Swift has no types to express time concepts in, so we’ll have to tackle that hairy topic along with deadlines. (No, using Foundation types is not an option because of layering).

— edit: one too many negation sneaked in „not out of scope” -> „out of scope”

Paulo_Faria · March 30, 2021, 12:53pm

Nice! I guess the main point I'm trying to make is that mentioning deadlines and IO will make the motivation for structured concurrency easier to understand. Especially when placed alongside yield and the whole IO bound vs. CPU bound dichotomy.