SE-0304 (4th review): Structured Concurrency

Hello, Swift community.

The fourth review of SE-0304: Structured Concurrency begins now and runs through July 14th, 2021.

Following on from the third review, the proposal authors have made a number of changes. The full history of changes made after each review can be found at the end of the proposal, and the full diff can be found here.

Below is a summary of the changes from the prior review. The core team would like to narrow the focus of this fourth review on these areas:

  • renamed Task.sleep(_:) to Task.sleep(nanoseconds:) . This makes it clear that the wait is in nanoseconds, and leaves API space open for a sleep(_:) based on a better duration type in the future.
  • made Task.sleep(nanoseconds:) throwing; it will throw CancellationError if the sleeping task was cancelled.
  • renamed TaskGroup.async and TaskGroup.asyncUnlessCancelled to TaskGroup.addTask and TaskGroup.addTaskUnlessCancelled . The fundamental behavior here is that we're adding a task to the group. add by itself does not suffice, because we aren't adding a value (accessible via next() ), we are adding a task whose value will be accessible via next() . It also parallels the use of Task { ... } to create top-level tasks.
  • renamed TaskPriority.default to TaskPriority.medium , because nil passed in to a TaskPriority? parameter is effectively the default for most APIs.
  • added TaskGroup.waitForAll and ThrowingTaskGroup.waitForAll .
  • renamed Task.yield() to Task.suspend() , which more accurately represents what this operation action does, and leaves the name "yield" for future work on generators.

This review is part of the large concurrency feature, which we are reviewing in several parts. While we've tried to make it independent of other concurrency proposals that have not yet been reviewed, it may have some dependencies that we've failed to eliminate. Please do your best to review it on its own merits, while still understanding its relationship to the larger feature. You may also want to raise interactions with previous already-accepted proposals – that might lead to opening up a separate thread of further discussion to keep the review focused.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to the review manager. If you do email me directly, please put "SE-0302" somewhere in the subject line.

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer in your review:

  • What is your evaluation of the proposal?
  • Is the problem being addressed significant enough to warrant a change to Swift?
  • Does this proposal fit well with the feel and direction of Swift?
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available at:

https://github.com/apple/swift-evolution/blob/master/process.md

As always, thank you for contributing to Swift.

Ben Cohen
Review Manager

21 Likes

Looks like this needs to be updated.

Sigh. Thanks @Jon_Shier. Edited to run through July 14.

Should Task.suspend() also throw CancellationError, if the suspended task is cancelled?

2 Likes

The renamings are excellent. They have the exactly the self-explanatory obviousness one wants at the doorstep of a complex API.

In particular, group.addTask is a great improvement. I hope that it maintains harmony with wherever async let lands, since the latter is more or less sugar for the former. (It’s not clear to me what precisely “harmony” means in practice; this is a vague hope.)

I wondered about this too.

It seems clear to me that suspend should not throw: IIUC, suspend is basically equivalent to await asyncFuncThatDoesNothing(). Since await doesn’t throw CancellationError, why should suspend()?

But if suspend doesn’t throw, then why does sleep?

What is the recovery path for a CancellationError thrown from sleep? Is that error just there to remind us that we have the opportunity to handle a cancellation if one has occurred? If so, why doesn’t every suspension point — suspend, and await for that matter — carry that same reminder?

1 Like

Generally speaking, +1 for overall renaming and finetune.

sleep/suspend should both throws CancellationError, since sleep or suspend an already cancelled task make NO sense.

1 Like

A potential benefit of throwing sleep is that it can potentially resume and throw before the sleep times, allowing the task to terminate and free up resources (e.g. strong references to classes/actors).

However, the text as written is ambiguous on whether or not this can happen, which seems dangerous. Early resumption should either be explicitly documented or explicitly disallowed.

(If it’s allowed, that also raises the question of whether there should be a non-throwing version in case people “need” that for some reason. I suggest the answer is no; detached tasks already offer an escape hatch from cancellation propagation.)

2 Likes

sleep() needs some way to indicate whether it completed successfully (ie. waited for the specified time) or not. It seems reasonable that an unsuccessful result is indicated by a throw.

A suspend that checks for cancellation can be done with try await Task.sleep(nanoseconds: 0).

3 Likes

The implementation definitely resumes early on cancellation, so I agree this should be explicit in the proposal. The documentation for sleeps has this text:

   /// Suspends the current task for _at least_ the given duration
   /// in nanoseconds, unless the task is cancelled. If the task is cancelled,
   /// throws \c CancellationError.

+1 I love the changes and renamings! :+1:

For sleep() I would prefer two variants; one that always sleeps for at least the duration, and one that stops sleeping if cancelled.

The thing is that checking for cancellation is voluntary. So you are free to create a task that works deterministically no matter whether it's cancelled or not. But with the proposed API that kind of task then cannot sleep without some sort of loop and time checking. (Edit: Will sleep() throw immediately if already cancelled?)

Also, I think throwing from sleep() is odd, but very convenient.

The name Task.suspend is good because it is easy to understand that it is a suspension point.
However, we can only read that it stops there.
In fact, after a short time, the scheduler will return control and resume.

How about the name suspendAndResumeSoon?

I like the renamings. Thank you for the continuous refinements.

As a minor QoL improvement, would it be possible to add internal names to the closure arguments of with(Throwing)?TaskGroup and withUnsafeCurrentTask? Due to [CodeComplete] Default parameter names of completed closure to internal names by ahoppen · Pull Request #36551 · apple/swift · GitHub closure parameters can be autocompleted.

  func withTaskGroup<ChildTaskResult: Sendable, GroupResult>(
    of childTaskResult: ChildTaskResult.Type,
    returning returnType: GroupResult.Type = GroupResult.self,
-   body: (inout TaskGroup<ChildTaskResult>) async -> GroupResult
+   body: (_ group: inout TaskGroup<ChildTaskResult>) async -> GroupResult
  ) async -> GroupResult { ... }

  func withUnsafeCurrentTask<T>(
-   body: (UnsafeCurrentTask?) throws -> T
+   body: (_ task: UnsafeCurrentTask?) throws -> T
  ) rethrows -> T
4 Likes

In this async world aren’t all suspension points assumed resumption points?

2 Likes

Yes, this is basically the rationale behind Java’s Thread.sleep() throwing InterruptedException. My experience with that method is what prompts my skepticism of Task.sleep here.

In practice, at least in my experience, Java’s Thread.sleep() usually leads to empty catch blocks and/or carelessly rethrown runtime exceptions. There’s rarely a sensible action a caller can take in response to InterruptedException; it’s one of those notorious exception situations where every action is potentially wrong, and the error handling thus ends up deeply embedding hidden assumptions about distant callers.

Now it’s possible that that the parallel construct in Swift won’t be so problematic. I can see several reasons why:

  • Java’s contract for “interrupted” is vague to the point of uselessness. It can mean either “cancel the entire task at hand,” or “stop sleeping early and then continue as normal,” or “stop the current subtask and resume by doing something else.” Behold in horror the tangled mess of advice on Stack Overflow for handling InterruptedException. IIRC, in Effective Java, Josh Bloch recommends never using thread interrupts in Java at all!

    Swift’s proposed contract for cancellation here, while still open-ended, is much clearer than Java’s. Perhaps that is sufficient to fix all the ills above, and ensure that CancellationError always has a reasonable recover path: code can optionally clean things up and/or halt the task early, but simply continuing is always OK (I think?).

  • When “simply continue” is the correct choice, the syntactic burden of try? Task.sleep(…) in Swift is lower than try { Thread.sleep(…) } catch(InterruptedException e) { } in Java.

  • Swift’s errors mercifully are not exceptions, and do not automatically propagate arbitrarily far up the stack.

  • The timing of Java’s sleep is notoriously imprecise, and code that requires precise timing always needs to adjust for actual time elapsed regardless of whether there was an exception. InterruptedException thus is not useful for making timing guarantees. Perhaps Swift’s runtime can make strong guarantees about timing.

Still, despite all these hopeful thoughts it seems sensible to me that we should do a survey of uses of Task.sleep from people who are trying out this API. Is explicitly surfacing the CancellationError actually leading to useful early cleanup in practice? Or does most code end up ignoring it?

I can imagine it might be more sensible to make the always-safe “do nothing and continue” response the default by making sleep not throw, and letting tasks that do have the chance for early cleanup explicitly check for cancellation.

IIUC, it can also be done with:

Task.suspend()
guard !Task.isCancelled else {
  …
}

(which embeds fewer assumptions about the meaning of sleep, and perhaps sheds some scheduling overhead).

Given that, and per the above, I do wonder whether the “task that handles cancellation” pattern shouldn’t be the same for sleep:

Task.sleep(…)
guard !Task.isCancelled else {
  …
}

Early resumption is the whole point of making sleep(nanoseconds:) throwing, so I'll see about making that abundantly clear in the documentation.

Regarding throwing suspend(), I don't think it makes sense: with sleep, we have early resumption when the task gets cancelled, and I think it's important to distinguish between "at least that much time has passed" and "we finished early because the task was cancelled." suspend() isn't like that: it's just giving other code an opportunity to run on the thread. There's nothing more that needs to be reported.

I don't think we should add the variant that always sleeps for at least the duration, for a few reasons:

  1. I doubt it's something that's going to be very common, and it's easy to misuse to create unexpected delays in your async tasks. Please provide use cases if you think it's going to be a useful
  2. It means introducing two APIs with similar names. Then we'd have to go back and do it all again when we get a decent duration type, because "nanoseconds expressed as UInt64" is very much not a good long-term API.
  3. You can trivially build the "always sleeps for the duration" by creating another task and not cancelling it, e.g.,
func sleepWithoutCancellation(nanoseconds duration: UInt64) {
  let task = Task.detached {
    try! await Task.sleep(nanoseconds: duration)
  }
  await task.value
}

Personally, I'm motivated by the ability to create an API that lets you perform a potentially-long-running task, but have an alternative result if the task doesn't complete in some specified duration. That's something one cannot build at all unless you have a sleep that returns early on cancellation and tells you that's what it did. For example:

let image = try await withLongRunningTask(timeout: oneSecond) {
  try await downloadImage(url: URL)
} onTimeout: {
  timeoutPlaceholderImage
}

As an aside, while thinking about the implementation of this, I realized that we're missing UnsafeCurrentTask.cancel(). We should probably add that.

So, I think we have to report whether the sleep operation was completed normally vs. having returned early because the task was cancelled. Throwing requires one to acknowledge the cancellation. We could instead return a Bool to indicate that the sleep completed successfully, which is slightly more lightweight for the cases where one doesn't care how the sleep ended:

_ = Task.sleep(nanoseconds: duration)

If we like the Bool result and we really, really think it's okay to silently return from sleep before the time has elapsed, we could use @discardableResult.

This is even more subtle than a @discardableResult Bool. It's also somewhat racey, because the sleep might have completed and then the task got cancelled afterword. I don't know if this race matters in practice or not, but it seems poor form not to have sleep report how it ended.

This is more of an implementation detail, but sure.

Doug

5 Likes

How about having sleep return a discardable Bool that indicates whether sleep completed normally (was not cancelled)?

I am not sure if it is a good idea to make methods throwing just to report cancellation.

1 Like

Hmm, that is an interesting point. I’m not sure the race does matter: under the cancellation contract proposed, cancellation simply means “optionally clean up early if possible,” and it is the handler’s responsibility to ensure that the task exits cleanly regardless of when the cancellation request arrived. Knowing that it arrived during vs. just after the sleep thus shouldn’t alter the cancellation handling itself.

However, is there an esoteric case where you need to follow a different code path strictly if and only if the minimum sleep time hasn’t elapsed, i.e. where the clock time really matters? Hard to imagine, but if you need it, you really need it, and that’s hard to build safely without the throwing form. So yes, a compelling argument.

I think I’d prefer the throwing form, tbh. The syntactic weight of prefixing _ = isn’t much less than prefixing try?, and a named error is going to make it clearer to clients exactly what it is they’re being asked to handle.

I’d say the bool-returning form with @discardableResult would be preferable if and only if doing nothing on cancellation during sleep turns out to be the behavior in practice, say, ≥75% of the time. It does seem to me it would be sensible to look at what calls to Task.sleep look like in actual usage.

On the whole, though, Doug has warmed me up to the idea that the proposed design is indeed the best one. It’s possible that my concerns above are an instance of “lips once burned on hot broth now blow on cold broth” based on my Java days. So thanks, Doug, as always.

1 Like

From the proposal:

Conventionally, most functions that check for cancellation report it by throwing CancellationError() ; accordingly, they must be throwing functions, and calls to them must be decorated with some form of try . As a result, cancellation introduces no additional control-flow paths within asynchronous functions; you can always look at a function and see the places where cancellation can occur.

So, sleep throwing at least fits with this.

Throwing:

// want to throw
try await sleep(nanoseconds: 100)

// don’t want to throw
try? await sleep(nanoseconds: 100)

Non-throwing, bool result:

// want to throw
try await sleep(nanoseconds: 100) ?? Task.checkCancellation()

// don’t want to throw
_ = await sleep(nanoseconds: 100)

No cancellation result/error:

// want to throw
await sleep(nanoseconds: 100)
try Task.checkCancellation()

// don’t want to throw
await sleep(nanoseconds: 100)

My feeling is that the throwing option is only one that isn’t significantly worse for either option. As long as the documentation states that the only error that can be thrown by sleep is CancellationError, I don’t see needing try? for that case as being too much of a burden.

3 Likes

Ah.. that's true :laughing:

1 Like

I also find await Task.suspend() slightly confusing. To me it seems like I'd be responsible for calling Task.resume() at some point when I'm ready to do more work, as opposed to the system automatically resuming the task whenever it sees fit.

Task.yield() felt intuitive from driving where I'm going to stop, wait, and then start going again when it's my turn. Task.suspend() feels more like parking.

It's wordier, but what about something like Task.offerSuspension()?

Task.relinquish() also seems like a better fit to me, but it's not as discoverable.

3 Likes