SE-0304 (3rd review): Structured Concurrency

Ben_Cohen · May 24, 2021, 10:52pm

Hello, Swift community.

The third review of SE-0304: Structured Concurrency begins now and runs through June 3rd, 2021.

Following on from the second review, the proposal authors have made a number of changes. The changes made after the second review can be found at the end of the proposal, and the full diff can be found here.

Note that this feature is closely tied to the async let proposal, the first review of which is happening concurrently in order to better discuss common aspects such as naming.

This review is part of the large concurrency feature, which we are reviewing in several parts. While we've tried to make it independent of other concurrency proposals that have not yet been reviewed, it may have some dependencies that we've failed to eliminate. Please do your best to review it on its own merits, while still understanding its relationship to the larger feature. You may also want to raise interactions with previous already-accepted proposals – that might lead to opening up a separate thread of further discussion to keep the review focused.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to the review manager. If you do email me directly, please put "SE-0302" somewhere in the subject line.

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer in your review:

What is your evaluation of the proposal?
Is the problem being addressed significant enough to warrant a change to Swift?
Does this proposal fit well with the feel and direction of Swift?
If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available at:

https://github.com/apple/swift-evolution/blob/master/process.md

As always, thank you for contributing to Swift.

Ben Cohen
Review Manager

Alejandro_Martinez · May 25, 2021, 8:00am

I was already very positive about the previous iterations and these last changes made the whole API more consistent in terms of naming. The proposal is also clearer now in terms of documenting which different kind of tasks exists (child, unstructured, detached) which I appreciate a lot.

Overall big +1

nikitamounier · May 25, 2021, 8:10am

I always thought from the get-go that a discardable initialiser with a closure would be the most natural way to create unstructured tasks. I also appreciate the platform-independent naming of the task priorities. This is definitely the best form of the Task API yet.

Big +1

s-k · May 25, 2021, 4:43pm

Overall, I like this proposal very much.

I have a question: In the section Cancellation handlers, it says:

If the task has already been cancelled at the point withTaskCancellationHandler is called, the cancellation handler is invoked immediately, before the operation block is executed.

Wouldn't this mean that the following code from the proposal would complete the download and not cancel it if the task had already been cancelled at the start of the function?

func download(url: URL) async throws -> Data? {
  var urlSessionTask: URLSessionTask?

  return try withTaskCancellationHandler {
    return try await withUnsafeThrowingContinuation { continuation in
      urlSessionTask = URLSession.shared.dataTask(with: url) { data, _, error in
        if let error = error {
          // Ideally translate NSURLErrorCancelled to CancellationError here
          continuation.resume(throwing: error)
        } else {
          continuation.resume(returning: data)
        }
      }
      urlSessionTask?.resume()
    }
  }, onCancel: {
    urlSessionTask?.cancel() // runs immediately when cancelled
  }
}

s-k · May 25, 2021, 5:12pm

And I have a more serious concern:

func gather(first m: Int, of work: [Work]) async throws -> [WorkResult] { 
  assert(m <= work.count) 
  
  return withTaskGroup(of: WorkResult.self) { group in 
    for w in work { 
      group.async { await w.doIt() } // spawn child tasks to perform the work
    }  
    
    var results: [WorkResult] = []
    while results.count <= m { 
      switch try await group.nextResult() { 
      case nil:             return results
      case .success(let r): results.append(r)
      case .failure(let e): print("Ignore error: \(e)")
      }
    }
  }
}

In this code snippet from the proposal, the group.cancelAll() (in addition to the return results) has been mistakenly left out. During the history of this proposal, this exact mistake has been repeatedly made (at least three times) by the proposal authors as well as other members of this forum. Forgetting the group.cancelAll() would have no effect on the behavior of the code, only on its performance. Thus, it is a very hard to find bug.

I am a firm believer that good APIs must make it hard to make mistakes. Therefore, I think it should be an explicit choice for the programmer if the remaining tasks shall be cancelled or not.

(Obviously, in this case, the missing return statement would generate a compiler error. However, it is possible and maybe even likely that this would result in the programmer only adding the return statement to satisfy the compiler and still forgetting the group.cancelAll().)

One possible solution

Changing the return type of operation to an enum so that the return statement would either read return .cancellingRemainingTasks(result) or return .finishingRemainingTasks(result) would be ideal in my opinion, because it would virtually eliminate mistakes as seen above. Maybe there are less verbose options that achieve the same goal.

ktoso · May 25, 2021, 10:15pm

I don't understand your concern here.

It is exactly phrased this way to support immediately cancelling the work:

task is cancelled
task attempts to start the session task, does so
onCancel runs immediately, cancelling the urlSessionTask

You could write this in other ways too: prepare the url task, check for cancellation (and/or store the cancellation handler), only resume the task if not cancelled etc.

Better cancellation aware URLSession APIs may appear in the future as well, so that'd handle that transparently perhaps(?).

Jon_Shier · May 25, 2021, 11:02pm

Could you be more explicit here? I think the confusion is around the order of operations within the nested cancellation handler and continuation. A simple reading line-by-line would seem to have the cancellation handler called before the task is even created, leaving the continuation to create and resume the task after cancellation.

Lantua · May 26, 2021, 3:41am

I just want to confirm. What would happen in task group when a spawned task finishes after the group is cancelled, do they still get added to the group's results and are accessible via next?

It seems to be the case given that we allow tasks to ignore cancellation by design, but I'm not too sure.

ktoso · May 26, 2021, 3:42am

Cancellation that happens after a task completed is not going to impact the task or its result in any way.

s-k · May 26, 2021, 6:05am

Thank you for your response. My concern is the following:

To quote from the proposal:

If the task has already been cancelled at the point withTaskCancellationHandler is called, the cancellation handler is invoked immediately, before the operation block is executed.

To me, this reads like the cancellation handler is called before the operation block if the task is already cancelled. In the NSURLSession example, this would mean that urlSessionTask?.cancel() is called before urlSessionTask is set, making it a no-op. After that, the operation block would create and resume the data task so that it would complete without being cancelled.

Is this correct or have I misunderstood the proposal?

ktoso · May 26, 2021, 6:10am

I see, thanks — I’ll double check but suspect the proposal wording may be wrong here. I have a radar on myself to ensure cancellations always will do the right thing in such situations.

Might be that we retained old wording in the proposal

Lantua · May 26, 2021, 11:31am

Sorry if it's not clear, I mean the case where the cancellation happens when the task group and some spawn tasks are still running, i.e.

Task Group (A) spawns a child task (B),
Task A is cancelled (and so is Task B),
Task B completes normally, ignoring the cancellation,
[here] What is returned if I do group.next() right here?

I think someone could mistakenly think that group.next() would return nil instead of Task B's result (and vice versa) here.

ktoso · May 26, 2021, 11:51am

It will return the result; Nowhere is it implied that a result added to the group is magically dropped.

Tasks always return a result; they may throw or return some “other” result if they notice cancellation but that’s about it.

benlings · May 26, 2021, 9:01pm

I’m glad the Apple-specific names for priorities have been retained as aliases - I’ve found they help clarify when to use a particular level.

Is there any difference between yield() and sleep(0)? This is a question that comes up for other platforms and would be good to clarify.

What’s the behaviour of sleep() if the task is cancelled? Does it wait for the entire duration, or return early? If the former, would it be possible to have an equivalent to that will return/throw early on cancellation? (I know this was mentioned in earlier threads but couldn’t find a response)

JJJ · May 26, 2021, 9:47pm

That's a great point! If you are ignoring cancellation, you definitely do NOT want sleep() to end early, and if you want to react as fast as possible on cancellation you definitely DO want sleep() to end early.

benrimmington · June 1, 2021, 2:55pm

Task Groups

It should be possible to have a single TaskGroup<Success, Failure: Error> type for throwing and non-throwing subtasks — if the group doesn't conform directly to AsyncSequence, but instead has different values properties for Failure == Error and Failure == Never.

However, to avoid ambiguity you'd still need different top-level with{Throwing}TaskGroup functions, so I'm not sure if this is worth changing. Unless you instead have with... static methods for Failure == Error and Failure == Never.

UPDATE: Should the asyncUnlessCancelled methods have @discardableResults?

Task Priorities

For the platform-independent priorities:

.high and .low are fine,
.default could be renamed to .medium (to make the relative ordering more obvious),
.background might be removed, if it can be implied (e.g. in a daemon background process, or when using the BackgroundTasks framework).

The other aliases will impair auto-completion. Can the Dispatch and Foundation frameworks extend TaskPriority with initializers taking QoS parameters?

Voluntary Suspension

The sleep function accepts a plain integer as nanoseconds to sleep for which mirrors known top-level functions performing the same action in the synchronous world.

Darwin.sleep(_:) has a seconds: CUnsignedInt parameter. Which top-level functions accept a plain integer as nanoseconds?

Task.sleep(_:) would be improved with an argument label, and possibly a different parameter type:

extension Task where Success == Never, Failure == Never {
  public static func sleep(forTimeInterval seconds: Double) async
}

This would match the existing Foundation.Thread API, and be compatible with the Foundation.Date APIs. If Double (as an alias for TimeInterval) has 15 significant digits, then it should have nanosecond precision for a range of several days?

Because use-sites look quite explicit in the way they have to prefix this call with an await keyword (await Task.sleep(nanos)), we prefer to use the well-known sleep word rather than introduce new words for this functionality.

Task.yield() could also be renamed. The documentation comments for both APIs are "suspends the current task", so perhaps await Task.suspend() and await Task.suspend(forTimeInterval: seconds).

Can only the current task be suspended, or should it be possible to suspend (and resume) via a task handle?

Douglas_Gregor · June 1, 2021, 6:28pm

benrimmington:

Darwin.sleep(_:) has a seconds: CUnsignedInt parameter. Which top-level functions accept a plain integer as nanoseconds ?

Task.sleep(_:) would be improved with an argument label, and possibly a different parameter type:
extension Task where Success == Never, Failure == Never {
  public static func sleep(forTimeInterval seconds: Double) async
}
This would match the existing Foundation.Thread API, and be compatible with the Foundation.Date APIs. If Double (as an alias for TimeInterval ) has 15 significant digits, then it should have nanosecond precision for a range of several days?

That sleep has an unlabeled parameter specified in nanoseconds is definitely a problem. I'm not sure we want to push the idea of specifying seconds as a Double down into the standard library at all, and doing so for this one API seems unfortunate. Perhaps the easy way out is to use an argument label:

extension Task where Success == Never, Failure == Never {
  public static func sleep(nanoseconds: UInt64) async
}

The API is intentionally ugly, and leaves open space for us to create a nicer sleep(_:) with a proper time type at some point in the future.

Yeah. The AsyncStream proposal and modify accessors consistently use "yield" to mean "produce a value". I agree that suspend is a better term here.

Only the current task. You cannot ask someone else's task to suspend, because they might not be at a potential suspension point.

Doug

benlings · June 1, 2021, 7:30pm

Would combining both into a single API be an option?

extension Task where Success == Never, Failure == Never {
  public static func suspend(nanoseconds: UInt64 = 0) async
}

Or is there a difference between yield() and sleep(0)?

ktoso · June 1, 2021, 9:30pm

Douglas_Gregor:

extension Task where Success == Never, Failure == Never {
  public static func sleep(nanoseconds: UInt64) async
}
The API is intentionally ugly, and leaves open space for us to create a nicer sleep(_:) with a proper time type at some point in the future.

That sounds good to me, yeah.

To be honest I, personally, disagree with the yield used in the AsyncStream proposal (in which the type itself is also IMHO misnamed).

We can name the operation here suspend if we want to, but not because the stream source type uses yield -- as that type actually misuses it IMHO (I don't think a yield should be allowed to fail, which is the core idea of the proposed there API).

benlings · June 1, 2021, 9:41pm

I had a dig into the Task and Dispatch source and it looks like yield() is approximately dispatch_apply_f and sleep is dispatch_after_f. dispatch_after_f calls dispatch_apply_f if when == DISPATCH_TIME_NOW, so sleep(0) would be just a slightly less efficient yield().