SE-0381: discardResults for TaskGroups

Hello, Swift community!

The review of SE-0381 "discardResults for TaskGroups" begins now and runs through December 29th, 2022.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to the review manager by email. When contacting the review manager directly, please keep the proposal link at the top of the message and put "SE-0381" in the subject line.

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer in your review:

  • What is your evaluation of the proposal?
  • Is the problem being addressed significant enough to warrant a change to Swift?
  • Does this proposal fit well with the feel and direction of Swift?
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available at:

https://github.com/apple/swift-evolution/blob/main/process.md

Thank you,

Doug Gregor
Review Manager

18 Likes

I think this design is a more elegant solution than the original alternative of a parallel TaskPool/ThrowingTaskPool; the motivating problem is definitely significant enough to warrant such a change and I'm glad that it'll be so much easier to get the desired behavior with this option.

My one question is whether discardResults on ThrowingTaskGroup needs to have the exact same name or could (given the additional behavior outlined in the proposal) have some pithy clarifying thing to signify the cancellation behavior keeping only the first error—something in the spirit of but more elegant than discardResultsKeepFirstError, for instance.

5 Likes

Whacky alternative idea: for Void result-type groups, keep a count of finished child tasks instead of storing them, and still have the group be able to send Void values as each child finishes. Would that be practical?

4 Likes

Thanks for the idea, yes that was actually our initial implementation attempt, but for better or worse, it would not work. The original implementation idea was to do that, and we could support next() by "making up" the Void() values as they're indistinguishable from "the real Void() values", all we need is to keep the counts. The problem with this however is that... we need to keep the counts in case someone MAYBE calls next() someday.

This is a problem because this shape of task group is intended to "potentially run forever".

The TaskGroup already uses a single atomic "status" 64 bit integer to orchestrate all of its synchronization and counts. The status looks like this:

[ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 64 bits ~~~~~~~~~~~~~~~~~~~~~~~~~~~~]
[1:cancelled][1:waiting][31:ready ~~~~~~~~~~~][31:pending ~~~~~~~~~]

Now... if we're going to be running "forever", like in those server scenarios where we need this, we'd need to keep those counts up to date, in case someone were to call next(). However... 31bits of ready bits are not enough to ignore the potential for overflow (!). (On the other hand, usually software like this often assumes saturating 64bit is impossible, but 31 sadly is possible). In other words, this shape of API cannot support (without some "overflow counter" where I'd prefer not to go), such "forever but someone might call next()".

Thus, the group of this shape, must ended up dropping the ability to next(); which actually was a goal to begin with, we actively don't want to spin and wait on next() in these group usecases.

The API shape where we say that next() always returns nil repurposes the ready bits into a larger pending count; and the ready cannot ever overflow, since we never count them.

8 Likes

Big +1 from me, I think this is a clear step forward and leaves the door open for more advanced features (like error handling) as specific needs become clearer.

Regarding parameter naming:

While I agree it'd be nice to make that behavior explicit in the signature, I think embedding this additional information in the parameter name is probably overkill, and also makes extending the error handling a little more difficult later down the road. Completely bikeshedding what the API could look like for more advanced error handling, there's a pretty easy path forward from:

withThrowingTaskGroup(of: Void.self, discardResults: true) { ... }

to something like:

withThrowingTaskGroup(of: Void.self, discardResults: true, errorHandling: .keepFirst) { ... }
withThrowingTaskGroup(of: Void.self, discardResults: true, errorHandling: .aggregateAll) { ... }
withThrowingTaskGroup(of: Void.self, discardResults: true, errorHandling: .aggregateMatching { /* some `filter`-esque closure */ }) { ... }

I just worry that if we start naming things related to error handling without knowing what the requirements are there, we might end up with a pretty awkward API surface when it eventually gets fleshed out. My opinion might be different if there didn't seem to be consensus that some additional error handling features will be required, but everyone seems to agree that it's needed (just not on exactly how yet).

I think everything in the meantime can be sufficiently addressed with documentation.

3 Likes

I prefer the design where this was a different type.

With this design, a user can iterate the task group, and see no child tasks, even though there are, in fact, child tasks being executed. That just feels clunky. If this kind of group does not support iterating the child tasks, a better API would just not make that operation available.

So I prefer the design with a separate type. Failing that, I would prefer if attempting to iterate a task group which discards its results triggers a runtime error, clearly distinguishing "this group has no child tasks" from "you're performing an invalid operation".

9 Likes

Crash on next()

A runtime error on using next() is an idea, but sadly it'd have to be a fatal error, which seems a bit harsh... but perhaps is indeed an option; I could see the point here that using next() is misusing and not understanding the option, so perhaps we should aggressively prevent it, rather than silence the issue.

Why not the new type, after all?

The new type would be somewhat cleaner, since there is no operation that is effectively disabled anymore, but in discussing this the problem of burning this type name for a very very similar type to a group was raised. So the primary reason to avoid the new type, is mostly driven by the need to avoid burning the type name on something so similar, while we might want to have other kinds of task pools in the future, which may or may not fit these semantics (e.g. maybe they'd be unstructured etc, we don't know yet).

Hopefully this gives some background to the revision towards an option, from a new type which it was in the initial pitch.

5 Likes

Nice proposal! :+1:

Though also think that types are better and makes design clearer, but thx for explaining.

Would it make sense to restrict this option to task groups of type Void? That way the compiler would be able to warn if a non-@discardableResult function call is performed as the only statement in the task closure.

1 Like

I haven't followed previous discussions so take this with a grain of salt but I do feel like even tho the scenario is worth designing a solution for, this extra complication added to TaskGroups feels a bit too much, for something that is already complex to understand.

One thing I'm not clear on is if the "auto await at the end of the group" behaviour is still in this mode.

I don't think it is, which means that unless you have a awaiting loop (like in the examples) to keep the taskgroup alive, the group will finish immediately after adding some child tasks?

1 Like

Thanks for the explanation why keeping a count is undesirable.

With the caveat that I find the proposal reasonable and think it would be fine as-is, I dislike some aspects of this, similar to other posts in the thread:

  1. The prospect of next() being available when it will only ever return nil. In fact, make that the whole AsyncSequence conformance being available but unusable.
  2. The generic parameter being effectively unused. If we can't return values in this case, it would be better to require Void or Never for the generic parameter, I think. addTask<T> and addTaskUnlessCancelled<T> variants could be added for a TaskGroup<Void/Never> extension too so we can take child tasks of different result types, given we will discard the results anyway.
  3. As a follow-up on 2, I think Never for the generic parameter may be (very marginally) better at communicating intent here than the new argument. I'm on the fence about this, to be honest.

Never wouldn’t make sense because that would imply that all the child tasks run forever or kill the entire process on exit, and i think the expectation here is that the child tasks would either return normally or throw an error.

i agree with all these sentiments and would personally have voted for a separate type because this is for solving a different problem that what we usually do with TaskGroup which is collecting results out-of-order and combining them.

on the other hand i would really hate to be @lukasa and @ktoso having rewritten the whole proposal because a few of the reviewers in the other thread wanted it to be the same type. i didn’t really pay attention to the pitch thread because i have the same task pool problem right now and i spent the week trying to implement my own solution with unstructured tasks and an atomic TaskCounter and it took me a few days to give up on unstructured tasks and come back to this proposal. so it is not the end of the world if this ends up becoming another axis of TaskGroup instead of its own type. i would be happy with any structured way to accomplish this.

Special case overloads of addTask[…] would be included when the group’s result is Never, taking child tasks of any result type. This no-values-passed-along behavior is a special case and associating it with Never would highlight that. Just a thought.

everyone is different but i think if i am returning anything other than Void (including Void?) i really don't want that to click with a TaskGroup.

This behaviour is still on.

4 Likes

+1

Are there any production use cases where we would want to keep the results for void?

1 Like

I think it's reasonable to restrict the generic type parameter to Void.

6 Likes

How about adding an overload where Value == Never where the user can return any value they want?

Confirming two things:

  • Offering this to ChildTaskResult == Void sounds good to me; it's not like you can do anything with the results of child functions after all anyway.
  • This is still a TaskGroup, so it does all the things expected of it as a structured concurrency primitive. So yes, it always waits for all children to complete.
    • this is safely implementable, unlike a next() method -- so the group.waitForAll() still exists and works as expected.

And one note about a minor syntax error in the proposal that I noticed while implementing and that we'll address in the proposal text:

public func withThrowingTaskGroup<ChildTaskResult, GroupResult>(
  of childTaskResultType: ChildTaskResult.Type,
  returning returnType: GroupResult.Type = GroupResult.self,
  discardResults: Bool = false,
  body: (inout ThrowingTaskGroup<ChildTaskResult, Error>) async throws -> GroupResult
) async rethrows -> GroupResult {

this has to be throws since there never is a next() that causes the throw to happen, so it must be:

public func withThrowingTaskGroup<ChildTaskResult, GroupResult>(
  of childTaskResultType: ChildTaskResult.Type,
  returning returnType: GroupResult.Type = GroupResult.self,
  discardResults: Bool = false,
  body: (inout ThrowingTaskGroup<ChildTaskResult, Error>) async throws -> GroupResult
) async throws -> GroupResult {

cannot tie this throw-ness to "was the boolean true?".

Good catch. This also means we can no longer just add discardResults as a new parameter to the already existing function, but need to introduce a completely new function. It will also only make sense to call the new function with discardResults: true. With that, I'm wondering if it would make more sense to give this function just another name and remove the discardResults parameter as propose during the pitching phase by @hisekaldma. Otherwise it will be a bit confusing to see two very similar looking methods with quite different behaviour but only if the one of the parameters is true.

1 Like