As I was wondering the same above, my two cents on this: I don't think adding more completion handlers for this would be ideal. After all, structured concurrency is supposed to help us reduce the completion handler hell we often had to endure when doing this in the past.
I'd like it if ThrowingTaskPool (however it may be named) were configurable in that regard somehow. Perhaps this could be done by passing a simple boolean to withTaskPool and if that's true a thrown error is not the error originally thrown by a Task, but instead some kind of PoolError.multipleErrors([any Error]) case?
I could also envision a way where withTaskPool does not throw at all if it is passed such a collection type by the caller, bit that would probably look weird (as with its signature it still would require a try, I guess).
Just an "error handler" (especially per addTask) is not going to be helpful here. An "error handler" in structured concurrency is the same as in normal structured code: do {} catch {} and we will not be inventing other ways just to "catch" an error.
We could however consider giving this group an error reducer, which would allow making the "what error gets thrown" something the end user has some say in. E.g. it could drop CancellationError as "not interesting" but combine all others in some aggregate error. We would not want to be doing this on behalf of the user though. I should say though that the usefulness of this is rather low in practice... and that it could be added in a future revision as well. Thanks to @John_McCall for the idea provided off-thread.
Technically, we could add an withThrowingTaskPool(of: ..., reduceError: (previousError: Error, latestError: Error) -> Error) and we could implement this efficiently if we had to. This could also be a future addition if we're not sure about it still.
We're still thinking if we'll end up with a new type or can wrangle the TaskGroup into enough flexibility to support these desired semantics... I'll provide an update once we had some time to implement the various approaches
This is reasonable, but I don’t know how to do this without making TaskGroup Sendable or splitting it into two types.
Yes, very much. Auto-cancellation affects unhandled errors only. If you don’t want your server taken down, handle the error. But if you can’t, the only logical pattern is to exit.
This was mentioned in this thread in passing, but to make it more explicit what I think may be a possible solution:
withTaskGroup { group in // add + consume
group.addTask { allowAddingTasksToGroup in // add only
allowAddingTasksToGroup.addTask { ... }
}
}
We could have two protocols for the "write" and "read" sides, and separate the addTask out into the prior. This seems like it could work, I'll look into this some more.
This would perfectly match what I had in mind for that use case. And since it is not the most pressing thing (there are other patterns you can build on top of the proposal as it already is) I would also be perfectly fine to see this in a future addition/revision
I don‘t think it’s correct to say it “affects unhandled errors only.” Auto-cancellation affects errors that are not handled within the Task. I think it’s perfectly reasonable to want to handle errors centrally, outside the individual tasks, and for the client to decide whether the failure of an individual task should cause the failure of all other tasks:
/*
* simple-rsync.swift
*/
enum Command {
case copy(from: Path, to: Path)
case rename(from: Path, to: Path)
case delete(at: Path)
}
struct PermissionsError : Error {
let path: Path
}
func processCommandList(_ commands: [Command]) async {
await withThrowingTaskPool { pool in
for command in commands {
// determine the operation to perform outside the task so we don’t have to copy the Command into the task
switch command {
case let .copy(from: source, to: dest): pool.addTask { try _copy(from: source, to: dest) }
case let .rename(from: source, to: dest): pool.addTask { try _rename(from: source, to: dest) }
case let .delete(at: path): pool.addTask { try _delete(at: path) }
}
}
} taskCompletionHandler: { task /* : Task<Void, PermissionsError> */ in
if case let .failure(error) = task.result {
// like rsync(1), log an error for this operation, but continue with the others
print("Insufficient permissions to access \(error.path)")
}
}
}
}
To my eye, the biggest flaw with the current proposal is that a developer can write a task that doesn’t handle its own errors, and the compiler says nothing. The consequences manifest only at runtime, in the form of cancelling other tasks. “My sever is resetting all of its connections, but my logs only show extremely infrequent errors” sounds like a very confusing SEV to debug!
Edit: Per @ktoso’s comment about Task reducers, I think I can phrase my rsync example in those terms:
func processCommandList(_ commands: [Command]) async {
do {
try await withThrowingTaskPool { pool in
for command in commands {
// determine the operation to perform outside the task so we don’t have to copy the Command into the task
switch command {
case let .copy(from: source, to: dest): pool.addTask { try _copy(from: source, to: dest) }
case let .rename(from: source, to: dest): pool.addTask { try _rename(from: source, to: dest) }
case let .delete(at: path): pool.addTask { try _delete(at: path) }
}
}
} reducingResults: { result /* : Result<Void, any Error> */ in
guard let PermissionsError.failure(error) = task.result else {
// bubble unknown errors up to outer do/catch
return result
}
// like rsync(1), log an error for failing operations, but continue with the others
print("Insufficient permissions to access \(error.path)")
return nil // don’t bubble this error up
}
}
} catch {
// something other than a PermissionsError happened
fatalError("Unhandled error: \(String(describing: error))")
}
}
Only if you opt into this by using the more verbose ThrowingTaskPool variant. If you use TaskPool, the compiler will require you to explicitly handle all errors.
But then you have to handle common errors redundantly within each task, further calling into question the wisdom of having separate TaskPool and ThrowingTaskPool types.
To be precise, my proposal was that we provide a primitive where the tasks all have to return Void and cannot throw, but where tasks have some ability to add new tasks and/or cancel the whole group. That should be a very high-performance primitive where we can instantly destroy tasks when they finish; all of the overhead above the minimum will be around maintaining the current set of tasks as tasks are created and destroyed, which is unavoidable.
We can then implement whatever result merging we want on top of that primitive by creating some temporary lock that tasks grab and merge into before they exit. It doesn't have to be limited to reducing errors. "Remember the first error" can be done with an atomic lazy reference, not a lock, but still doesn't need specific runtime support. Exactly what set of operations we provide in the library on top of the primitive is something I don't have strong opinions about.
I don't think you really want TaskPool to be Sendable either, though, since it should be accessible to child tasks but still shouldn't really be able to escape the hierarchy. It would be interesting to think about how we could make it so that child tasks can submit new work without letting the handle generally escape—@ktoso mentioned that addTask could re-pass a handle for adding tasks into the child task closure, which is a bit boilerplatey but could work.
I think I'd have to see an example that required that central handling, where the same handling could not be expressed in a per-task catch block in more or less identical amounts of code.
That ought not to be true: if you don't use a ThrowingTaskPool, then closures passed to addTask should not be throws and the compiler will force you to handle your errors.
It's not redundant handling if you only write the code once, which it is always possible to do:
I don't inherently object to adding a result reducer, but it adds meaningful complexity to and divergence from TaskGroup. And the default behaviour will be to have no error closure, which falls back to the current behaviour regardless. So I'm unconvinced by the extra cognitive burden of this idea.
Oh, one further downside to the error reducer approach is that you necessarily lose context. The error reducer occurs in a separate context to the throwing task, and so doesn't know anything about the task, such as what its inputs were. All you know is "a thing went wrong", but not what thing, and at that point we're really pretty late in the game anyway.
I should note here that my objection is mostly to using error reducers as the principal error-handling API. Whether they exist in the implementation is a separate question. I'm not even opposed to adding them in the future if there's a clear need for it, but right now I'm just not convinced that anyone will actually use these error filters unless we add a multi-error as the thrown error, which has a number of downsides explained in the pitch.
When designing APIs, defaults matter. This API is intended to act as the root of an app’s concurrency domain, accepting Task submissions with no dependencies between them. Yet the path of least resistance encourages writing fragile servers that introduce a mutual-success dependency between unrelated Tasks.
Clients have to go out of their way to refactor error handling into centralized functions, remembering to call those functions from every single Task submitted to the pool. This imposes an otherwise unnecessary hygiene burden on the entire team, and introduces serious complexity to the recovery path. Recovering from arbitrary cancellation in a parallel operation is far more complex than recovering from partial completion.
At the very least, as an adopter of this API I would expect implicit cancellation to be an opt-in feature, and I would only use it if I created a TaskPool for a specific job in which the Tasks I submit had an intrinsic dependency.
I think we're disagreeing on a fundamental point here. I don't think it's only about the default behaviour though, I think it's about a very core principle, which I will derive from the Zen of Python:
Errors should never pass silently.
Unless explicitly silenced.
I think this point is the crux of the disagreement:
My argument is that implicit cancellation is opt-in: you opt-in to it by choosing the variant that allows you to write a throws sub-task function.
More importantly though, I think adopting your proposed default implicitly forces us to default to silencing all unhandled errors, which IMO makes your code far harder to debug.
This is because if you have chosen the variant that allows you to throw from addTask, then you have forced Swift to decide what to do with that error. It has only two choices: it can either silence that error, or it can throw it out of the TaskPool.
I will point to the above principle and say that silencing the error is the wrong choice, so let's assume we want to eventually re throw it. Because the TaskPool cannot allow the child Tasks to outlive it, there are only two options again: either buffer the error until the TaskPool naturally exits, or bring the TaskPool to a halt as rapidly as possible. I argue that the former is implicitly silencing the error, because TaskPools will often be used with extremely long-lived queues of work, such that the error may never be seen in the lifetime of the program. It seems to me the only behaviour that doesn't have the effect of defaulting to "silence all errors" is to cancel the TaskGroup.
Now, you have proposed up-thread that an alternative is to have a specific handler function that is forced to process the errors. I think this fundamentally defeats the structured pattern that we're trying to enforce. Errors currently propagate upward, exactly as they do in straight-line code. This is good. It's easy to understand and easy to follow. Yes, in rare cases it will cause duplication of error handling code, just as it does today in straight-line code, but I simply do not think a clever solution is required here. Swift has well-tested, well-understood error handling patterns today. This problem is entirely solvable using the existing error handling patterns that Swift provides.
I don’t think that’s true. The throwing task pool can still rethrow any errors which aren’t explicitly silenced. It’s just that throwing an error into the enclosing task pool shouldn’t cancel other tasks if the owner of the task pool wants to silence the error.
The owner of the Task Pool has a way to spell that idea, as they have always had: use a try...catch block.
To be clear, I'm not saying that the "error reducer" idea is definitely bad. What I'm saying is that it's a solution in search of a problem, and if we introduce it we can never take it back. Given that TaskGroups already have a way to handle this problem (try...catch), and that that solution is idiomatic in Swift, I think we should prefer to lean on that until we have existence proofs of a problem.
The underlying implementation should absolutely be built in a way that allows us to add an "error reducer" after the fact, but I think putting it in right now adds syntactic burden that very few users will ever actually want.
How does the owner of a task pool use a try/catch block to avoid implicit cancellation of tasks whose bodies they don’t control?
If your answer is “don’t let anyone else submit directly to the pool”, my response is that the same argument could be used to make the pool owner responsible for the cancellation behavior by wrapping all incoming tasks in a do { try incomingTask() } catch { pool.cancel() } block. And that since “explicit is better than implicit”, this is the less surprising approach.
Yes, that's my answer, and yes, you can make that counter argument. If we only want to add TaskPool and not ThrowingTaskPool, then I can trivially build most of the behaviour I want:
There are three reasons why I think we should elevate this pattern to API. Firstly, because it's not uncommon to want the implicit cancellation behaviour. TaskGroup already has it if you throw out of the TaskGroup body, and other examples of structured concurrency also default to this pattern (e.g. Python's Trio). We have prior art that this is a good idea.
Secondly, we should elevate it to API because if you want to persist the errors then life is much harder than just the above function. You need to introduce a wholly new type that doesn't just encapsulate the existing TaskPool (quite difficult to do when the TaskPool is passed to the body inout) but also that contains a thread-safe implementation of error tracking. It seems a bit merciless to force users to do this for TaskPool when they are not asked to do it for TaskGroup.
And finally we should elevate it to API because it doesn't change a default. withThrowingTaskPool is syntactically heavier than withTaskPool. The API encourages you not to use the throwing version. You have to decide you want the version that defers the choice for unhandled errors up to Swift. No-one is making anyone use ThrowingTaskPool if it isn't what they want.
I don’t disagree with having implicit cancellation as API; I disagree that the current proposal makes the difference sufficiently clear.
There are many functions in the standard library where clients have to use the rethrowing version in case the closure they pass might throw. But the API as proposed conflates the throwingness of individual tasks (a type-level property) with whether throwing from one task cancels other tasks (a runtime behavior). The client is forced to either squelch errors entirely or accept the implicit cancellation. I am asking to replace this false dichotomy with an option. I phrased it as an error filter up-thread, but I would be satisfied with withTaskPool(implicitlyCancelling: Bool = false).
I understand, but I disagree that this is a false dichotomy. A thrown error that is caught by the TaskPool is either rethrown or leaked. We can have (and are having) a discussion about putting in a filter, which is an idea that has merit, though we disagree on the spelling. But I do not think we can just turn off implicit cancellation: if we do, the error is essentially squelched silently, for all the reasons I outlined above.