What’s the difference between using a TaskGroup vs an array (or dictionary, etc.) of tasks?
I understand a TaskGroup allows retrieving the tasks' results “continuously” (I get the result of the first task that finishes, then the next, etc.), as opposed to an array of tasks, where I cannot guess which task will finish first and I have to decide the order in which I’ll await the tasks.
But if I need the results of all the tasks anyway, are there any other hidden benefits of a task group that should make me prefer it over simply storing the tasks in an array and waiting for them in order?
Correct, the defining feature of a group is collecting results in completion order. You cannot write this as efficiently using other techniques (plain Task {} plus a stream, for example, will be heavier both memory-wise and scheduling-wise than what a group does).
If you didn't want to use a group, you'd have to do one of two things:
await the work one by one, no parallelism:
for work in works {
    let t = await work.work()
    things.append(t)
}
That is meh since it is not parallel at all. So you might write this instead:
some (unbounded - meh) parallelism:
for work in works {
    Task {
        let t = await work.work()
        await self.append(t) // extra hop back to wherever `things` lives
    }
}

func append(_ t: T) async {
    self.things.append(t)
}
which is meh for a number of reasons:
unbounded parallelism is meh in general: this just throws all the tasks at the scheduler without much control over how many are in flight at any point in time (whereas implementing such limiting is simple in a group)
you had to use unstructured tasks (Task{}), which are heavier than tasks created by a task group (group.addTask{} creates a child task, which is very efficient)
you're missing out on structured concurrency entirely: propagating task-locals is more expensive this way, and there is no guarantee whatsoever that all tasks complete before you "proceed", whereas tasks in a group keep the group waiting until they all complete
you cannot easily collect the results on the task that kicked off the work... so you'll either pay for additional hops as shown above, or you'll have to invent your own way to message into an async stream from there...
So... use a group instead for those patterns, it handles them very well
await withTaskGroup(of: T.self) { group in
    for work in works {
        group.addTask { await work.work() } // efficient child task
    }
    for await t in group { // efficiently gathering in completion order
        // back in calling task
        things.append(t)
    }
} // always guaranteed to have drained all the tasks
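On the point that limiting the number of in-flight tasks is simple in a group, here is a rough sketch of one common way to do it. The helper name `boundedMap` and its `limit` parameter are made up for illustration and are not a standard library API: the idea is to start at most `limit` child tasks, then add one new child each time `group.next()` hands back a result.

```swift
// Hypothetical helper (not part of the standard library): runs `operation`
// over `inputs` with at most `limit` child tasks in flight at any time.
func boundedMap<Input: Sendable, Output: Sendable>(
    _ inputs: [Input],
    limit: Int,
    operation: @escaping @Sendable (Input) async -> Output
) async -> [Output] {
    precondition(limit > 0, "limit must be positive")
    return await withTaskGroup(of: Output.self) { group in
        var results: [Output] = []
        results.reserveCapacity(inputs.count)
        var iterator = inputs.makeIterator()

        // Start up to `limit` child tasks.
        for _ in 0..<limit {
            guard let input = iterator.next() else { break }
            group.addTask { await operation(input) }
        }

        // Each time a child finishes, collect its result and start the next one.
        while let output = await group.next() {
            results.append(output)
            if let input = iterator.next() {
                group.addTask { await operation(input) }
            }
        }
        return results // completion order, not input order
    }
}
```

Calling it would look something like `let doubled = await boundedMap(numbers, limit: 4) { $0 * 2 }`, where `numbers` is whatever array of inputs you have. The results still come back in completion order, so sort or index them if the original order matters.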
If you're going to iterate at the end using map anyway, then you'd be better off just doing it upfront and using an array with the correct count, instead of the equivalent of a dictionary, which then has to be sorted.
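One possible reading of that suggestion, assuming the goal is to get results back in the original input order: tag each child task with its index and write its result straight into a preallocated array slot, rather than collecting into a dictionary keyed by index and sorting afterwards. The helper name `orderedMap` is invented for this sketch.

```swift
// Hypothetical helper: runs `operation` concurrently in a task group and
// returns the results in the same order as `inputs`.
func orderedMap<Input: Sendable, Output: Sendable>(
    _ inputs: [Input],
    operation: @escaping @Sendable (Input) async -> Output
) async -> [Output] {
    await withTaskGroup(of: (Int, Output).self) { group in
        for (index, input) in inputs.enumerated() {
            group.addTask {
                let output = await operation(input)
                return (index, output) // remember which slot this result belongs to
            }
        }

        // One slot per input, allocated up front with the correct count.
        var slots = [Output?](repeating: nil, count: inputs.count)
        for await (index, output) in group {
            slots[index] = output // arrives in completion order, lands in input order
        }

        // Every child task returns a value, so every slot is filled here.
        return slots.map { $0! }
    }
}
```

The child tasks still finish in whatever order they finish; only the placement into `slots` restores the input order, so no sort is needed at the end.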