Revisiting when to use Task.detached

kennyc · June 9, 2022, 1:07am

Throughout the evolution of Swift Concurrency, I have interpreted the documentation and previous WWDC videos as discouraging the use of Task.detached and instead encouraging the use of Structured Concurrency.

However, in the WWDC 2022 session "Visualize and optimize Swift concurrency", Task.detached is presented as the solution for running tasks in parallel (in this case, for compressing an arbitrary number of files).

At time 18:59, there is the following snippet:

func compressAllFiles() {
  for file in files {
    Task.detached {
      let compressedData = await self.compressor.compressFile(url: file.url)
      await save(compressedData, to: file.url)
    }
  }
}

Screenshot of slide at 18:59

Are the usage recommendations of Task.detached evolving?

In this example, how does the concurrency runtime handle an arbitrarily large number of files? With OperationQueue, you can cap the number of Operations in flight. Does something similar happen with Tasks, or will each Task immediately start executing?

If they all start executing, then how should you prevent having an arbitrarily large number of files all trying to be compressed at once? And how does this influence thread starvation if hundreds of calls to Task.detached are made at once?

Previous answers to this question involved making batch calls to a TaskGroup which is why I was surprised to see the use of Task.detached in this example.
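For reference, the batching approach mentioned above can be sketched roughly as follows. This is only a minimal illustration, not code from the session: `boundedMap` and `maxInFlight` are hypothetical names, and the idea is simply to seed a task group with a fixed window of child tasks and add one more each time a child finishes.

```swift
// Hypothetical helper (not from the WWDC session): runs `work` over `items`
// with at most `maxInFlight` child tasks alive at any moment.
func boundedMap<Input: Sendable, Output: Sendable>(
    _ items: [Input],
    maxInFlight: Int,
    work: @escaping @Sendable (Input) async -> Output
) async -> [Output] {
    precondition(maxInFlight > 0, "maxInFlight must be positive")
    return await withTaskGroup(of: (Int, Output).self) { group in
        var results = [Output?](repeating: nil, count: items.count)
        var nextIndex = 0

        // Adds one child task for the next unprocessed item, if any remain.
        func addNext() {
            guard nextIndex < items.count else { return }
            let index = nextIndex
            let item = items[index]
            nextIndex += 1
            group.addTask {
                let output = await work(item)
                return (index, output)
            }
        }

        // Seed the group with the first window of child tasks.
        for _ in 0..<min(maxInFlight, items.count) { addNext() }

        // Each time a child finishes, store its result and start one more.
        while let (finishedIndex, output) = await group.next() {
            results[finishedIndex] = output
            addNext()
        }
        return results.compactMap { $0 }
    }
}
```

Called as `await boundedMap(files, maxInFlight: 4) { await compress($0) }` (with `compress` standing in for whatever per-file work you have), this keeps the results in input order while never having more than four child tasks outstanding.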

QuinceyMorris · June 9, 2022, 4:42am

The details are, well, implementation details and subject to change in the future, but the current understanding is that there's no "arbitrarily large" explosion of work. Currently, Swift uses the main thread/dispatch queue for things running on the main actor, and everything else (everything that runs with "concurrent" execution semantics) runs on the Swift-specific global dispatch queue — the "global concurrent executor" — that can be realized as only a single thread on each physical CPU.

So, if 8 CPUs are available, a maximum of 8 tasks [actually: task fragments] can execute at one time. If a CPU is preempted for something else (such as non-Swift-concurrency GCD apps), the number of concurrent Swift tasks is temporarily reduced. Also, Swift tasks can theoretically block a CPU, and so the number of CPUs that are actually making progress with Swift tasks can be reduced even further, if poorly-written apps are running.

The main downside to having detached tasks instead of grouped child tasks is there's no API to manipulate them as a group. For example, detached tasks would have to be canceled individually, and there's no way of iterating through results as tasks complete.
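To make that difference concrete, here is a small sketch. The function names `cancelDetached` and `cancelGrouped` are made up for illustration; the sleeping closure just stands in for real work. With detached tasks you have to keep every handle and cancel each one yourself, whereas a group gives you `cancelAll()` and lets you iterate results as children finish.

```swift
// Detached tasks: the caller must hold on to every handle.
func cancelDetached(count: Int) async -> Int {
    let handles: [Task<Bool, Never>] = (0..<count).map { _ in
        Task.detached { () -> Bool in
            // Stand-in for real work; sleep throws early once cancelled.
            try? await Task.sleep(nanoseconds: 5_000_000_000)
            return Task.isCancelled
        }
    }
    // No group-level API: cancel each task individually.
    for handle in handles { handle.cancel() }

    var cancelled = 0
    for handle in handles {
        if await handle.value { cancelled += 1 }
    }
    return cancelled
}

// Child tasks in a group: one call cancels all of them.
func cancelGrouped(count: Int) async -> Int {
    await withTaskGroup(of: Bool.self) { group in
        for _ in 0..<count {
            group.addTask {
                try? await Task.sleep(nanoseconds: 5_000_000_000)
                return Task.isCancelled
            }
        }
        group.cancelAll()

        // Results arrive in completion order as each child finishes.
        var cancelled = 0
        for await wasCancelled in group {
            if wasCancelled { cancelled += 1 }
        }
        return cancelled
    }
}
```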

The main upside to creating detached tasks, I guess, is that they are created with significantly less ceremony (nested braces and boilerplate loops) than grouped child tasks.

It may be true that detached tasks are being emphasized a bit more strongly this year than last year, but if there's any intentional policy around that, I'll leave it for others to jump in about that.

kennyc · June 9, 2022, 5:15am

Is it more accurate to say that you can only make "forward progress" on 8 tasks at once? Isn't the concurrency system allowed to suspend any Task and attempt to make forward progress with another Task?

If there are more tasks than CPU cores, the tasks currently running don't have to run to completion before an "off CPU" task gets a chance to execute.

In the session sample, each task is compressing a file. If there are 500 files to compress, then that for loop is going to get called 500 times and 500 detached tasks are going to be created.

Isn't the concurrency system going to be jumping around between all 500 of those tasks so that each one gets a little bit of CPU time to perform some work?

That for loop is effectively unbounded, which means the number of calls to Task.detached is unbounded. This seems potentially troublesome since the compression tasks include file I/O. Just a few minutes later in that presentation there's a warning to be careful of making file I/O blocking calls within a Task that could be suspended.

I've been assuming that this use-case is a textbook example of when not to use Task.detached because of the unbounded nature of it, but perhaps I haven't been following closely enough.

QuinceyMorris · June 9, 2022, 5:34am

No, that's not exactly — to my understanding — how it works.

A CPU is allowed to jump between threads, to give all eligible threads a chance to make progress in the "normal" multitasking way, but IIRC an executor will only hop between jobs (task fragments) at suspension boundaries — at await-ed statements, loosely.

This means that when all the tasks are good citizens — they have a reasonable quantity of suspension points — they will all be eligible to make progress, 8 (or fewer) at a time, yes.

The point is that Swift concurrency doesn't create 500 threads that constantly compete with each other for 8 CPUs. It tries to do only the amount of work that the CPUs can actually do in parallel.

In GCD generally, you can "break" the system via thread explosion, which ends up swamping the CPUs in thread housekeeping and getting nothing useful done.

In Swift concurrency, you can only really "break" the system if your tasks (incorrectly) use blocking synchronous APIs, or if your tasks (dubiously but perhaps not strictly incorrectly) don't have suspension points to yield at. In the former case, you essentially waste a CPU for a while; in the latter case, you favor one task's progress at the expense of all the others.
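As a rough sketch of the second case, a long CPU-bound loop can give other jobs a turn by calling `Task.yield()` every so often. The `checksum(of:)` function below is a hypothetical example, and yielding once per chunk is an arbitrary choice rather than a recommendation.

```swift
// Hypothetical CPU-bound work that yields between chunks so that other
// jobs get a chance to run on the same cooperative thread.
func checksum(of chunks: [[UInt8]]) async -> UInt64 {
    var sum: UInt64 = 0
    for chunk in chunks {
        // Purely synchronous work: no suspension can happen inside here.
        for byte in chunk {
            sum &+= UInt64(byte)
        }
        // Explicit suspension point between chunks.
        await Task.yield()
    }
    return sum
}
```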

Edit: Also, I don't know for sure, but I suspect that the 8 global concurrent executor threads are permanently bound to their CPUs. To move the work around, the jobs can move from thread to thread, but the threads themselves don't hop between CPUs. IOW, as far as Swift concurrency is concerned, there is no thread switching at all. Switching happens at the lighter-weight job level instead, I think.

kennyc · June 9, 2022, 5:48am

And what if they aren't "good citizens"? For example, in the example code it might be making a synchronous call into a system library with no obvious use of await or yield within the task's block.

If a higher-priority Task is created, won't lower priority Tasks be suspended to allow the higher-priority Task a chance to run? If so, then how is the next Task to resume chosen if there are more than 'n' tasks that need scheduling?

That part I understand. If I create 500 detached Tasks, there will be no more than 'n' Threads created where 'n' is the number of CPU cores (or close to it). But I assumed that the concurrency system might bounce around between those 500 tasks in parallel rather than running them to completion before starting another one.

ktoso · June 9, 2022, 5:49am

Our recommendation is not really changing -- reach for detached tasks as a last resort.

The talk's focus in this section was more about showing off the Task moving between the main actor, the global pool, and some other actor, and how we can spot such "actor hogging" issues in Instruments. For the sake of simplicity, the easiest way to show off this behavior was to use a detached task. You see this when Harjas moves from Task to Task.detached. It's not wrong, and in the context of this simplified app it wasn't doing much else, so hopefully it explained that Tasks move between actors and the main pool unless they're bound to a specific actor (like the Task {} was before the change).

Another way to solve this without reaching for detached tasks is to use a TaskGroup and submit the work from the loop inside the group. That way the tasks are child tasks, inherit priority and task-locals, and execute concurrently (and OFF the enclosing actor).

That would be a slightly bigger refactor that could derail the explanation of the crucial piece the talk is trying to explain at that moment: one would also have to make the compressAllFiles function async and bubble up the async-ness a little bit through the app in order to await the group's result (and have child tasks). So I can totally see why the presenters opted to keep it simple and focused on the current point being explained.
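Roughly, a task-group version could look like the sketch below. This is not the session's code: `FileItem` and `Compressor` are simplified stand-ins, the "compression" is a placeholder, and the function returns the results instead of saving them so the example stays self-contained.

```swift
import Foundation

// Hypothetical stand-ins for the types used in the session's sample app.
struct FileItem: Sendable {
    let url: URL
}

struct Compressor: Sendable {
    // Placeholder "compression": just returns the file name's bytes.
    func compressFile(url: URL) async -> Data {
        Data(url.lastPathComponent.utf8)
    }
}

// The function is now async, so callers have to await it.
func compressAllFiles(
    _ files: [FileItem],
    using compressor: Compressor
) async -> [URL: Data] {
    await withTaskGroup(of: (URL, Data).self) { group in
        for file in files {
            // Each child task inherits priority and task-locals and runs
            // concurrently with its siblings.
            group.addTask {
                let data = await compressor.compressFile(url: file.url)
                return (file.url, data)
            }
        }
        // Collect results as the child tasks complete.
        var compressed: [URL: Data] = [:]
        for await (url, data) in group {
            compressed[url] = data
        }
        return compressed
    }
}
```

Note that this still adds one child task per file; the group does not cap how many are created.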

The recommendation to use structured concurrency wherever you can still stands.

Hope this helps clarify a bit

// edit 1: typo

kennyc · June 9, 2022, 5:53am

Ok, good to know and thanks for the clarification.

I was hesitant to make this original post because I had a hunch that perhaps the use of Task.detached in this session was more of a "quick hack" to prove another point rather than some recommended technique.

But having spent a reasonable amount of time trying to best understand how to move thumbnail generating code from GCD to Swift Concurrency, this particular example hit close to home and surprised me.

QuinceyMorris · June 9, 2022, 3:11pm

Keep in mind that Tasks are not the unit of scheduling, jobs are. Jobs are the fragments of task code in between suspension points.
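A loose illustration of what that means for a single task, with a hypothetical `thumbnail(for:)` function and string manipulation standing in for real work. Each `await` is a potential suspension point, and the code between two of them runs as one uninterrupted job as far as the Swift runtime is concerned.

```swift
// Hypothetical function, annotated with where the job boundaries fall.
func thumbnail(for name: String) async -> String {
    // Job 1: runs from the start of the function to the first await.
    let loaded = "loaded(\(name))"

    await Task.yield()   // potential suspension point

    // Job 2: may resume on a different thread than job 1.
    let scaled = "scaled(\(loaded))"

    await Task.yield()   // potential suspension point

    // Job 3: runs to the end of the function.
    return "encoded(\(scaled))"
}
```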

If a task makes a synchronous blocking call, that call will block its thread, and the thread won't be available for other jobs. Other threads can be scheduled on that CPU to run non-Task code, but not other concurrent Swift Task code.

The runtime doesn't suspend tasks mid-job. Tasks can suspend only at suspension points, which are the boundaries that define jobs. If a higher priority job comes along, it will of course get preferential treatment when a job ends on the next available CPU's concurrency thread, so a single "bad citizen" won't interfere too much.

Yeah, the amount of progress being made will bounce around between Tasks, but there's no bouncing at the lower level of the jobs. (Well, not for Swift concurrency reasons. Of course, Swift concurrency threads are competing for CPU time with other threads that are not managed by the Swift runtime.)

kennyc · June 9, 2022, 3:52pm

Great, thanks for the insight. Much appreciated.

harjas · June 9, 2022, 4:17pm

Hi! Sorry about the confusion with the WWDC session and when to use Task.detached. As Konrad mentioned, we chose to use Task and Task.detached to demonstrate the performance issue, show it in Instruments, and be able to quickly resolve it. Task groups would have been a better solution; however, we opted to go with this approach as it allowed us to limit the scope of the refactor and make it easier to follow along.

Again sorry for the confusion, hope Konrad and I were able to make things clear!

hooman · June 9, 2022, 5:16pm

I think, for future viewers of the video, it would be better to insert a clip to clarify.