Help migrating GCD patterns to async/await

I'm in the position of working on a macOS application that can take advantage of async/await but I'm struggling to understand how to migrate some GCD design patterns that I currently use over to async/await.

After years working with GCD, most of the concurrency in my applications is implemented in "Services" that accept "Requests", asynchronously process them and then return "Responses" by means of a completion handler, Combine publisher, delegate or Notification Center.

In almost every Service I've built, I've found that I need to manually throttle the number of concurrent requests being processed, otherwise performance will degrade and either GCD or the system will get overwhelmed. However, I don't see how to manually manage the number of concurrent tasks using async/await.

Consider the use case of rendering thumbnails, which incidentally was a use case promoted in the videos. In a macOS application, I might be rendering thumbnails of images on disk into a grid view. It's very likely I'll be rendering dozens or more as the user selects different folders to view. (Heck, on an XDR display, it's possible to show well over a hundred thumbnails at once.)

Even though I want to render 100 thumbnails asynchronously, I don't want them all being "worked on" at once. In previous applications, I've used one of three strategies:

  • Use OperationQueue with maxConcurrentOperationCount set to something.
  • Use a serial queue that feeds into a concurrent queue with a semaphore to manage the number of inflight requests running on the concurrent queue.
  • Use a DispatchSource that pulls requests out of an array and processes them in "chunks".

While completely arbitrary, the number of concurrent requests I tend to have inflight is generally the value of ProcessInfo.processInfo.activeProcessorCount.
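For reference, the first strategy might look something like the sketch below. ThumbnailRequest and ThumbnailService are hypothetical names made up for illustration, and perform() just returns a placeholder string; only OperationQueue, maxConcurrentOperationCount and activeProcessorCount are real API.

```swift
import Foundation

// Hypothetical stand-in for a "Request"; only perform() is assumed.
struct ThumbnailRequest {
    let index: Int
    func perform() -> String { "thumbnail-\(index)" }
}

// Hypothetical service that throttles work with an OperationQueue.
final class ThumbnailService {
    private let queue: OperationQueue = {
        let queue = OperationQueue()
        // Cap in-flight operations at the number of active cores.
        queue.maxConcurrentOperationCount = ProcessInfo.processInfo.activeProcessorCount
        return queue
    }()

    // The completion handler is called on one of the queue's worker threads.
    func submit(_ request: ThumbnailRequest, completion: @escaping (String) -> Void) {
        queue.addOperation {
            completion(request.perform())
        }
    }
}
```

Requests beyond the cap simply wait in the operation queue until a slot frees up.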

Two Queue Example:

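public func submitRequest(_ request: Request) {
  // The serial queue admits requests one at a time, in submission order.
  serialQueue.async {
    // Block the serial queue (not the caller) until an inflight slot frees up.
    self.inflightSemaphore.wait()

    self.concurrentQueue.async {
      let response = request.perform()

      // Release the slot so the serial queue can admit the next request.
      self.inflightSemaphore.signal()
      // Publish response...
    }
  }
}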

Using async/await, how can I achieve similar levels of concurrency? Continuing to use semaphores within async functions seems like an anti-pattern for async/await. What other patterns or features are available?

Even though I want to render 100 thumbnails asynchronously, I don't
want them all being "worked on" at once.

Or do you? Keep in mind that async tasks are much lighter weight than threads and that the thread pool that runs these tasks is cooperative (with regard to that last point, if you haven’t watched WWDC 2021 Session 10254 Swift concurrency: Behind the scenes, you should). So, it might be OK to use a lot of tasks, depending on your exact requirements. Specifically:

  • Is your rendering entirely CPU bound? If not, using tasks isn’t wise, because you’re holding down a cooperative thread waiting on I/O.

    Note: Keep in mind that I/O could include “talking to the GPU” (-:

  • Is rendering each thumbnail reasonably fast? If not, you may want to insert a yield to avoid hogging the cooperative thread for too long.
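To illustrate the second point, here is a minimal sketch of inserting a yield into long-running CPU-bound work. renderRows(count:) is a hypothetical placeholder for real rendering, and the interval of 64 rows is arbitrary; Task.yield() is the real standard library call.

```swift
// Hypothetical CPU-bound work standing in for thumbnail rendering.
func renderRows(count: Int) async -> Int {
    var checksum = 0
    for row in 0..<count {
        checksum &+= row &* row   // placeholder for the real per-row work
        // Every 64 rows (an arbitrary interval), let other tasks run
        // on this cooperative thread before continuing.
        if row % 64 == 63 {
            await Task.yield()
        }
    }
    return checksum
}
```

How often to yield is a tuning decision; too often adds overhead, too rarely defeats the purpose.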

If the current task model isn’t suitable then, yeah, things get more complex. Long term I suspect that the answer will lie in a custom executor (see Support custom executors in Swift concurrency). In the absence of that, I think you’ll end up needing to use one of the existing primitives for your core concurrency.

One thing to keep in mind here is that you can still model this as an async function for the benefit of your clients. withCheckedContinuation is your friend (-: (see SE-0300 Continuations for interfacing async tasks with synchronous code).
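As a rough sketch of that idea, the two-queue throttle from the original question can stay as the core primitive while clients see an async function. ThrottledService, submit(_:completion:) and perform(_:) are hypothetical names, and the queue labels are made up; withCheckedContinuation is the real API from SE-0300.

```swift
import Foundation

// Hypothetical service; the throttle mirrors the two-queue example in the question.
final class ThrottledService {
    private let serialQueue = DispatchQueue(label: "service.intake")
    private let concurrentQueue = DispatchQueue(label: "service.work", attributes: .concurrent)
    private let inflightSemaphore = DispatchSemaphore(
        value: ProcessInfo.processInfo.activeProcessorCount)

    // Existing completion-handler API, throttled with GCD.
    func submit<T>(_ work: @escaping () -> T, completion: @escaping (T) -> Void) {
        serialQueue.async {
            self.inflightSemaphore.wait()
            self.concurrentQueue.async {
                let result = work()
                self.inflightSemaphore.signal()
                completion(result)
            }
        }
    }

    // Async wrapper for clients. The semaphore wait happens on a Dispatch
    // thread, so no cooperative thread is blocked while the request is queued.
    func perform<T>(_ work: @escaping () -> T) async -> T {
        await withCheckedContinuation { continuation in
            submit(work) { result in
                // Resume exactly once, when the throttled work completes.
                continuation.resume(returning: result)
            }
        }
    }
}
```

A caller would then write something like `let value = await service.perform { 6 * 7 }`, with no knowledge of the queues behind it.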

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

It's certainly been my experience in the past that manual throttling has been needed to achieve a good user experience. Many of my use cases involve disk I/O and the GPU, either directly or indirectly.

There are also a lot of use cases that involve calling into libraries or frameworks where cancellation is not really an option. For example, decoding an image using ImageIO can't really be cancelled. If you fire off 100 requests to generate thumbnails, then you can't really cancel any of them until at least ImageIO returns.

In previous WWDCs, sessions on GCD seemed to emphasize modelling concurrency in your app using a small number of serial queues centered around "services" or "modules". Each serial queue effectively limited the number of concurrent tasks for each service.

I can think of a lot of use cases where I want "more than one" task to be running in parallel but not an "unlimited" number.

With async/await, it seems like it's gone the other way. I can definitely appreciate that async/await is different than GCD and might be able to support a lot more concurrent tasks, but in use cases where you really do want to throttle stuff, is there a recommended pattern with what's currently available?

in use cases where you really do want to throttle stuff, is there a
recommended pattern with what's currently available?

No. While you can throttle work, you have to implement that throttling yourself using existing primitives. Once you’ve done that you can integrate it with Swift concurrency via withCheckedContinuation and, in the future, hopefully, custom executors, but that’s not really the complete solution you’re looking for.

And, as we’ve discussed before (IIRC on DevForums), this throttling is not easy to do even in the Dispatch world. Dispatch has good support for parallel computation (dispatch_apply) and good support for serialisation per-subsystem, but it doesn’t have a great option for running mixed computation and I/O tasks where the execution time of those tasks can vary wildly from task to task. Ultimately this problem boils down to the queue ‘width’ [1]. If you choose a value too small, you run the risk of bubbles if all the tasks block waiting for I/O at the same time. If you choose a value too large, you run the risk of thrashing the physical cores between a large set of computation-bound worker threads.

In short, Swift concurrency doesn’t make this problem any better )-: or any worse (-:
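For the pure-computation case mentioned above, dispatch_apply is spelled DispatchQueue.concurrentPerform in Swift. The sketch below is only an illustration: sumOfSquares(below:slices:) is a hypothetical function, and the workload is a toy stand-in for real CPU-bound work.

```swift
import Foundation

// Hypothetical CPU-bound job: sum the squares of 0..<limit, split into slices.
func sumOfSquares(below limit: Int, slices: Int) -> Int {
    var partialSums = [Int](repeating: 0, count: slices)
    let sliceSize = (limit + slices - 1) / slices

    partialSums.withUnsafeMutableBufferPointer { buffer in
        let out = buffer   // local copy of the pointer for the worker closure
        // concurrentPerform blocks until every iteration has finished, and
        // Dispatch decides how many iterations run in parallel.
        DispatchQueue.concurrentPerform(iterations: slices) { slice in
            let lower = min(slice * sliceSize, limit)
            let upper = min(lower + sliceSize, limit)
            // Each iteration writes only its own slot, so no locking is needed.
            out[slice] = (lower..<upper).reduce(0) { $0 + $1 * $1 }
        }
    }
    return partialSums.reduce(0, +)
}
```

Because the call blocks until all iterations finish, it suits computation-bound batches and not the mixed computation and I/O case described above.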

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

[1] The number of serial queues you allocate to this, or the number of tasks that you let loose on to a concurrent queue.

As always, thanks for your insight and input, much appreciated.

I'll continue to play around with the new async/await stuff to see how I can incorporate it. I really like the syntax and how it can help clean up some code at the call site.

However, the vast majority of work that I want to perform asynchronously in my apps involves things like:

  • Batch copying or moving files.
  • Batch exporting images and videos.
  • Batch rendering of thumbnails.

I also do a lot of asynchronous work through XPC services. It will take time for me to figure out where I can transition over to the new stuff.

Batch copying or moving files

For the core copying function you may be able to make progress by wrapping Dispatch I/O [1]. For anything related to metadata, async/await is not going to be a good match because the underlying file system operations are synchronous all the way down to the kernel (and even within the kernel) and thus your calls may end up blocking a Swift concurrency cooperative thread.

I like to think about this in the same way I think about networking. Imagine your file system operations are actually targeting a network volume, and thus any operation could block for dozens of seconds. How would your code react to that?

Batch exporting images and videos

Batch rendering of thumbnails

These are likely to involve system APIs, and so whether async/await will be useful depends on whether the system APIs are themselves async.

I also do a lot of asynchronous work through XPC services.

Now that sounds like a good candidate for async/await. I’ve not played with it yet myself, but the fact that XPC is very request/reply oriented, with the reply being delivered by a completion handler, aligns very well with the async/await model.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

[1] One thing to note here is that read and write file system calls are also fully synchronous, just like their metadata cousins (things like aio, see its man page, just synthesise asynchrony using threads). However, it might still be worth experimenting with Dispatch I/O because then Dispatch is taking care of the work balance.

This makes me long for io_uring for Darwin. Asynchronous system calls are a thing really :)

Just curious, macOS and iOS don't have asynchronous I/O system calls?

I had originally thought the same thing, but when I make use of XPC services, I tend to want to send progress reports back to the host application and support some method of remote cancellation. To achieve that, I generally use the originating XPC connection in a bidirectional way or I create a back channel. The reply handler is really only used to keep the connection's priority boost alive.

For the other use cases, I'm willing to accept that some tasks might be bottlenecked by system functions that are not asynchronous. Exporting a batch of images or videos, for example, will cause contention on various system resources. However, from the user's perspective, it can be beneficial to have operations running in parallel even if the total time for completion is longer.

Sliding a bit off track for this forum, I suspect...

Going back to the original question, I came across this Gist of how a parallel_map might be implemented:

There is some really interesting stuff in here that I had not previously been aware of. Namely the use of TaskGroup.add and TaskGroup.next. This doesn't completely solve the original problem, but it's a lot closer than I got on my own.
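For anyone following along, here is my own minimal sketch of that general pattern, not the Gist's code. It keeps at most `width` child tasks in the group and adds a new one each time a result comes back. boundedMap is a hypothetical name, and note that in the released Swift 5.5 API the method is spelled addTask.

```swift
// Hypothetical helper: maps `transform` over `items` with at most `width`
// child tasks in flight, returning results in the original order.
func boundedMap<Input: Sendable, Output: Sendable>(
    _ items: [Input],
    width: Int,
    transform: @escaping @Sendable (Input) async -> Output
) async -> [Output] {
    precondition(width > 0, "width must be positive")
    return await withTaskGroup(of: (Int, Output).self) { group in
        var results = [Output?](repeating: nil, count: items.count)
        var pending = items.enumerated().makeIterator()

        // Prime the group with up to `width` child tasks.
        for _ in 0..<width {
            guard let (index, item) = pending.next() else { break }
            group.addTask { (index, await transform(item)) }
        }

        // Each time a child finishes, record its result and start the next one.
        while let (index, output) = await group.next() {
            results[index] = output
            if let (nextIndex, nextItem) = pending.next() {
                group.addTask { (nextIndex, await transform(nextItem)) }
            }
        }
        return results.compactMap { $0 }
    }
}
```

This only limits how many tasks exist at once. If `transform` blocks on synchronous I/O, it still ties up a cooperative thread, so the earlier caveats apply.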

Question:

In that implementation, there is a nested func submitNext(). Could that be pulled out into some sort of Actor that vended "requests"?

I'm envisioning an implementation where a "service" is an Actor that manages a queue of pending requests. A second class, like a "ServiceRunner" could just be sitting in a loop similar to the one in the Gist above, waiting for the next request to be vended....