How to currently offload stateless blocking work? Hidden executor API?

Hi,
I'm pretty experienced with Kotlin coroutines, and one thing I'm missing from Swift concurrency is guidance on how to treat scheduling/dispatching.

I think I've watched all the WWDC sessions, and as far as I'm aware there is no such exposed API. The reasoning is that one should not care, which doesn't sit well with me, to be honest: that should be true for consumers of an API, but in my opinion it is not true for its providers.

Currently, if I'm, say, a library developer who exposes some asynchronous API with a completion handler (based internally on GCD), I'd wrap it with withCheckedContinuation to provide an async API.

  1. However, it still runs on GCD thread pools. If Swift concurrency is to replace GCD, then one needs some sort of API to actually run my library's work on. Is that the hidden Executor API, to be revealed in the future?

  2. How is one currently supposed to offload stateless blocking work? The WWDC sessions lead me to believe I should use non-main actors, but in my head actors are about serializing access to state, not necessarily about dispatching. The thing I want to offload could be a pure stateless function, say some big graph path finding, so actors don't seem right semantically.

Is it just supposed to be this, until the Executor API comes?

private func wrapBlockingWork() async throws {
    return try await withCheckedThrowingContinuation { continuation in
        DispatchQueue.global().async {
            someHeavyStatelessBlockingFunctionTaking10Seconds()
            continuation.resume(returning: ())
        }
    }
}

Is it just supposed to be this … ?

In a world without custom executors, it does make sense, IMO, to model complex concurrency problems using existing primitives, like Dispatch, and then bridge the results over to Swift concurrency using continuations.
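As a minimal sketch of that bridging pattern (the names here are hypothetical: loadValue(completion:) stands in for any pre-async, completion-handler API running on its own serial Dispatch queue):

```swift
import Dispatch

// Hypothetical pre-async API: does its work on a private serial queue
// and reports back via a completion handler.
func loadValue(completion: @escaping (Result<Int, Error>) -> Void) {
    let queue = DispatchQueue(label: "com.example.loader") // serial
    queue.async {
        completion(.success(42))
    }
}

// Swift concurrency wrapper: bridge the callback with a continuation.
func loadValue() async throws -> Int {
    try await withCheckedThrowingContinuation { continuation in
        loadValue { result in
            // Resume exactly once, forwarding success or failure.
            continuation.resume(with: result)
        }
    }
}
```

The key point is that the concurrency work still happens in Dispatch; the continuation only hands the result back to the Swift concurrency world.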

Having said that, this is a concern:

DispatchQueue.global().async {
    someHeavyStatelessBlockingFunctionTaking10Seconds()
    …
}

Using a Dispatch global concurrent queue like this is not wise. I’m hoping that was just for the benefit of a simple example, and in your real code you do better, but I see this a lot in the wild so I want to call it out.

A Dispatch global concurrent queue can [1] overcommit — that is, start more threads than there are CPU cores — which means that it may tolerate this sort of thing better than the Swift concurrency cooperative thread pool does. However, this is far from best practice. Specifically, it’s a common cause of thread explosion, which is about as much fun as it sounds. Our general advice is that you avoid concurrent queues in almost all circumstances. For more background on this, see:

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

[1] Whether it will overcommit is a more complex question.


Doesn't concurrent queue then contract back if unused?

Tbh, capping it at the core count is smart, but with DispatchQueue, from my limited knowledge, you can either get serial, which is not amazing either, or unbounded concurrent, which as you said can overcommit.

An API to limit concurrency here would be most welcome, and people would stop using unbounded concurrent queues.

FTM, there's an API for that, albeit a bit higher level than DispatchQueue: OperationQueue.maxConcurrentOperationCount.

I don’t know whether your blocking call contains code you don’t have access to or that is difficult to change, but ideally it should be async and non-blocking. For example, a (probably naive) way to asynchronously get bytes from standard input is to poll stdin (a synchronous, non-blocking operation) and then do Task.sleep if no input is available. If you do this in a loop, you get an async, non-blocking API for what would otherwise be a synchronous operation.
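A sketch of that polling loop, with the synchronous, non-blocking read abstracted as a closure (readAvailable is a hypothetical stand-in for something like a non-blocking read from stdin that returns nil when no input is available yet):

```swift
import Foundation

// Poll a synchronous, non-blocking source until it yields data,
// suspending the task (not a thread) between attempts.
func nextInput(readAvailable: () -> Data?) async throws -> Data {
    while true {
        if let data = readAvailable() {
            return data
        }
        // Suspend for ~10 ms, returning the thread to the cooperative pool.
        try await Task.sleep(nanoseconds: 10_000_000)
    }
}
```

Because the loop only ever suspends rather than blocks, it plays nicely with the cooperative thread pool.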

That's OperationQueue, though. How would I set it on a DispatchQueue? let queue = DispatchQueue(label: "io", attributes: .concurrent)

This raises a question: why do you need DispatchQueue specifically?

For example in the following class foo and bar are pretty much equivalent:

class C {
	let dispatchQueue: DispatchQueue
	let operationQueue: OperationQueue

	init() {
		dispatchQueue = DispatchQueue(label: "my serial queue")
		operationQueue = OperationQueue()
		operationQueue.maxConcurrentOperationCount = 1
	}

	func foo(callback: @escaping () -> Void) {
		dispatchQueue.async(execute: callback)
	}

	func bar(callback: @escaping () -> Void) {
		operationQueue.addOperation(callback)
	}
}

so, going back to your original example:

    DispatchQueue.global().async {
        someHeavyStatelessBlockingFunctionTaking10Seconds()
        continuation.resume(returning: ())
    }

I don't see a problem with changing it to an operation queue:

// do this upfront once:
let operationQueue = OperationQueue()
operationQueue.maxConcurrentOperationCount = 10 // or however many you want
...
        operationQueue.addOperation {
            someHeavyStatelessBlockingFunctionTaking10Seconds()
            continuation.resume(returning: ())
        }

Doesn't concurrent queue then contract back if unused?

You mean the Dispatch thread pool that underlies it? I’m not actually sure if that contracts, but it doesn’t really matter because eventually the threads return to the workloop, whereupon they lose the ‘heavy’ parts of their context.

However, this only helps if you’re not killed by the initial thread explosion. At some point Dispatch will refuse to start new threads — refuse to overcommit more than it has already overcommitted — and that can result in some nasty failures.

An API to limit concurrency here would be most welcome, and people would
stop using unbounded concurrent queues.

Agreed. As tera noted, you can use OperationQueue for this but I understand the desire to have a solution in the Dispatch space.

Likewise for Swift concurrency.

Presumably custom executors will hit Swift Evolution at some point, and I encourage you to wade into that discussion.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

Using a Dispatch global concurrent queue like this is not wise.

@eskimo to clarify are you talking about when it’s used within a continuation? Or always?

If it is a bunch of synchronous functions, then put them inside an actor; that way it will serialise the execution and it won't be on the main thread. If it involves asynchronous functions, await on them.

Refer: multithreading - Make tasks in Swift concurrency run serially - Stack Overflow

@somu That's abuse of actors. Actors are about serializing access to some shared resource.

to clarify are you talking about when it’s used within a continuation?
Or always?

Both (-:

In that quote I was referring to your specific usage. My concern wasn’t about the context but rather the ‘weight’ of someHeavyStatelessBlockingFunctionTaking10Seconds(). That’s completely inappropriate for a Dispatch concurrent queue.

Stepping back, Dispatch concurrent queues are almost always the wrong answer. In some cases they’re expedient and relatively safe. For example, if you have a very lightweight closure that needs to run somewhere and you don’t care where, it’s generally OK to use a concurrent queue. In most other cases, however, using them is like planting a landmine: Sooner or later someone is going to get hurt.

If you want to understand Dispatch, I strongly recommend that you watch the WWDC videos I referenced above. And if you’re interested in the Swift concurrency side of this, add WWDC 2021 Session 10254 Swift concurrency: Behind the scenes to your playlist. All three are presented by members of the Dispatch team, offering advice based on hard-won experience.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple


Fantastic response. Thank you for your knowledge as always @eskimo.

Hi @eskimo,
I have two questions about this:

  1. If we should not use Dispatch concurrent queues here, what is the alternative solution?
  2. You said that we should avoid concurrent queues in almost all circumstances. But if we do that, we can't utilize all the CPU cores, right?

Dispatch serial queues and DispatchQueue.concurrentPerform can both use all cores (if you have more than one serial queue, anyway), and don’t have many of the downsides of concurrent queues.

What David_Smith said but also…

There’s no one simple answer to “what is the alternative solution?”. It very much depends on what you’re doing with concurrency. The number one thing to investigate is whether your work is CPU bound, I/O bound, or a mix of both. For CPU bound work, DispatchQueue.concurrentPerform(iterations:execute:) is your best option. For I/O bound work, a serial queue, or a small set of serial queues, is typically fine, because you’re only using the CPU to orchestrate your I/O. Where things get complex is when your work is a mix of CPU and I/O work.
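For the CPU-bound case, a minimal concurrentPerform sketch (squaring numbers here is just a stand-in for real per-iteration work):

```swift
import Dispatch

// Fan CPU-bound work out across all available cores.
// Each iteration writes only to its own slot of the buffer,
// so no synchronization is needed between iterations.
let iterations = 1000
var results = [Int](repeating: 0, count: iterations)
results.withUnsafeMutableBufferPointer { buffer in
    DispatchQueue.concurrentPerform(iterations: iterations) { i in
        buffer[i] = i * i
    }
}
let total = results.reduce(0, +)
```

concurrentPerform blocks the calling thread until all iterations finish, and it sizes its parallelism to the machine, which is exactly the "use all the cores without overcommitting" behavior being asked about.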

ps I ended up taking one of my earlier posts from this thread and turning it into Avoid Dispatch Global Concurrent Queues.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

Speaking of which, does the following make sense to you?

concurrentPerform

"target queue"?!? A call named "concurrentPerform" can execute things serially in some cases?!?

If you target a concurrent queue at a serial queue, yes.

The documentation is stale; the Swift version does not take a queue parameter.

The setTarget(queue:) instance method is irrelevant, as concurrentPerform is a class method.

I don't see the Obj-C version of that call... looks like the doc was bogus from the beginning.