Hi,
I'm pretty experienced with Kotlin coroutines, and one thing I'm missing from Swift concurrency is guidance on how to treat scheduling/dispatching.
I think I've watched all the WWDC sessions and, as far as I'm aware, there is no such exposed API. The reasoning is that one should not care, which doesn't sit well with me, to be honest: that should be true for consumers of an API, but in my opinion it is not true for the providers of the API.
Currently, if I'm, say, a library developer who exposes some asynchronous API with a completion handler (based internally on GCD), to provide an async API I'd wrap it with withCheckedContinuation.
However, the work still runs on GCD thread pools. If Swift concurrency is to replace GCD, then one needs some sort of API to actually run my library's work on. Is that the hidden Executor API, to be revealed in the future?
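For concreteness, the wrapping I mean looks something like this (fetchValue and its return value are made-up placeholders, not a real API):

```swift
import Dispatch

// Hypothetical legacy API: completion-handler based, backed by GCD.
func fetchValue(completion: @escaping (Int) -> Void) {
    DispatchQueue.global().async {
        completion(42)
    }
}

// Async wrapper: bridges the callback into Swift concurrency.
// Note that the work itself still runs on the GCD thread pool.
func fetchValue() async -> Int {
    await withCheckedContinuation { continuation in
        fetchValue { value in
            continuation.resume(returning: value)
        }
    }
}
```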
How is one currently supposed to offload stateless blocking work? The WWDC sessions lead me to believe I should use non-main actors, but in my head that is about serialising access to state, not necessarily about dispatching. The thing I want to offload could be a pure stateless function, say some big graph path finding, so actors don't seem right semantically.
Is it just supposed to be this, until the Executor API comes?
In a world without custom executors, it does make sense, IMO, to model complex concurrency problems using existing primitives, like Dispatch, and then bridge the results over to Swift concurrency using continuations.
Using a Dispatch global concurrent queue like this is not wise. I’m hoping that was just for the benefit of a simple example, and that in your real code you do better, but I see this a lot in the wild, so I want to call it out.
A Dispatch global concurrent queue can [1] overcommit — that is, start more threads than there are CPU cores — which means that it may tolerate this sort of thing better than the Swift concurrency cooperative thread pool. However, this is far from best practice. Specifically, it’s a common cause of thread explosion, which is about as much fun as it sounds. Our general advice is that you avoid concurrent queues in almost all circumstances. For more background on this, see:
Doesn't a concurrent queue then contract back when unused?
To be honest, capping it at the core count is smart, but with DispatchQueue, from my limited knowledge, you can either get a serial queue, which is not amazing either, or an unbounded concurrent queue, which, as you said, can overcommit.
An API to limit concurrency here would be most welcome, and people would stop using unbounded concurrent queues.
I don’t know whether your blocking call contains code you don’t have access to or is difficult to change, but ideally it should be async and non-blocking. For example, a (probably naive) way of asynchronously getting bytes from standard input is polling stdin (a synchronous, non-blocking operation) and then doing Task.sleep if no input is available. If you do this in a loop, you get an async, non-blocking API for what would otherwise be a synchronous operation.
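A sketch of that polling idea, assuming POSIX poll(2) is available (waitForReadable is a name I made up for illustration): polling with a zero timeout never blocks, and Task.sleep suspends the task rather than blocking a thread between checks.

```swift
import Foundation

// Suspends until the file descriptor has data to read, without ever
// blocking a thread: poll(2) with a zero timeout returns immediately,
// and Task.sleep suspends (not blocks) between checks.
func waitForReadable(_ fd: Int32) async throws {
    var fds = [pollfd(fd: fd, events: Int16(POLLIN), revents: 0)]
    while poll(&fds, 1, 0) <= 0 {
        try await Task.sleep(nanoseconds: 100_000_000) // back off 100 ms
    }
}
```

For stdin specifically you'd pass fd 0 and then call readLine() once data is available; note that readLine can still block until a full line arrives, which is part of why this is "probably naive".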
I don't see a problem with changing it to an operation queue:
// do this up front, once:
let operationQueue = OperationQueue()
operationQueue.maxConcurrentOperationCount = 10 // or however many
...
// later, inside withCheckedContinuation:
operationQueue.addOperation {
    someHeavyStatelessBlockingFunctionTaking10Seconds()
    continuation.resume()
}
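Filled out into a complete, runnable shape, that sketch might look like this (the function name is the thread's placeholder; runHeavyWork is mine):

```swift
import Foundation

// Stand-in for the heavy blocking function from the thread.
func someHeavyStatelessBlockingFunctionTaking10Seconds() {
    // ...heavy synchronous work...
}

// Shared queue, created once; caps concurrency at 10 operations.
let operationQueue: OperationQueue = {
    let queue = OperationQueue()
    queue.maxConcurrentOperationCount = 10
    return queue
}()

// Async facade: the blocking work runs on the operation queue's
// threads, not on the cooperative thread pool.
func runHeavyWork() async {
    await withCheckedContinuation { (continuation: CheckedContinuation<Void, Never>) in
        operationQueue.addOperation {
            someHeavyStatelessBlockingFunctionTaking10Seconds()
            continuation.resume()
        }
    }
}
```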
Doesn't a concurrent queue then contract back when unused?
You mean the Dispatch thread pool that underlies it? I’m not actually sure whether that contracts, but it doesn’t really matter, because eventually the threads return to the workloop, whereupon they lose the ‘heavy’ parts of their context.
However, this only helps if you’re not killed by the initial thread explosion. At some point Dispatch will refuse to start new threads — refuse to overcommit more than it has already overcommitted — and that can result in some nasty failures.
An API to limit concurrency here would be most welcome, and people would stop using unbounded concurrent queues.
Agreed. As tera noted, you can use OperationQueue for this but I understand the desire to have a solution in the Dispatch space.
Likewise for Swift concurrency.
Presumably custom executors will hit Swift Evolution at some point, and I encourage you to wade into that discussion.
If it is a bunch of synchronous functions, then put them inside an actor; that way it will serialise the execution, and it won't be on the main thread. If it involves asynchronous functions, await on them.
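A minimal sketch of that suggestion (PathFinder and its method are made-up names): an actor's methods run off the main thread, and calls to the same actor instance are serialised.

```swift
// Made-up example: heavy synchronous work wrapped in an actor.
// Calls are serialised per instance and run off the main thread.
actor PathFinder {
    func shortestPath(from start: Int, to end: Int) -> [Int] {
        // ...stand-in for heavy synchronous path finding...
        return [start, end]
    }
}
```

Usage is simply `let path = await PathFinder().shortestPath(from: 0, to: 1)`. One caveat, matching the concern above: because calls serialise, two long calls to the same instance run back to back rather than in parallel.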
To clarify, are you talking about when it’s used within a continuation?
Or always?
Both (-:
In that quote I was referring to your specific usage. My concern wasn’t about the context but rather the ‘weight’ of someHeavyStatelessBlockingFunctionTaking10Seconds(). That’s completely inappropriate for a Dispatch concurrent queue.
Stepping back, Dispatch concurrent queues are almost always the wrong answer. In some cases they’re expedient and relatively safe. For example, if you have a very lightweight closure that needs to run somewhere and you don’t care where, it’s generally OK to use a concurrent queue. In most other cases, however, using them is like planting a landmine: Sooner or later someone is going to get hurt.
If you want to understand Dispatch, I strongly recommend that you watch the WWDC videos I referenced above. And if you’re interested in the Swift concurrency side of this, add WWDC 2021 Session 10254 Swift concurrency: Behind the scenes to your playlist. All three are presented by members of the Dispatch team, offering advice based on hard-won experience.
Dispatch serial queues and DispatchQueue.concurrentPerform can both use all cores (if you have more than one serial queue, anyway), and don’t have many of the downsides of concurrent queues.
There’s no one simple answer to “what is the alternative solution?”. It very much depends on what you’re doing with concurrency. The number one thing to investigate is whether your work is CPU-bound, I/O-bound, or a mix of both. For CPU-bound work, DispatchQueue.concurrentPerform(iterations:execute:) is your best option. For I/O-bound work, a serial queue, or a small set of serial queues, is typically fine, because you’re only using the CPU to orchestrate your I/O. Where things get complex is when your work is a mix of CPU and I/O work.
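A sketch of the CPU-bound case: concurrentPerform parallelises a fixed number of iterations across the available cores without overcommitting (the squaring here is a stand-in for real work).

```swift
import Dispatch

// Parallelise CPU-bound work across cores; each iteration writes to
// its own slot, so no extra synchronisation is needed.
let iterations = 8
var results = [Int](repeating: 0, count: iterations)
results.withUnsafeMutableBufferPointer { buffer in
    DispatchQueue.concurrentPerform(iterations: iterations) { i in
        buffer[i] = i * i // stand-in for real CPU-bound work
    }
}
print(results) // [0, 1, 4, 9, 16, 25, 36, 49]
```

Note that concurrentPerform blocks the calling thread until all iterations finish, so from Swift concurrency you'd typically bridge it with a continuation as discussed above.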