Sorry, this is a long question and I am not sure whether this is the right place to ask it. Happy to move it if it is not.
My day job is C++; specifically, I work on FoundationDB. I am interested in learning how the performance characteristics of Swift coroutines and their primitives compare to C++ coroutines (FoundationDB implements its own language extension for coroutines, but they can also be reimplemented with C++20 coroutines).
Small tangent: as of now I believe Swift async functions are strictly better than C++ coroutines. C++ coroutines are a pain whenever any kind of visibility into them is needed, they do more allocations than Swift async functions (so I would expect most use cases to be slightly slower -- in FoundationDB we work around this by providing our own allocators that effectively leak memory), and they are almost impossible to debug.
However: C++ coroutines give much more control to the user than Swift async functions. This means that it's a bit easier to understand their performance characteristics and, worst case, invest a ton of effort optimizing hot paths.
One such example is the common problem (at least when working on a distributed database) of quorum waits (or waitForAll, which is just a special case of a quorum). The trivial implementation quickly became a huge bottleneck, so we optimized it to make it fast (the implementation can be found here for anyone interested).
Now if I understand this correctly, something like a quorum would be implemented in Swift like this:
enum QuorumError: Error {
    case majorityFailed
}

protocol MyServer {
    func write(data: [UInt8]) async throws
}

protocol MyService {
    associatedtype ServerImpl: MyServer
    var servers: [ServerImpl] { get }
}

extension MyService {
    func quorumWrite(data: [UInt8]) async throws {
        try await withThrowingTaskGroup(of: Void.self) { group in
            for s in servers {
                group.addTask {
                    try await s.write(data: data)
                }
            }
            // wait for a majority to finish
            let majority = servers.count / 2 + 1
            var success = 0
            var errors = 0
            while success < majority {
                do {
                    _ = try await group.next()
                    success += 1
                } catch {
                    errors += 1
                    // too many failures: a majority can no longer succeed
                    if errors > servers.count - majority {
                        throw QuorumError.majorityFailed
                    }
                }
            }
            // success -- cancel the remaining writes
            group.cancelAll()
        }
    }
}
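Just to make the example concrete, here is how I would exercise it with a toy conformance (InMemoryServer and InMemoryService are names I made up purely for illustration):

struct InMemoryServer: MyServer {
    func write(data: [UInt8]) async throws {
        // stand-in for real I/O
        try await Task.sleep(for: .milliseconds(1))
    }
}

struct InMemoryService: MyService {
    var servers: [InMemoryServer]
}

// from an async context:
// let service = InMemoryService(servers: Array(repeating: InMemoryServer(), count: 5))
// try await service.quorumWrite(data: [1, 2, 3])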
(Obviously it might be a bit more complicated, since in real life you might want to do something with the errors even if a majority succeeds, you might want to keep the remaining writes running after a majority has succeeded, etc. -- but generally I think this is what I would write today.)
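One of those variants might look something like the following sketch (quorumWriteCollectingErrors is just a name I made up): wait for all writes to finish instead of cancelling the stragglers, and hand the collected errors back to the caller.

extension MyService {
    // Sketch of one variant: drain the whole group rather than cancelling
    // once a quorum is reached, and return whatever errors we saw.
    func quorumWriteCollectingErrors(data: [UInt8]) async throws -> [Error] {
        try await withThrowingTaskGroup(of: Void.self) { group in
            for s in servers {
                group.addTask { try await s.write(data: data) }
            }
            var success = 0
            var collected: [Error] = []
            for _ in 0..<servers.count {
                do {
                    _ = try await group.next()
                    success += 1
                } catch {
                    collected.append(error)
                }
            }
            guard success >= servers.count / 2 + 1 else {
                throw QuorumError.majorityFailed
            }
            return collected
        }
    }
}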
But what exactly happens behind the scenes here? Since the number of tasks is unknown statically, I assume this needs to allocate a new stack for each task? Does each task call into some callback when it is done? Or, more specifically: will group.next() run in O(1)?
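For what it's worth, this is roughly how I would try to measure that myself -- just a throwaway sketch (timeDraining is a name I made up): spawn n trivial children and check whether draining them through next() scales linearly with n.

// Throwaway micro-benchmark sketch: spawn n trivial children and drain them
// through next(), to see whether the per-child cost stays roughly constant.
func timeDraining(_ n: Int) async -> Duration {
    let clock = ContinuousClock()
    let start = clock.now
    await withTaskGroup(of: Int.self) { group in
        for i in 0..<n {
            group.addTask { i }
        }
        var sum = 0
        while let v = await group.next() {
            sum += v
        }
        precondition(sum == n * (n - 1) / 2) // keep the work observable
    }
    return start.duration(to: clock.now)
}

// from an async context:
// for n in [1_000, 10_000, 100_000] {
//     print("n = \(n):", await timeDraining(n))
// }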
Also: would the compiler do some special optimizations if it knew the size of the group? In FoundationDB code we often do things like wait(timeoutError(someFuture, someTime, someError)). In Swift I would implement timeoutError like this:
func timeoutError<T: Sendable>(
    duration: Duration,
    error: Error,
    task: @escaping @Sendable () async -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        group.addTask {
            await task()
        }
        group.addTask {
            // if the sleep wins the race, next() below rethrows `error`
            try await Task.sleep(for: duration)
            throw error
        }
        let res = try await group.next()
        group.cancelAll()
        return res!
    }
}
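And the call site I have in mind would be roughly this (TimedOut and someSlowComputation are just stand-ins):

struct TimedOut: Error {}

// from an async context:
// let value = try await timeoutError(duration: .seconds(1), error: TimedOut()) {
//     await someSlowComputation()
// }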
Wouldn't the compiler be able to run this without doing any allocations (assuming it knows about task groups)? Or would it end up doing the same thing as the quorum example above?