I've filed a JIRA with a minimal reproducer on your behalf, but feel free to add context or motivation about how it affects your particular situation.
I also have a larger teaching example, with a lot of comments, to help illuminate what's going on so that you can work around this in your program:
actor A {
    func f(_ i: Int) async {
        print("task \(i) called A.f()")
    }
}

@main
struct Main {
    static func main() async {
        let a = A()
        await withTaskGroup(of: Void.self) { group in
            for i in 0..<3 {
                group.addTask {
                    await caller(a, i)
                }
            }
        }
    }
}

func caller(_ a: A, _ task: Int) async {
    print("task \(task) starting")

    // Because this caller function is not isolated to any actor, after completing
    // this call to an async actor function, we remain on a's executor, which
    // can prevent other tasks from using the same actor.
    await a.f(task)

    /////
    // Now, here are some one-liner tricks to play with. Try commenting,
    // uncommenting, or even reordering:

    // Temporarily gives up a's executor, but I believe it will try to
    // resume on the same executor upon returning? I'm not sure.
    // await Task.yield()

    // This gives up a's executor and switches to the main actor during the call.
    // Similar to a.f(), since we're calling an async function, we won't give
    // up the main actor after returning.
    // await asyncMainActorFunc(task)

    // This one would also give up a's executor during the call, but upon
    // returning it will try to switch back to whichever executor it was on prior
    // to the call. So, this can still prevent forward progress if it appears after
    // a call to an async actor-isolated function.
    // await ordinaryMainActorFunc(task)

    // This terrible hack should get us off of whichever executor we're on now
    // and onto one that is unique, so every task can make progress in this func.
    // await DropExecutor().doIt()

    ///// end of one-liners

    // The goal is to have every task make it to `doLongRunningWork`.
    doLongRunningWork(task)
}

actor DropExecutor {
    var state: Int = 0
    func doIt() async {
        state = 0 // needed to prevent optimization
    }
}

func doLongRunningWork(_ i: Int) {
    print("task \(i) starting long-running work")
    while true {}
}

@MainActor
func asyncMainActorFunc(_ i: Int) async {
    print("task \(i) called asyncMainActorFunc()")
}

@MainActor
func ordinaryMainActorFunc(_ i: Int) {
    print("task \(i) called ordinaryMainActorFunc()")
}
To play with the example above, you can compile with:
xcrun swiftc -parse-as-library hang.swift
(Just drop the xcrun if you're on Linux.) I particularly recommend starting off by commenting out all four "tricks". You should see something like this:
task 0 starting
task 1 starting
task 2 starting
task 0 called A.f()
task 0 starting long-running work
which shows that the other two tasks are stuck trying to call a.f(), because the one task is still holding a's executor while doing its long-running work. Next, if you uncomment the line that calls asyncMainActorFunc, you should see something like this:
task 2 starting
task 0 starting
task 1 starting
task 2 called A.f()
task 0 called A.f()
task 1 called A.f()
task 2 called asyncMainActorFunc()
task 2 starting long-running work
Notice that all three tasks now made it to a.f() but not any further, because task 2 is now holding the main actor while doing its long-running work. Whenever you uncomment the DropExecutor hack, you'll see that all three tasks make it to their long-running work:
task 1 starting
task 0 starting
task 1 called A.f()
task 2 starting
task 0 called A.f()
task 2 called A.f()
task 1 called asyncMainActorFunc()
task 0 called asyncMainActorFunc()
task 2 called asyncMainActorFunc()
task 1 starting long-running work
task 0 starting long-running work
task 2 starting long-running work
That DropExecutor hack creates a fresh actor instance and calls one of its async methods, which must run on the instance's executor to update its state. Since each instance has a unique executor, it doesn't matter that each task running caller continues on that executor after the call. Of course, this hack is terrible, so please watch the bug report closely for a better solution or fix.
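If you'd rather not rely on the hack, one other direction you could try is to push the blocking work off the concurrency executors entirely. What follows is only a sketch, not something from the bug report: runOffExecutor is a hypothetical helper name I made up, and it assumes Dispatch is available in your toolchain. It runs a synchronous closure on a global Dispatch queue and suspends the calling task until the closure finishes, so the task should not be occupying any actor's executor in the meantime.

```swift
import Dispatch

// Hypothetical helper (not part of the original example): runs blocking,
// synchronous work on a global Dispatch queue. The calling task suspends at
// the continuation, so it is not running on an actor's executor while the
// work is in progress.
func runOffExecutor<T>(_ work: @escaping @Sendable () -> T) async -> T {
    await withCheckedContinuation { continuation in
        DispatchQueue.global().async {
            // Resume the suspended task exactly once, with the work's result.
            continuation.resume(returning: work())
        }
    }
}
```

In the teaching example, that would mean replacing the last line of caller with `await runOffExecutor { doLongRunningWork(task) }`. I haven't checked how this interacts with the bug itself, so treat it as something to experiment with rather than a fix.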