Right now I have a project with mixed Swift concurrency Tasks and legacy GCD threading code. The problem is that after some time, on some devices, we observe something similar to a deadlock: the code that performs the task plus the GCD code just stops executing, as if it were waiting for some other thread to finish.
I've found related thread:
Deadlock When Using DispatchQueue from Swift Task
And now I am wondering: what can I do to detect this kind of deadlock using Xcode?
To detect this problem, I might profile the app (⌘-I) in Instruments. E.g., I might use the “Time Profiler” template and then manually add the “Swift Actors” and “Swift Tasks” tools, too. Then, when you run the app, you can use the “Alive Tasks” lane in the “Swift Tasks” tool to identify situations where tasks were started but weren't allowed to finish. The “Alive Tasks” lane should always drop back down to zero when the app has reached quiescence. If it does not, you've got a deadlock somewhere.
You might also sprinkle the code with OSSignposter instrumentation (e.g., replacing print with signposter.emitEvent; and for those tasks that take a little time, add a signposter.beginInterval at the start and a signposter.endInterval at the end):
import os.log

let poi = OSSignposter(subsystem: "Subsystem", category: .pointsOfInterest)

final class BarrierTests {
    func test() async {
        let subsystem = Subsystem()
        await withTaskGroup(of: Void.self) { group in
            for index in 0 ..< 1000 {
                // By the way, to fix the problem in this example, uncomment the
                // following line; this will constrain the concurrency and solve
                // the problem here:
                //
                // if index > 4 { await group.next() }

                poi.emitEvent(#function, "adding task \(index)")
                group.addTask { subsystem.performWork(id: index) }
            }
            await group.waitForAll()
        }
    }
}

final class Subsystem {
    let queue = DispatchQueue(label: "my concurrent queue", attributes: .concurrent)

    func performWork(id: Int) {
        if id == 0 { write(id: id) }
        else { read(id: id) }
    }

    func write(id: Int) {
        let status = poi.beginInterval(#function, id: poi.makeSignpostID(), "\(id)")
        poi.emitEvent(#function, "schedule exclusive \(id)")
        queue.async(flags: .barrier) {
            poi.emitEvent(#function, "execute exclusive \(id)")
            poi.endInterval(#function, status)
        }
        poi.emitEvent(#function, "schedule exclusive \(id) done")
    }

    func read(id: Int) {
        let status = poi.beginInterval(#function, id: poi.makeSignpostID(), "\(id)")
        poi.emitEvent(#function, "schedule \(id)")
        queue.sync {
            poi.emitEvent(#function, "execute \(id)")
            poi.endInterval(#function, status)
        }
        poi.emitEvent(#function, "schedule \(id) done")
    }
}
I would also advise using the “Hangs” tool (which is part of the “Time Profiler” Instruments template). Personally, I always reduce the “Reporting Threshold” down to “Include All Potential Interaction Delays (>33 ms)”.
Regarding that other question: the problem there is the combination of thread explosion (more than 64 GCD worker threads), the limited number of threads in the cooperative thread pool, and the fact that TaskGroup does not guarantee FIFO behavior. The most compelling solution is to avoid unbridled parallel execution in the first place. (See the commented-out line inside the loop in my example above, which constrains the concurrency to no more than four tasks at a time.)
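As a standalone illustration of that technique, here is a minimal sketch of a task group whose width is capped so that only a handful of tasks run at once, keeping the cooperative thread pool from being flooded (the processItem work closure is a hypothetical placeholder, not from the question):

```swift
import Foundation

/// A minimal sketch: process `count` items, but never run more than
/// four tasks concurrently. Once four tasks are in flight, we `await
/// group.next()` before adding another, so one must finish first.
func processAll(count: Int) async {
    await withTaskGroup(of: Void.self) { group in
        for index in 0 ..< count {
            if index >= 4 {
                await group.next()  // wait for one in-flight task to finish
            }
            group.addTask {
                await processItem(index)
            }
        }
        await group.waitForAll()
    }
}

/// Hypothetical placeholder for the real per-item work.
func processItem(_ index: Int) async {
    try? await Task.sleep(nanoseconds: 10_000_000)
}
```

The key point is that the group never holds more than a fixed number of pending tasks, so no matter how large `count` is, the cooperative pool is never saturated.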
And if you're using a reader-writer pattern, like that other question, consider retiring it. It's one of those patterns that feels like it should enjoy great benefits, but it (a) is almost always slower than a simple lock; and (b) frequently introduces problems in thread-explosion scenarios. I know it looks like the concurrent reads should offer great benefits, but in practice they often do not.
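If you are already adopting Swift concurrency, an actor is the natural replacement: it serializes access without blocking any threads, so it cannot participate in the thread-pool exhaustion above. A minimal sketch (the name SafeSubsystem and its dictionary storage are hypothetical, standing in for whatever state the concurrent queue was protecting):

```swift
import Foundation

/// A sketch of replacing the GCD reader-writer (concurrent queue +
/// barrier) pattern with an actor. Reads and writes are serialized by
/// the actor; callers suspend rather than block a thread.
actor SafeSubsystem {
    private var storage: [Int: String] = [:]

    func write(id: Int, value: String) {
        storage[id] = value
    }

    func read(id: Int) -> String? {
        storage[id]
    }
}

// Usage: no queue.sync, no barriers; callers simply await.
func demo() async {
    let subsystem = SafeSubsystem()
    await withTaskGroup(of: Void.self) { group in
        for index in 0 ..< 1000 {
            group.addTask {
                if index == 0 {
                    await subsystem.write(id: index, value: "value \(index)")
                } else {
                    _ = await subsystem.read(id: index)
                }
            }
        }
    }
}
```

Because nothing in this version blocks, there is no `queue.sync` for a flood of tasks to pile up behind, and the deadlock scenario from the linked question cannot arise.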