Deadlock When Using DispatchQueue from Swift Task

I have a subsystem that uses a reader-writer scheme based on dispatch barriers to manage access to a shared resource. When I call this subsystem from a TaskGroup, it often deadlocks.
The code basically boils down to the following (implemented here as an XCTest):

import XCTest

final class BarrierTests: XCTestCase {
    func test() async {
        let subsystem = Subsystem()
        await withTaskGroup(of: Void.self) { group in
            for index in 0 ..< 100 {
                group.addTask { subsystem.performWork(id: index) }
            }

            await group.waitForAll()
        }
    }
}

final class Subsystem {
    let queue = DispatchQueue(label: "my concurrent queue", attributes: .concurrent)

    func performWork(id: Int) {
        if id == 0 { write(id: id) }
        else { read(id: id) }
    }

    func write(id: Int) {
        print("schedule exclusive write \(id)")
        queue.async(flags: .barrier) { print(" execute exclusive write \(id)") }
        print("schedule exclusive write \(id) done")
    }

    func read(id: Int) {
        print("schedule read \(id)")
        queue.sync { print(" execute read \(id)") }
        print("schedule read \(id) done")
    }
}

The TaskGroup schedules one task that dispatches an async barrier block to a concurrent queue; all other tasks call DispatchQueue.sync on that same queue. What I see very often is that the cooperative queue executing the tasks fills up almost immediately, and the async barrier block is never executed even though it was dispatched to a different queue.

I understand that there are other ways to do this, such as using a serial queue instead of a concurrent queue, and that does work.
But this subsystem has worked for years and, for the sake of argument, may even be one whose sources I don't have and therefore cannot change.

Is this expected behaviour? Why is the thread dispatching the async barrier never created/scheduled? Do cooperative threads share a common thread pool with non-cooperative threads?

I didn't look past queue.sync – that's a recipe for deadlocks unless you are inhumanly careful. I'd switch it to async + callback, which in turn could be switched to async/await.

fyi, same question was asked there: Deadlock Issue When Accessing Thread-Safe Objects within Task+withTaskGroup.
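
To make the "async + callback, then async/await" suggestion concrete, here is a minimal sketch. It assumes the subsystem's sources can be changed, which the original poster says may not be the case. The class name `CallbackSubsystem`, the stored `value`, and the method names are made up for illustration and are not part of the original code.

```swift
import Dispatch

// Illustrative stand-in for the subsystem; the stored `value` and the
// method names are assumptions made for this sketch.
final class CallbackSubsystem: @unchecked Sendable {
    private let queue = DispatchQueue(label: "my concurrent queue", attributes: .concurrent)
    private var value = 0   // only accessed from blocks running on `queue`

    // Writer: exclusive access through an async barrier, as in the original.
    func write(_ newValue: Int) {
        queue.async(flags: .barrier) { self.value = newValue }
    }

    // Reader, step 1: async + callback instead of queue.sync.
    func read(completion: @escaping @Sendable (Int) -> Void) {
        queue.async { completion(self.value) }
    }

    // Reader, step 2: the callback wrapped for async/await callers.
    // The calling task suspends here instead of blocking its thread.
    func readValue() async -> Int {
        await withCheckedContinuation { continuation in
            read { value in continuation.resume(returning: value) }
        }
    }
}
```

With this shape a task that wants to read gives up its thread while it waits, so the threads running the tasks are never parked inside queue.sync.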

Please do look past queue.sync :-)
I am trying to understand what the problem is, so I can be careful.

From my understanding there is no reason for the async barrier block not to run and therefore no reason for the sync blocks to block.

That would be correct if TaskGroup ran all tasks in order from 0 to 99. However, execution order is not determined, and it could easily call read with 99 first.

The subsystem here could become an actor and guarantee you safe access, eliminating the need for the explicit queue and the deadlocks.

Add a wrapper around it:

actor SubsystemWrapper {

    private let subsystem = Subsystem()

    func performWork(id: Int) {
        subsystem.performWork(id: id)
    }

    func write(id: Int) {
        subsystem.write(id: id)
    }

    func read(id: Int) {
        subsystem.read(id: id)
    }

}

(If you run your tests through the wrapper, they pass 100% of the time.)

This still doesn't explain the deadlock, which is what @nsc is asking about, and nobody seems to have provided any explanation so far. Although the code is using a known anti-pattern (as explained here), it should be able to make forward progress and not deadlock the concurrency runtime.

The only explanation that seems to make sense is that, for some reason, this dispatch queue does not over-commit the machine (that is, it does not create more threads than CPU cores). Therefore the async barrier is never able to get the new thread that it needs to unblock all the sync calls.

This is a legacy subsystem written in Objective-C, and somewhere deep down it uses dispatch queues. I was hoping for an answer that would not require me to rewrite a large amount of legacy code.
And it seems to me that if that were the answer, it would be a problem for many developers, even within Apple itself. But maybe I am missing something.

Right, that is what it looks like. I would love to have that reason spelled out explicitly :-)

@nsc, I pasted your code into a project of mine and ran the test repeatedly (1000 times). It completed without any deadlock.

I can't figure out how your sample code could deadlock. Even if it's not recommended (as seen above), it's not supposed to deadlock. It would be interesting to see the state of all threads during your deadlock (bt all in lldb).

So maybe you did not share your exact setup. Or your sample code does deadlock in a specific but undisclosed configuration (Xcode and OS versions).

In conclusion, a complete stack trace of your deadlock and a reproducing case would greatly help people help you.

For me this deadlocks every single time so I think this is a proper reproducing case.

If I run a detached task before the test() function

...
await Task.detached { print("Hello") }.value
await test()

it seems to complete successfully every run. So the cooperative queue and the concurrent queue seem to interfere somehow.

On an M2 MacBook Air that exact code deadlocks pretty much all the time.
On an M1 Studio Ultra it might help to change the condition in performWork to id % 10 == 0 to make it more likely.
I am in both cases on macOS 13.4.1 using Xcode 14.3.1.

I include the backtrace below. It is a bit hard to see, but eight threads (the number of cores on the MacBook Air) are waiting on the cooperative queue Swift is using for its tasks.

(lldb) bt all
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00000001a820bf14 libsystem_kernel.dylib`mach_msg2_trap + 8
    frame #1: 0x00000001a821e240 libsystem_kernel.dylib`mach_msg2_internal + 80
    frame #2: 0x00000001a8214b78 libsystem_kernel.dylib`mach_msg_overwrite + 604
    frame #3: 0x00000001a820c290 libsystem_kernel.dylib`mach_msg + 24
    frame #4: 0x00000001a832a7e4 CoreFoundation`__CFRunLoopServiceMachPort + 160
    frame #5: 0x00000001a83290c4 CoreFoundation`__CFRunLoopRun + 1208
    frame #6: 0x00000001a83284b8 CoreFoundation`CFRunLoopRunSpecific + 612
    frame #7: 0x00000001007b4bdc XCTestCore`-[XCTWaiter waitForExpectations:timeout:enforceOrder:] + 704
    frame #8: 0x00000001007b6c9c XCTestCore`+[XCTWaiter waitForExpectations:timeout:enforceOrder:] + 80
    frame #9: 0x00000001007d9394 XCTestCore`__81+[XCTFailableInvocation invokeWithAsynchronousWait:lastObservedErrorIssue:block:]_block_invoke + 236
    frame #10: 0x00000001007986d0 XCTestCore`__49+[XCTSwiftErrorObservation observeErrorsInBlock:]_block_invoke + 48
    frame #11: 0x000000010261c0d8 libXCTestSwiftSupport.dylib`function signature specialization <Arg[1] = [Closure Propagated : reabstraction thunk helper from @callee_unowned @convention(block) (@unowned @callee_unowned @convention(block) () -> ()) -> () to @escaping @callee_guaranteed (@unowned @callee_guaranteed () -> ()) -> (), Argument Types : [@callee_unowned @convention(block) (@unowned @callee_unowned @convention(block) () -> ()) -> ()]> of closure #1 () -> () in static __C.XCTSwiftErrorObservation._observeErrors(in: (() -> ()) -> ()) -> () -> Swift.Optional<XCTest.XCTIssue> + 208
    frame #12: 0x000000010261c1d4 libXCTestSwiftSupport.dylib`function signature specialization <Arg[5] = [Closure Propagated : reabstraction thunk helper from @callee_unowned @convention(block) (@unowned @callee_unowned @convention(block) () -> ()) -> () to @escaping @callee_guaranteed (@unowned @callee_guaranteed () -> ()) -> (), Argument Types : [@callee_unowned @convention(block) (@unowned @callee_unowned @convention(block) () -> ()) -> ()]> of function signature specialization <Arg[2] = [Closure Propagated : closure #1 () -> () in static (extension in XCTest):__C.XCTSwiftErrorObservation.(_observeErrors in _B0397D3B80CBC8D7FB9A5B33AB2A74B8)(in: (() -> ()) -> ()) -> () -> Swift.Optional<XCTest.XCTIssue>, Argument Types : [@callee_guaranteed (@unowned @callee_guaranteed () -> ()) -> ()]> of generic specialization <Swift.Optional<XCTest.LocalErrorTracker>, ()> of Swift.TaskLocal.withValue<τ_0_0>(_: τ_0_0, operation: () throws -> τ_1_0, file: Swift.String, line: Swift.UInt) throws -> τ_1_0 + 144
    frame #13: 0x000000010261be80 libXCTestSwiftSupport.dylib`function signature specialization <Arg[0] = [Closure Propagated : reabstraction thunk helper from @callee_unowned @convention(block) (@unowned @callee_unowned @convention(block) () -> ()) -> () to @escaping @callee_guaranteed (@unowned @callee_guaranteed () -> ()) -> (), Argument Types : [@callee_unowned @convention(block) (@unowned @callee_unowned @convention(block) () -> ()) -> ()]> of static __C.XCTSwiftErrorObservation._observeErrors(in: (() -> ()) -> ()) -> () -> Swift.Optional<XCTest.XCTIssue> + 916
    frame #14: 0x000000010261c2bc libXCTestSwiftSupport.dylib`@objc static __C.XCTSwiftErrorObservation._observeErrors(in: (() -> ()) -> ()) -> () -> Swift.Optional<XCTest.XCTIssue> + 52
    frame #15: 0x00000001007985d8 XCTestCore`+[XCTSwiftErrorObservation observeErrorsInBlock:] + 204
    frame #16: 0x00000001007d91cc XCTestCore`+[XCTFailableInvocation invokeWithAsynchronousWait:lastObservedErrorIssue:block:] + 228
    frame #17: 0x00000001007d984c XCTestCore`+[XCTFailableInvocation invokeInvocation:withTestMethodConvention:lastObservedErrorIssue:] + 372
    frame #18: 0x00000001007d9bc8 XCTestCore`+[XCTFailableInvocation invokeInvocation:lastObservedErrorIssue:] + 72
    frame #19: 0x00000001007c7748 XCTestCore`__24-[XCTestCase invokeTest]_block_invoke_2 + 88
    frame #20: 0x00000001007a5924 XCTestCore`-[XCTMemoryChecker _assertInvalidObjectsDeallocatedAfterScope:] + 84
    frame #21: 0x00000001007d0984 XCTestCore`-[XCTestCase assertInvalidObjectsDeallocatedAfterScope:] + 92
    frame #22: 0x00000001007c76c8 XCTestCore`__24-[XCTestCase invokeTest]_block_invoke.98 + 172
    frame #23: 0x00000001007911e8 XCTestCore`-[XCTestCase(XCTIssueHandling) _caughtUnhandledDeveloperExceptionPermittingControlFlowInterruptions:caughtInterruptionException:whileExecutingBlock:] + 168
    frame #24: 0x00000001007c724c XCTestCore`-[XCTestCase invokeTest] + 756
    frame #25: 0x00000001007c889c XCTestCore`__26-[XCTestCase performTest:]_block_invoke.154 + 36
    frame #26: 0x00000001007911e8 XCTestCore`-[XCTestCase(XCTIssueHandling) _caughtUnhandledDeveloperExceptionPermittingControlFlowInterruptions:caughtInterruptionException:whileExecutingBlock:] + 168
    frame #27: 0x00000001007c83e8 XCTestCore`__26-[XCTestCase performTest:]_block_invoke.140 + 516
    frame #28: 0x00000001007aee8c XCTestCore`+[XCTContext _runInChildOfContext:forTestCase:markAsReportingBase:block:] + 180
    frame #29: 0x00000001007aeda0 XCTestCore`+[XCTContext runInContextForTestCase:markAsReportingBase:block:] + 144
    frame #30: 0x00000001007c8040 XCTestCore`-[XCTestCase performTest:] + 308
    frame #31: 0x000000010078002c XCTestCore`-[XCTest runTest] + 48
    frame #32: 0x00000001007b1a18 XCTestCore`-[XCTestSuite runTestBasedOnRepetitionPolicy:testRun:] + 68
    frame #33: 0x00000001007b18f8 XCTestCore`__27-[XCTestSuite performTest:]_block_invoke + 164
    frame #34: 0x00000001007b13f8 XCTestCore`__59-[XCTestSuite _performProtectedSectionForTest:testSection:]_block_invoke + 48
    frame #35: 0x00000001007aee8c XCTestCore`+[XCTContext _runInChildOfContext:forTestCase:markAsReportingBase:block:] + 180
    frame #36: 0x00000001007aeda0 XCTestCore`+[XCTContext runInContextForTestCase:markAsReportingBase:block:] + 144
    frame #37: 0x00000001007b1394 XCTestCore`-[XCTestSuite _performProtectedSectionForTest:testSection:] + 180
    frame #38: 0x00000001007b1600 XCTestCore`-[XCTestSuite performTest:] + 216
    frame #39: 0x000000010078002c XCTestCore`-[XCTest runTest] + 48
    frame #40: 0x0000000100781cf0 XCTestCore`__89-[XCTTestRunSession executeTestsWithIdentifiers:skippingTestsWithIdentifiers:completion:]_block_invoke + 520
    frame #41: 0x00000001007aee8c XCTestCore`+[XCTContext _runInChildOfContext:forTestCase:markAsReportingBase:block:] + 180
    frame #42: 0x00000001007aeda0 XCTestCore`+[XCTContext runInContextForTestCase:markAsReportingBase:block:] + 144
    frame #43: 0x0000000100781a44 XCTestCore`-[XCTTestRunSession executeTestsWithIdentifiers:skippingTestsWithIdentifiers:completion:] + 296
    frame #44: 0x00000001007e6fe0 XCTestCore`__103-[XCTExecutionWorker executeTestIdentifiers:skippingTestIdentifiers:completionHandler:completionQueue:]_block_invoke_2 + 136
    frame #45: 0x00000001007e6500 XCTestCore`-[XCTExecutionWorker runWithError:] + 108
    frame #46: 0x00000001007ac0c8 XCTestCore`__25-[XCTestDriver _runTests]_block_invoke.272 + 56
    frame #47: 0x000000010078a460 XCTestCore`-[XCTestObservationCenter _observeTestExecutionForBlock:] + 288
    frame #48: 0x00000001007abd24 XCTestCore`-[XCTestDriver _runTests] + 1092
    frame #49: 0x000000010078061c XCTestCore`_XCTestMain + 88
    frame #50: 0x00000001000057c0 xctest`main + 156
    frame #51: 0x00000001a7ef3f28 dyld`start + 2236
  thread #2, queue = 'com.apple.root.user-initiated-qos.cooperative'
    frame #0: 0x00000001a820dcd0 libsystem_kernel.dylib`__ulock_wait + 8
    frame #1: 0x00000001a809cdf0 libdispatch.dylib`_dlock_wait + 56
    frame #2: 0x00000001a809cba4 libdispatch.dylib`_dispatch_thread_event_wait_slow + 56
    frame #3: 0x00000001a80abc68 libdispatch.dylib`__DISPATCH_WAIT_FOR_QUEUE__ + 368
    frame #4: 0x00000001a80ab814 libdispatch.dylib`_dispatch_sync_f_slow + 148
    frame #5: 0x0000000100c3c394 DispatchTests`Subsystem.read(id=3, self=0x0000600000204620) at DeadlockTest.swift:32:15
    frame #6: 0x0000000100c3b074 DispatchTests`Subsystem.performWork(id=3, self=0x0000600000204620) at DeadlockTest.swift:21:16
    frame #7: 0x0000000100c3affc DispatchTests`closure #1 in closure #1 in BarrierTests.test(subsystem=0x0000600000204620, index=3) at DeadlockTest.swift:8:43
    frame #8: 0x0000000100c3d960 DispatchTests`partial apply for closure #1 in closure #1 in BarrierTests.test() at <compiler-generated>:0
    frame #9: 0x0000000100c3d190 DispatchTests`thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
    frame #10: 0x0000000100c3da9c DispatchTests`partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
  thread #3, queue = 'com.apple.root.user-initiated-qos.cooperative'
    frame #0: 0x00000001a820dcd0 libsystem_kernel.dylib`__ulock_wait + 8
    frame #1: 0x00000001a809cdf0 libdispatch.dylib`_dlock_wait + 56
    frame #2: 0x00000001a809cba4 libdispatch.dylib`_dispatch_thread_event_wait_slow + 56
    frame #3: 0x00000001a80abc68 libdispatch.dylib`__DISPATCH_WAIT_FOR_QUEUE__ + 368
    frame #4: 0x00000001a80ab814 libdispatch.dylib`_dispatch_sync_f_slow + 148
    frame #5: 0x0000000100c3c394 DispatchTests`Subsystem.read(id=6, self=0x0000600000204620) at DeadlockTest.swift:32:15
    frame #6: 0x0000000100c3b074 DispatchTests`Subsystem.performWork(id=6, self=0x0000600000204620) at DeadlockTest.swift:21:16
    frame #7: 0x0000000100c3affc DispatchTests`closure #1 in closure #1 in BarrierTests.test(subsystem=0x0000600000204620, index=6) at DeadlockTest.swift:8:43
    frame #8: 0x0000000100c3d960 DispatchTests`partial apply for closure #1 in closure #1 in BarrierTests.test() at <compiler-generated>:0
    frame #9: 0x0000000100c3d190 DispatchTests`thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
    frame #10: 0x0000000100c3da9c DispatchTests`partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
  thread #4, queue = 'com.apple.root.user-initiated-qos.cooperative'
    frame #0: 0x00000001a820dcd0 libsystem_kernel.dylib`__ulock_wait + 8
    frame #1: 0x00000001a809cdf0 libdispatch.dylib`_dlock_wait + 56
    frame #2: 0x00000001a809cba4 libdispatch.dylib`_dispatch_thread_event_wait_slow + 56
    frame #3: 0x00000001a80abc68 libdispatch.dylib`__DISPATCH_WAIT_FOR_QUEUE__ + 368
    frame #4: 0x00000001a80ab814 libdispatch.dylib`_dispatch_sync_f_slow + 148
    frame #5: 0x0000000100c3c394 DispatchTests`Subsystem.read(id=9, self=0x0000600000204620) at DeadlockTest.swift:32:15
    frame #6: 0x0000000100c3b074 DispatchTests`Subsystem.performWork(id=9, self=0x0000600000204620) at DeadlockTest.swift:21:16
    frame #7: 0x0000000100c3affc DispatchTests`closure #1 in closure #1 in BarrierTests.test(subsystem=0x0000600000204620, index=9) at DeadlockTest.swift:8:43
    frame #8: 0x0000000100c3d960 DispatchTests`partial apply for closure #1 in closure #1 in BarrierTests.test() at <compiler-generated>:0
    frame #9: 0x0000000100c3d190 DispatchTests`thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
    frame #10: 0x0000000100c3da9c DispatchTests`partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
  thread #5, queue = 'com.apple.root.user-initiated-qos.cooperative'
    frame #0: 0x00000001a820dcd0 libsystem_kernel.dylib`__ulock_wait + 8
    frame #1: 0x00000001a809cdf0 libdispatch.dylib`_dlock_wait + 56
    frame #2: 0x00000001a809cba4 libdispatch.dylib`_dispatch_thread_event_wait_slow + 56
    frame #3: 0x00000001a80abc68 libdispatch.dylib`__DISPATCH_WAIT_FOR_QUEUE__ + 368
    frame #4: 0x00000001a80ab814 libdispatch.dylib`_dispatch_sync_f_slow + 148
    frame #5: 0x0000000100c3c394 DispatchTests`Subsystem.read(id=11, self=0x0000600000204620) at DeadlockTest.swift:32:15
    frame #6: 0x0000000100c3b074 DispatchTests`Subsystem.performWork(id=11, self=0x0000600000204620) at DeadlockTest.swift:21:16
    frame #7: 0x0000000100c3affc DispatchTests`closure #1 in closure #1 in BarrierTests.test(subsystem=0x0000600000204620, index=11) at DeadlockTest.swift:8:43
    frame #8: 0x0000000100c3d960 DispatchTests`partial apply for closure #1 in closure #1 in BarrierTests.test() at <compiler-generated>:0
    frame #9: 0x0000000100c3d190 DispatchTests`thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
    frame #10: 0x0000000100c3da9c DispatchTests`partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
  thread #6, queue = 'com.apple.root.user-initiated-qos.cooperative'
    frame #0: 0x00000001a820dcd0 libsystem_kernel.dylib`__ulock_wait + 8
    frame #1: 0x00000001a809cdf0 libdispatch.dylib`_dlock_wait + 56
    frame #2: 0x00000001a809cba4 libdispatch.dylib`_dispatch_thread_event_wait_slow + 56
    frame #3: 0x00000001a80abc68 libdispatch.dylib`__DISPATCH_WAIT_FOR_QUEUE__ + 368
    frame #4: 0x00000001a80ab814 libdispatch.dylib`_dispatch_sync_f_slow + 148
    frame #5: 0x0000000100c3c394 DispatchTests`Subsystem.read(id=10, self=0x0000600000204620) at DeadlockTest.swift:32:15
    frame #6: 0x0000000100c3b074 DispatchTests`Subsystem.performWork(id=10, self=0x0000600000204620) at DeadlockTest.swift:21:16
    frame #7: 0x0000000100c3affc DispatchTests`closure #1 in closure #1 in BarrierTests.test(subsystem=0x0000600000204620, index=10) at DeadlockTest.swift:8:43
    frame #8: 0x0000000100c3d960 DispatchTests`partial apply for closure #1 in closure #1 in BarrierTests.test() at <compiler-generated>:0
    frame #9: 0x0000000100c3d190 DispatchTests`thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
    frame #10: 0x0000000100c3da9c DispatchTests`partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
  thread #7, queue = 'com.apple.root.user-initiated-qos.cooperative'
    frame #0: 0x00000001a820dcd0 libsystem_kernel.dylib`__ulock_wait + 8
    frame #1: 0x00000001a809cdf0 libdispatch.dylib`_dlock_wait + 56
    frame #2: 0x00000001a809cba4 libdispatch.dylib`_dispatch_thread_event_wait_slow + 56
    frame #3: 0x00000001a80abc68 libdispatch.dylib`__DISPATCH_WAIT_FOR_QUEUE__ + 368
    frame #4: 0x00000001a80ab814 libdispatch.dylib`_dispatch_sync_f_slow + 148
    frame #5: 0x0000000100c3c394 DispatchTests`Subsystem.read(id=4, self=0x0000600000204620) at DeadlockTest.swift:32:15
    frame #6: 0x0000000100c3b074 DispatchTests`Subsystem.performWork(id=4, self=0x0000600000204620) at DeadlockTest.swift:21:16
    frame #7: 0x0000000100c3affc DispatchTests`closure #1 in closure #1 in BarrierTests.test(subsystem=0x0000600000204620, index=4) at DeadlockTest.swift:8:43
    frame #8: 0x0000000100c3d960 DispatchTests`partial apply for closure #1 in closure #1 in BarrierTests.test() at <compiler-generated>:0
    frame #9: 0x0000000100c3d190 DispatchTests`thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
    frame #10: 0x0000000100c3da9c DispatchTests`partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
  thread #8
    frame #0: 0x00000001a820dbc8 libsystem_kernel.dylib`__workq_kernreturn + 8
  thread #9
    frame #0: 0x00000001a820dbc8 libsystem_kernel.dylib`__workq_kernreturn + 8
  thread #10, queue = 'com.apple.root.user-initiated-qos.cooperative'
    frame #0: 0x00000001a820dcd0 libsystem_kernel.dylib`__ulock_wait + 8
    frame #1: 0x00000001a809cdf0 libdispatch.dylib`_dlock_wait + 56
    frame #2: 0x00000001a809cba4 libdispatch.dylib`_dispatch_thread_event_wait_slow + 56
    frame #3: 0x00000001a80abc68 libdispatch.dylib`__DISPATCH_WAIT_FOR_QUEUE__ + 368
    frame #4: 0x00000001a80ab814 libdispatch.dylib`_dispatch_sync_f_slow + 148
    frame #5: 0x0000000100c3c394 DispatchTests`Subsystem.read(id=8, self=0x0000600000204620) at DeadlockTest.swift:32:15
    frame #6: 0x0000000100c3b074 DispatchTests`Subsystem.performWork(id=8, self=0x0000600000204620) at DeadlockTest.swift:21:16
    frame #7: 0x0000000100c3affc DispatchTests`closure #1 in closure #1 in BarrierTests.test(subsystem=0x0000600000204620, index=8) at DeadlockTest.swift:8:43
    frame #8: 0x0000000100c3d960 DispatchTests`partial apply for closure #1 in closure #1 in BarrierTests.test() at <compiler-generated>:0
    frame #9: 0x0000000100c3d190 DispatchTests`thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
    frame #10: 0x0000000100c3da9c DispatchTests`partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
  thread #11, queue = 'com.apple.root.user-initiated-qos.cooperative'
    frame #0: 0x00000001a820dcd0 libsystem_kernel.dylib`__ulock_wait + 8
    frame #1: 0x00000001a809cdf0 libdispatch.dylib`_dlock_wait + 56
    frame #2: 0x00000001a809cba4 libdispatch.dylib`_dispatch_thread_event_wait_slow + 56
    frame #3: 0x00000001a80abc68 libdispatch.dylib`__DISPATCH_WAIT_FOR_QUEUE__ + 368
    frame #4: 0x00000001a80ab814 libdispatch.dylib`_dispatch_sync_f_slow + 148
    frame #5: 0x0000000100c3c394 DispatchTests`Subsystem.read(id=12, self=0x0000600000204620) at DeadlockTest.swift:32:15
    frame #6: 0x0000000100c3b074 DispatchTests`Subsystem.performWork(id=12, self=0x0000600000204620) at DeadlockTest.swift:21:16
    frame #7: 0x0000000100c3affc DispatchTests`closure #1 in closure #1 in BarrierTests.test(subsystem=0x0000600000204620, index=12) at DeadlockTest.swift:8:43
    frame #8: 0x0000000100c3d960 DispatchTests`partial apply for closure #1 in closure #1 in BarrierTests.test() at <compiler-generated>:0
    frame #9: 0x0000000100c3d190 DispatchTests`thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
    frame #10: 0x0000000100c3da9c DispatchTests`partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0

For a concurrent queue:

queue.async(flags: .barrier) { print("1") }
queue.async(flags: .barrier) { print("2") }
queue.async(flags: .barrier) { print("3") }

will this always perform in order? I'd say no, but I could be mistaken.

As far as I understand barriers, they would be performed in order if they are dispatched from a single thread in that order.
A barrier basically guarantees that everything dispatched before it has finished executing before the barrier block runs, and that nothing dispatched after it starts until the barrier block has finished.
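
A small sketch of that single-thread case, for anyone who wants to check it. The function name `barrierOrder`, the `OrderLog` box, and the queue label are made up for this example. It submits the three barrier blocks from one thread and records the order in which they run.

```swift
import Dispatch

// Small reference box so the blocks can record results without capturing
// a mutable local variable. Only barrier blocks touch `values`, and
// barrier blocks on the same queue never overlap.
final class OrderLog: @unchecked Sendable {
    var values: [Int] = []
}

/// Submits three barrier blocks from the calling thread and returns
/// the order in which they ran.
func barrierOrder() -> [Int] {
    let queue = DispatchQueue(label: "barrier-order-demo", attributes: .concurrent)
    let log = OrderLog()

    for n in 1...3 {
        queue.async(flags: .barrier) { log.values.append(n) }
    }

    // A final synchronous barrier returns only after everything
    // submitted before it has finished. Call this from a thread
    // that is not itself running on `queue`.
    queue.sync(flags: .barrier) {}
    return log.values
}

print(barrierOrder())   // [1, 2, 3]
```

This says nothing about barriers submitted from several threads at once. In that case the order depends on which submission reaches the queue first.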

FWIW I think the deadlock is even more surprising when you add a sleep before the queue.sync.
The barrier block is still not executed even though it has the concurrent queue completely to itself.

    func read(id: Int) {
        usleep(1_000_000)
        print("schedule read \(id)")
        queue.sync { print(" execute read \(id)") }
        print("schedule read \(id) done")
    }

The default executor for tasks does not overcommit, so if you’re using a system that relies on overcommit for progress, and you cannot rewrite it, then you need to be very careful to only call into it from a thread that is definitely from an overcommitting executor.
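
One way to act on this without touching the subsystem is to make the blocking call from a thread that does not belong to the cooperative pool, and let the task suspend while it waits. The sketch below is one possible shape, not an established recipe. The helper name `runOnDedicatedThread` is hypothetical, and it uses a plain Foundation `Thread` per call because that thread is certainly not one of the pool's threads. Creating a thread per call is expensive, so treat it as an illustration of the idea rather than something to ship as is.

```swift
import Foundation

// Hypothetical helper, not part of the original subsystem: runs blocking
// `work` on a freshly created thread and suspends (rather than blocks)
// the calling task until the result is available.
func runOnDedicatedThread<T: Sendable>(
    _ work: @escaping @Sendable () -> T
) async -> T {
    await withCheckedContinuation { continuation in
        let thread = Thread {
            // Blocking in here ties up only this dedicated thread,
            // never one of the cooperative pool's threads.
            continuation.resume(returning: work())
        }
        thread.start()
    }
}
```

In the test from the first post, the body of group.addTask would then call `await runOnDedicatedThread { subsystem.performWork(id: index) }`, so a read that waits in queue.sync should no longer hold a cooperative thread while the barrier block is pending.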

For clarification: the cooperative queue has all its "slots" filled, but my concurrent queue has only the barrier block dispatched to it and can't make progress anyway. Is a default concurrent queue non-overcommitting in libdispatch? Because I remember having thread explosion easily when dispatching to a concurrent queue.
Also if the concurrent queue is non-overcommitting: does it share the thread limit with the cooperative queue?

DispatchQueue is over-committing, but if you’re enqueuing to a queue with queue.sync, you’re also tying up a thread that may be from an executor that is not. sync has the isolation properties of the queue but doesn’t change the underlying reality of what thread you’re on.

OK, but what about the async barrier? It does not block the non-overcommitting queue, and yet it doesn't run.
