Intermittent crash in concurrency runtime with message: mutex lock failed: Invalid argument

At present, the concurrency runtime offers no easy way to manage top-level tasks, so I have created CancellationSource to manage their cancellation.

The working principle for this type is pretty simple:

  1. Create a task group to add submitted tasks for cancellation and create a continuation to control the lifetime/cancellation of the group.
  2. Use withTaskCancellationHandler to register submitted tasks for cancellation.
  3. Allow linking CancellationSources with other CancellationSources by linking the cancellation of their task groups.
  4. To provide a synchronous initialization method, the task group and continuation properties are initialized as part of a top-level task.
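
To make step 2 concrete, here is a minimal sketch of forwarding cancellation to an unstructured task with withTaskCancellationHandler. This is not the actual CancellationSource code: the names awaitForwardingCancellation and demoForwardedCancellation are hypothetical and exist only for illustration, and the sketch leaves out the task group, the continuation, and the linking between sources.

```swift
// Hypothetical helper, not part of CancellationSource itself.
// Waits for an unstructured task and forwards cancellation of the
// waiting task to it.
func awaitForwardingCancellation<Success: Sendable>(
    of task: Task<Success, Error>
) async throws -> Success {
    try await withTaskCancellationHandler {
        // Suspends until the registered task finishes or throws.
        try await task.value
    } onCancel: {
        // Invoked when the waiting task is cancelled (immediately, if it
        // was already cancelled before the handler was installed).
        task.cancel()
    }
}

// Hypothetical demo: cancelling `waiter` should also cancel `work`.
func demoForwardedCancellation() async -> Bool {
    let work = Task<Int, Error> {
        // Task.sleep throws CancellationError once the task is cancelled.
        try await Task.sleep(nanoseconds: 10_000_000_000)
        return 42
    }
    let waiter = Task {
        try await awaitForwardingCancellation(of: work)
    }
    waiter.cancel()
    _ = await waiter.result   // wait until the waiter has unwound
    return work.isCancelled
}
```

In this sketch, cancelling the waiting task is the only way the cancellation reaches `work`, which is roughly the relationship a submitted task would have with the group child that waits on it.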

When I run the CancellationSourceTests test case, I randomly get a crash in the concurrency runtime:

* thread #4, queue = 'com.apple.root.user-initiated-qos.cooperative', stop reason = signal SIGABRT
  * frame #0: 0x0000000193f221b0 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000193f58cec libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x0000000193e922c8 libsystem_c.dylib`abort + 180
    frame #3: 0x0000000193f12b18 libc++abi.dylib`abort_message + 132
    frame #4: 0x0000000193f029f4 libc++abi.dylib`demangling_terminate_handler() + 312
    frame #5: 0x0000000193c07774 libobjc.A.dylib`_objc_terminate() + 160
    frame #6: 0x0000000193f11eb4 libc++abi.dylib`std::__terminate(void (*)()) + 20
    frame #7: 0x0000000193f14c2c libc++abi.dylib`__cxxabiv1::failed_throw(__cxxabiv1::__cxa_exception*) + 36
    frame #8: 0x0000000193f14bd8 libc++abi.dylib`__cxa_throw + 140
    frame #9: 0x0000000193eb31f4 libc++.1.dylib`std::__1::__throw_system_error(int, char const*) + 100
    frame #10: 0x0000000193ea8a00 libc++.1.dylib`std::__1::mutex::lock() + 40
    frame #11: 0x000000021d67b02c libswift_Concurrency.dylib`swift::TaskGroup::offer(swift::AsyncTask*, swift::AsyncContext*) + 60
    frame #12: 0x000000021d676120 libswift_Concurrency.dylib`swift::AsyncTask::completeFuture(swift::AsyncContext*) + 136
    frame #13: 0x000000021d6789b0 libswift_Concurrency.dylib`completeTaskWithClosure(swift::AsyncContext*, swift::SwiftError*) + 252
0   ???                                 0x00000001007a872c 0x0 + 4302997292,
1   xctest                              0x0000000100005548 main + 0,
2   libsystem_c.dylib                   0x0000000193e922c8 abort + 180,
3   libc++abi.dylib                     0x0000000193f12b18 _ZN10__cxxabiv130__aligned_malloc_with_fallbackEm + 0,
4   libc++abi.dylib                     0x0000000193f029f4 _ZL28demangling_terminate_handlerv + 312,
5   libobjc.A.dylib                     0x0000000193c07774 _ZL15_objc_terminatev + 160,
6   libc++abi.dylib                     0x0000000193f11eb4 _ZSt11__terminatePFvvE + 20,
7   libc++abi.dylib                     0x0000000193f14c2c __cxa_get_exception_ptr + 0,
8   libc++abi.dylib                     0x0000000193f14bd8 _ZN10__cxxabiv1L22exception_cleanup_funcE19_Unwind_Reason_CodeP17_Unwind_Exception + 0,
9   libc++.1.dylib                      0x0000000193eb31f4 _ZNSt3__120__throw_system_errorEiPKc + 100,
10  libc++.1.dylib                      0x0000000193ea8a00 _ZNSt3__15mutex8try_lockEv + 0,
11  libswift_Concurrency.dylib          0x000000021d67b02c _ZN5swift9TaskGroup5offerEPNS_9AsyncTaskEPNS_12AsyncContextE + 60,
12  libswift_Concurrency.dylib          0x000000021d676120 _ZN5swift9AsyncTask14completeFutureEPNS_12AsyncContextE + 136,
13  libswift_Concurrency.dylib          0x000000021d6789b0 _ZL23completeTaskWithClosurePN5swift12AsyncContextEPNS_10SwiftErrorE + 252,
14  libswift_Concurrency.dylib          0x000000021d6714c4 _ZN5swift34runJobInEstablishedExecutorContextEPNS_3JobE + 376,
15  libswift_Concurrency.dylib          0x000000021d672408 _ZL17swift_job_runImplPN5swift3JobENS_11ExecutorRefE + 72,
16  libdispatch.dylib                   0x0000000193de3f94 _dispatch_root_queue_drain + 396,
17  libdispatch.dylib                   0x0000000193de47c0 _dispatch_worker_thread2 + 164,
18  libsystem_pthread.dylib             0x0000000193f550c4 _pthread_wqthread + 228,
19  libsystem_pthread.dylib             0x0000000193f53e20 start_wqthread + 8

The crash is intermittent and seems to happen between the test methods testTaskCancellationWithLinkedSource and testTaskCancellationWithMultipleLinkedSources:

libc++abi: terminating with uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument

Since the mutation happens only once (no concurrent mutation), and all access to the data is guarded by the completion of the initialization task, I am quite puzzled about the cause of the crash. I would appreciate any help I can get.
