Advice on how to debug this crash?

Crashes like this have bubbled up in our crash reporting telemetry. The app consistently crashes while managing the ownership of DataResponse values (retaining them, according to the stack trace below). Could someone please suggest what I should look for?

Crashed: com.apple.root.user-initiated-qos.cooperative
0  libobjc.A.dylib                0x2e5c objc_retain_x0 + 16
1  libobjc.A.dylib                0x2e5c objc_retain + 16
2  libswiftCore.dylib             0x41ea34 swift_bridgeObjectRetain + 44
3  App                         0x8002ec outlined retain of DataResponse + 4304962284 (<compiler-generated>:4304962284)
4  App                         0x7fd348 closure #1 in closure #1 in DataService.init(apiClient:dataStorage:legacyIngestionHook:) + 4304950088 (DataService+App.swift:4304950088)
5  libswift_Concurrency.dylib     0x40f6c swift::runJobInEstablishedExecutorContext(swift::Job*) + 420
6  libswift_Concurrency.dylib     0x41e78 swift_job_runImpl(swift::Job*, swift::ExecutorRef) + 72
7  libdispatch.dylib              0x15a6c _dispatch_root_queue_drain + 396
8  libdispatch.dylib              0x16284 _dispatch_worker_thread2 + 164
9  libsystem_pthread.dylib        0xdbc _pthread_wqthread + 228
10 libsystem_pthread.dylib        0xb98 start_wqthread + 8

What follows is a high-level explanation of what happens in the closures in frame 4. The responsibility of the DataService type is to encapsulate the feature's dependencies and side effects. The implementation of those behaviours is passed in at initialisation (for example, initialising a "live" data service would pass in closures that use URLSession, while a "test" data service would pass in closures that return mock data).

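public struct DataService {
    // Stored closure that builds the publisher for a given date and calendar.
    let getDataPublisherBuilder: (Date, Calendar) -> AnyPublisher<ReturnValue, Error>

    public init(
        getDataPublisherBuilder: @escaping (Date, Calendar) -> AnyPublisher<ReturnValue, Error>
    ) {
        self.getDataPublisherBuilder = getDataPublisherBuilder
    }
}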

// e.g.:
extension DataService {
   static var live: Self {
      self.init(getDataPublisherBuilder: { ... })
   }
}

Notably, the live instance of the DataService bridges between some Combine and Swift Concurrency code using this type:

public struct SendablePublisher<Output, Failure: Error>: Publisher {
    let upstream: AnyPublisher<Output, Failure>
    
    public init(
        fullFill: @Sendable @escaping () async throws -> Output
    ) where Failure == Error {
        var task: Task<Void, Never>?
        upstream = Deferred {
            Future { promise in
                task = Task {
                    do {
                        let result = try await fullFill()
                        try Task.checkCancellation()
                        promise(.success(result))
                    } catch {
                        promise(.failure(error))
                    }
                }
            }
        }
        .handleEvents(receiveCancel: { task?.cancel() })
        .eraseToAnyPublisher()
    }
    
    public func receive<S>(subscriber: S) where S : Subscriber, Failure == S.Failure, Output == S.Input {
        upstream.subscribe(subscriber)
    }
}

The code within the sendable publisher has multiple asynchronous responsibilities. First, it makes the network request. Second, it computes a DataResponse value. Then, it awaits the completion of a method isolated on the main actor that also claims ownership of the freshly computed DataResponse value. Finally, it does some fairly innocent computations to determine the return value:

SendablePublisher { () -> ReturnValue in
   // prepare the network request body and await the response
   let data = try await apiClient.request(body)
   // decode an instance of `DataResponse` 
   let response = try JSONDecoder().decode(DataResponse.self, from: data)
   // send to reference type
   await dataStorage.didReceiveDataResponse(response)
   // other innocent processing occurs here to calculate the return value
   return innocentReturnValue
}

The DataStorage type is a class:

final class DataStorage {
    
    private(set) var lastResponse: DataResponse?
    
    @MainActor func didReceiveDataResponse(_ response: DataResponse) {
        self.lastResponse = response
    }
}

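Even though DataResponse is a struct, it has array properties, and as I understand it each array is backed by a reference-counted heap buffer. That would explain why copying a DataResponse involves retain calls like the ones in the stack trace.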
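To illustrate the point about arrays, here is a minimal sketch. The `DataResponse` below is a hypothetical stand-in with an invented `items` field, and `sharesStorage` is a helper made up for this example. It shows that copying a struct does not copy its array contents; the copy shares (and retains) the same buffer until one side is mutated.

```swift
// Hypothetical stand-in for the real DataResponse; the field name is invented.
struct DataResponse {
    var items: [Int]
}

// Hypothetical helper: true when both arrays currently point at the same buffer.
func sharesStorage(_ lhs: [Int], _ rhs: [Int]) -> Bool {
    lhs.withUnsafeBufferPointer { l in
        rhs.withUnsafeBufferPointer { r in
            l.baseAddress == r.baseAddress
        }
    }
}

let original = DataResponse(items: [1, 2, 3])
var copy = original  // copies the struct; the array buffer is shared, not duplicated

let sharedBeforeMutation = sharesStorage(original.items, copy.items)
copy.items.append(4)  // copy-on-write: `copy` gets its own buffer here
let sharedAfterMutation = sharesStorage(original.items, copy.items)

print("shared before mutation:", sharedBeforeMutation)
print("shared after mutation:", sharedAfterMutation)
```

The shared buffer is what gets retained when the struct is copied, so a retain on a buffer that another thread is releasing at the same moment is one plausible way to end up in objc_retain with a bad pointer.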

88% of the crashes occur on iOS 16. The rest are on iOS 15.

I would appreciate any suggestions. Thanks for your time reading this far!

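My current suspicion is that there is a race condition around the ownership management of values stored in the lastResponse property on the DataStorage type (one thread reads the property while, at around the same time, another thread writes to it). Only the didReceiveDataResponse method is marked @MainActor, so nothing stops the property itself from being read on another thread.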

You can verify this hypothesis:

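    // Requires `import Dispatch` (or Foundation).
    private var _realLastResponse: DataResponse? // don't use directly

    var lastResponse: DataResponse? {
        get {
            // Traps if the getter is called off the expected queue.
            dispatchPrecondition(condition: .onQueue(.main)) // or whatever the queue should be
            return _realLastResponse
        }
        set {
            // Traps if the setter is called off the expected queue.
            dispatchPrecondition(condition: .onQueue(.main)) // or whatever the queue should be
            _realLastResponse = newValue
        }
    }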
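
If the precondition does trip, one possible way to close the hole is to isolate the whole class to the main actor rather than just the one method. This is only a sketch under assumptions: the `DataResponse` here is a hypothetical stand-in with an invented `items` field, and the real DataStorage may have other members that need attention.

```swift
// Hypothetical stand-in for the real DataResponse type.
struct DataResponse: Sendable, Equatable {
    var items: [String]
}

// Isolating the class itself (not only one method) means every access to
// `lastResponse` goes through the main actor, so the compiler rejects
// synchronous reads from non-isolated code.
@MainActor
final class DataStorage {
    private(set) var lastResponse: DataResponse?

    func didReceiveDataResponse(_ response: DataResponse) {
        lastResponse = response
    }
}

// Example caller running off the main actor: each access needs `await`.
func storeAndReadBack(_ response: DataResponse, in storage: DataStorage) async -> DataResponse? {
    await storage.didReceiveDataResponse(response)
    return await storage.lastResponse
}
```

With this shape, any existing code that reads lastResponse from a background context should stop compiling until it adds `await`, which also tells you where the unsynchronised reads were.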