Swift 6 inter-actor or actor-to-class communication

The code below models two camera tasks - configuring the capture session and receiving/processing frames from the camera in real time - using Swift concurrency. Both features are implemented as actors, and they sometimes need to talk to one another. This is where I have a few doubts:

  1. The function configureSession fails to build because I access captureManager.videoOutput from the actor CaptureSession. I could fix it by making configureSession async and wrapping all the code in a Task closure, but as the code grows I will need to take steps to guard against reentrancy. Is that the only way two actors can talk, or is there a better way out here?
  2. There is still a warning in the delegate callback method: "Sending 'sampleBuffer' risks causing data races. Task-isolated 'sampleBuffer' is captured by a actor-isolated closure. actor-isolated uses in closure may race against later nonisolated uses." Do I need to forcefully wrap it in an @unchecked Sendable struct, or is there a better way out there?
@preconcurrency import AVFoundation

actor CaptureManager: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    
    let captureQueue = DispatchSerialQueue(label: "Output Queue")
    let videoOutput = AVCaptureVideoDataOutput()
    
    // Sets the session queue as the actor's executor.
    nonisolated var unownedExecutor: UnownedSerialExecutor {
        captureQueue.asUnownedSerialExecutor()
    }
    
    override init() {
        super.init()
        
        videoOutput.setSampleBufferDelegate(self, queue: captureQueue) // Set the delegate to receive video frames from camera
    }
    
    // Delegate method for receiving video frames
    nonisolated func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        self.assumeIsolated { manager in
            if let videoDataOutput = output as? AVCaptureVideoDataOutput {
                manager.processVideoSampleBuffer(sampleBuffer, fromOutput: videoDataOutput)
            }
        }
    }
    
    func processVideoSampleBuffer(_ sampleBuffer: CMSampleBuffer, fromOutput: AVCaptureVideoDataOutput) {
        // Do the processing
    }
}

actor CaptureSession {
    private let session = AVCaptureSession()
    private let captureManager = CaptureManager()
    
    init() {
        Task {
            await configureSession()
        }
    }
    
    func configureSession() {
        let videoOutput = captureManager.videoOutput // Build fails here
        session.addOutput(videoOutput)
    }
}

You can introduce a global actor and isolate both of your current actors to it:

@globalActor 
actor CameraActor: GlobalActor {
    static let shared = CameraActor()
}

@CameraActor
final class CaptureManager {
}

@CameraActor
final class CaptureSession {
}

Then they form a subsystem for camera operations isolated in one concurrency domain.
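
As a sketch of what that buys you (assuming CaptureManager is also annotated with @CameraActor and keeps its videoOutput property): once both types share the CameraActor isolation, configureSession can read that property synchronously, with no await and no cross-actor sending of the non-Sendable AVCaptureVideoDataOutput.

@preconcurrency import AVFoundation

@CameraActor
final class CaptureSession {
    private let session = AVCaptureSession()
    private let captureManager = CaptureManager()

    func configureSession() {
        // Same isolation domain, so this plain property access now compiles.
        session.addOutput(captureManager.videoOutput)
    }
}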

That is one option, but I don't want to mix the session configuration/update methods (CaptureSession) with the AVCaptureVideo/AudioDataOutput delegates (CaptureManager). Apple's sample code also follows the same approach, but I am not too sure it is reasonable to have just one queue for everything.

Secondly, can we have a custom executor for the global actor?

Why so? You clearly want to do some related processing along with getting the frames, so why not keep them together? The overhead of passing data across an isolation domain is probably much larger than any of the work involved.

If you want to pass data across anyway for some reason (there are valid cases, but from my experience your real-time processing is most likely not one of them), then you need to architect it in terms of messaging via some Sendable type between the two actors, not direct property access (which, in either the actor model or OOP, is a sign that this is probably part of one system). A global actor is the most natural way to express that belonging.
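
If the actors do stay separate, the messaging could look roughly like this sketch, where a small Sendable value crosses the boundary instead of one actor reaching into the other's properties (FrameInfo, FrameProducer and FrameConsumer are made-up names, not anything from the code above):

// A value type whose stored properties are all Sendable, so it can safely
// cross actor boundaries.
struct FrameInfo: Sendable {
    let presentationTime: Double
    let width: Int
    let height: Int
}

actor FrameProducer {
    let consumer: FrameConsumer

    init(consumer: FrameConsumer) {
        self.consumer = consumer
    }

    func didCaptureFrame(width: Int, height: Int, at time: Double) {
        let info = FrameInfo(presentationTime: time, width: width, height: height)
        // Only the Sendable message is handed across; no shared mutable state.
        Task { await consumer.consume(info) }
    }
}

actor FrameConsumer {
    private var frameCount = 0

    func consume(_ info: FrameInfo) {
        frameCount += 1
        // Downstream work with the value goes here.
    }
}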

Yes, you can give a global actor a custom executor, which in this case would be the session delegate queue I suppose, and have it work that way. I would strongly suggest this approach.
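
A rough sketch of giving the CameraActor above a custom executor, assuming iOS 17 / macOS 14 or later (where DispatchSerialQueue conforms to SerialExecutor); the queue label is made up, and in practice it would be the same serial queue you pass to setSampleBufferDelegate:

import Dispatch

@globalActor
actor CameraActor: GlobalActor {
    static let shared = CameraActor()

    // The serial queue that also serves as the capture delegate queue.
    static let queue = DispatchSerialQueue(label: "Camera Queue")

    // Run everything isolated to CameraActor on that queue.
    nonisolated var unownedExecutor: UnownedSerialExecutor {
        Self.queue.asUnownedSerialExecutor()
    }
}

With the delegate queue and the actor sharing one serial executor, the delegate callbacks already run on CameraActor's executor, mirroring the unownedExecutor trick in the original CaptureManager.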

"Why so" is interesting and I have asked the same question to DTS. In the early sample codes by Apple (pre 2014), they used to have one queue for session/device configuration, one queue for video frames delivery, one queue for audio frames delivery, and one queue for asset writer based video recording. Post 2019, AVMultiCamPIP sample code used one queue for session configuration and other for samples delivery from multiple cameras, microphone, as well as Asset Writer. A single queue for everything seems tempting but will require exhaustive testing in all video formats (resolutions, fps, HDR, etc.) and all iOS device models. I am wondering if apart from AVCaptureSession.startRunning() call, there are any other blocking calls.

I tried the assumeIsolated call in the video capture output delegate, but it is not available. There are other issues with this approach as well, such as a MainActor-isolated Observable model which keeps a reference to CameraManager; the build fails there as well.

CameraActor.assumeIsolated { manager in
    // Not available here - this fails to build.
}

Is it necessary for videoOutput to be a member of CaptureManager? It's not used aside from setSampleBufferDelegate, which only hands it self and captureQueue, both Sendable. This means that if videoOutput is created locally via a method or property, instead of being stored as a member, it can be returned as sending.

In captureOutput, the delegate actually receives a reference to videoOutput in the output parameter, so there's no need to hold the reference as a member. This is somewhat cheating: the compiler can't tell that this is happening, so it would still think that videoOutput is isolated to CaptureSession. However, that is simply a side effect of the parameter of AVCaptureSession.addOutput not being marked as sending, which it would be if the library conformed to Swift 6 concurrency (since it is sending the output to the associated dispatch queue).
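
As a rough sketch of that suggestion (makeVideoOutput is a made-up name, and whether region-based isolation accepts the transfer can depend on the exact compiler version): the output is created locally, wired up to the already-Sendable self and captureQueue, and then handed out of the actor as a sending result.

@preconcurrency import AVFoundation

actor CaptureManager: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let captureQueue = DispatchSerialQueue(label: "Output Queue")

    nonisolated var unownedExecutor: UnownedSerialExecutor {
        captureQueue.asUnownedSerialExecutor()
    }

    // Build the output locally instead of storing it; the local value only
    // ever touches Sendable state (self and captureQueue), so it can leave
    // the actor as `sending`.
    func makeVideoOutput() -> sending AVCaptureVideoDataOutput {
        let output = AVCaptureVideoDataOutput()
        output.setSampleBufferDelegate(self, queue: captureQueue)
        return output
    }

    // Delegate methods as in the original code.
}

CaptureSession would then call let videoOutput = await captureManager.makeVideoOutput() inside an async configureSession and pass the result straight to session.addOutput.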

A lot of things have changed since 2014, so it’s natural for the approach to evolve. Such a concurrent design might have its justifications, but everything boils down to the details of the processing.

I would say that, at the bare minimum, you might need to process frames in a separate isolation. This has worked pretty well in my experience, running some heavy processing that way.

By contrast, an over-engineered solution with several actors ended up performing worse for me, since it created a lot of friction and unnecessary work. I ended up isolating the camera on a global actor and got an order of magnitude better performance for this part.

Not because actors were bad, don’t get me wrong - my design was bad, because initially I thought I needed to split all this work across different isolations so that nothing would block. Funnily enough, it worked “too well”: the device was constantly performing work without a break, because apart from the required work it was also doing a lot of expensive administration.

I have a hard time understanding what you mean by many of these things. I saw the parallel topic on assumeIsolated you created - that’s one of the downsides currently, as all the nice features are limited by the runtime.

As for the main actor, the radical way is to make everything main-actor isolated - it works well for simple camera cases (I think I even saw an Apple example doing so once). Alternatively, and probably better in the long term, is to avoid UI state there - it’s just not the concern of a camera manager, since you are doing a lot of work on its own queue and the UI lives somewhat separately.

So I have created a prototype where the model & UI live on the main thread, but the camera session configuration & video/audio frame processing live on a global actor. Video frames are processed on the global actor and then handed over to the recorder (which also runs on the global actor), and also yielded to the preview view in the UI on the main actor. The downside is that I don't know how much processing counts as "heavy processing", and, if the processing does become heavy, how parallelism/isolation will help (since the processing needs to keep up with the frame rate).
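
For what it's worth, a minimal sketch of that handoff, reusing the CameraActor global actor from earlier in the thread (FramePipeline, ProcessedFrame and PreviewModel are made-up names, and the frame payload is reduced to a trivially Sendable value; in a real app it would be whatever representation your preview consumes): the heavy work stays on the global actor, and only finished, Sendable results are streamed to the main-actor model.

import Observation

// A trivially Sendable stand-in for whatever the preview actually needs.
struct ProcessedFrame: Sendable {
    let presentationTime: Double
}

@CameraActor
final class FramePipeline {
    private let (stream, continuation) = AsyncStream.makeStream(of: ProcessedFrame.self)

    var frames: AsyncStream<ProcessedFrame> { stream }

    func process(at time: Double) {
        // Heavy per-frame work happens here, on CameraActor's executor...
        continuation.yield(ProcessedFrame(presentationTime: time)) // ...only the result leaves.
    }
}

@MainActor @Observable
final class PreviewModel {
    var latestFrame: ProcessedFrame?

    func attach(to pipeline: FramePipeline) {
        Task {
            for await frame in await pipeline.frames {
                latestFrame = frame // UI state stays on the main actor.
            }
        }
    }
}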