Simple state protection via Actor vs DispatchQueue.sync?

Actors are great for protecting state, but (generally) require the caller to await, which may require them to spawn a task to call the actor code in question. Sometimes this seems like overkill, and I think "a dispatch-queue/lock to just protect this one small code section." Here is a very specific example:

class MyDocument {
    var queryTask: Task<Void,Never>?

    func done() {
        queryTask = nil
    }

    func launchQuery() {
        if queryTask == nil {
            queryTask = Task<Void,Never>.detached {
                ... // run some code
            }
        }
    }
}

If launchQuery() can be called from anywhere, then accessing queryTask needs protection.
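To make that concrete, here is a minimal sketch (the shape is hypothetical, including the `isRunning` accessor) of guarding the check-and-set of queryTask with an NSLock so two concurrent calls can't both start a query:

```swift
import Foundation

// Sketch: lock-protected version of the snippet above. `@unchecked
// Sendable` is our own assertion that the lock makes this safe.
final class MyDocument: @unchecked Sendable {
    private let lock = NSLock()
    private var queryTask: Task<Void, Never>?

    // Illustrative accessor, not in the original snippet.
    var isRunning: Bool {
        lock.lock()
        defer { lock.unlock() }
        return queryTask != nil
    }

    func done() {
        lock.lock()
        queryTask = nil
        lock.unlock()
    }

    func launchQuery() {
        lock.lock()
        defer { lock.unlock() }
        guard queryTask == nil else { return }
        queryTask = Task.detached {
            // ... run the query ...
            self.done()
        }
    }
}
```

Note that done() takes the same lock, so clearing the task and the nil-check in launchQuery() can never interleave.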

What do people feel the best approach is these days? If I did make MyDocument an actor, then a lot of code would be:

    _ = Task {
        await myDocument.launchQuery()
    }

which seems silly, though I suppose I could make this part of the MyDocument API (a nonisolated function).
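The nonisolated-wrapper idea might look like this sketch (names and the `isRunning` accessor are illustrative): the actor keeps its isolation, but exposes a synchronous entry point that spawns the Task internally so callers don't have to:

```swift
import Foundation

actor MyDocument {
    private var queryTask: Task<Void, Never>?

    // Illustrative accessor for observing state from outside.
    var isRunning: Bool { queryTask != nil }

    private func startQueryIfNeeded() {
        guard queryTask == nil else { return }
        // Task {} inside an actor inherits its isolation, so the
        // property can be touched directly when the work finishes.
        queryTask = Task {
            // ... run some code ...
            self.queryTask = nil
        }
    }

    // Synchronous and callable from anywhere; the actor hop happens inside.
    nonisolated func launchQuery() {
        Task { await self.startQueryIfNeeded() }
    }
}
```

Callers then just write `myDocument.launchQuery()` with no await, at the cost of one Task per call.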

Thoughts? Are tasks so lightweight I just shouldn't worry, and use an actor as my state protection mechanism instead of DispatchQueue.sync (or even NSLock, which I've actually never employed)?

Ultimately it comes down to the external API you want or need to present. If your external API is already async in some way, using an actor can make a lot of sense and is easy to integrate. However, I've found that the more sync API you need to expose, the harder it is to justify an actor, since you need to break out of the isolation more and more, thereby reducing the value of the actor in the first place.

As an aside, DispatchQueue.sync is essentially the slowest lock-like mechanism on the platform, so it really shouldn't be used unless there's other Dispatch usage you're interoperating with. You can use OSAllocatedUnfairLock if your deployment target is recent enough, or create an interface on top of it and NSLock to get a scoped API that does what you need. And you really never need to worry about creating a single Task; it's many Tasks that may be an issue.
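A scoped interface of the kind described might look like this sketch (the `Protected` name is illustrative; it is backed by NSLock here, but on recent OSes you could back it with OSAllocatedUnfairLock, whose built-in `withLock` already has this shape):

```swift
import Foundation

// A minimal scoped-lock wrapper: the protected value can only be
// touched inside withLock, so unbalanced lock/unlock is impossible.
final class Protected<Value>: @unchecked Sendable {
    private let lock = NSLock()
    private var value: Value

    init(_ value: Value) {
        self.value = value
    }

    func withLock<R>(_ body: (inout Value) throws -> R) rethrows -> R {
        lock.lock()
        defer { lock.unlock() }
        return try body(&value)
    }
}

// Usage: the queryTask check-and-set from earlier becomes one scoped call.
let counter = Protected(0)
counter.withLock { $0 += 1 }
```

The scoped closure also keeps the critical section visually small, which helps when auditing for work that shouldn't run under the lock.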


My platforms are iPadOS, macOS, and tvOS, but reasonably “modern” (i.e. I only need to support macOS/iPadOS/tvOS as of, say, 1-2 years ago.)

So yes, the external API I want to present is much more of a sync API. Would you favor OSAllocatedUnfairLock, or just adding a routine inside my API that spawns one extra task so it can await? If there is not much performance/cost difference, I think I would favor doing it purely through “Task” rather than mixing constructs — but only if the performance/memory hit is comparable.

It really depends on what you're doing. For a mostly sync API then yes, I'd just put your mutable state in a lock and then throw async work off with a Task. For instance, I recently wrapped an image caching system to ensure there are no duplicate downloads. It's an async API but it needs to synchronously update state so that other callers are properly blocked or receive the downloaded image, so it combines locks and async work.

public func image(from url: URL) async throws -> UIImage {
    enum Output {
        case image(UIImage)
        case task(DataTask<UIImage>)
    }

    let output: Output = inFlightDownloads.withLock { downloads in
        if let image = self.cachedImage(at: url) {
            return .image(image)
        } else {
            if let task = downloads[url] {
                return .task(task)
            } else {
                let task = imageDownloader.imageDownloadTask(from: url)
                downloads[url] = task
                return .task(task)
            }
        }
    }

    switch output {
    case let .image(image):
        return image
    case let .task(dataTask):
        let image = try await dataTask.value
        cacheImageIfNecessary(image, from: url)
        return image
    }
}

So in this case I can lock the critical state, then either return synchronously or await outside the lock to cache and return later (where cacheImageIfNecessary also takes the lock to ensure only one call tries to cache, since it reads and writes to disk within the lock).

I also needed a sync method to start a prefetch of an image I don't care about immediately, so it uses a similar approach but moves the await (and subsequent caching) out into a Task.

public func prefetchImage(from url: URL) {
    let task: DataTask<UIImage>? = inFlightDownloads.withLock { downloads in
        // If the image is already cached, no need to prefetch.
        guard cacheManager.object(forKey: url.absoluteString) == nil else { return nil }

        // If a download is in progress, no need to start another.
        guard downloads[url] == nil else { return nil }

        let task = imageDownloader.imageDownloadTask(from: url)
        downloads[url] = task
        return task
    }

    guard let task else { return }

    Task {
        let image = try await task.value
        cacheImageIfNecessary(image, from: url)
    }
}

So in reality you'll probably need to combine approaches to get the API you really want.

Someone did some benchmarking here: Performance: Actor vs queue vs lock, but it seems the unfair lock was not used correctly in that article.

What was wrong with their use of the unfair lock? I looked but didn’t catch it.

I’m surprised that Actor is 10x slower than DispatchQueue, frankly. But I am excited that the unfair lock appears to be so fast, given the newer closure-based syntax (if the speed holds up), because protecting a single variable or 1-2 lines of code this way is very easy to reason about when you know you can’t deadlock.

I’ll run my own benchmarks, though, to see how the new closure syntax holds up.

Not terribly surprising, really. The actor version is asynchronous, so it’s doing much more work than the queue version. Last time I measured, actor calls were roughly in the middle between DispatchQueue.sync and async, but I was only testing one specific scenario and it was about a year ago, so I wouldn’t extrapolate too much from that.

My preliminary numbers show DispatchQueue much slower than actors, actually. Unfair locks beat actors by about 8x. I’ll run some more numbers. I’m only testing uncontended access, which is actually the common case (for me).

On macOS, with two detached tasks trying to increment a trivial counter (i.e. contention), the relative timing is:

   unfair locks:     1.0
   actors:           6.66
   dispatch queue:   16.2

without contention:

   unfair locks:       1.0
   actors:             5.125
   dispatch queue:     15.75

So, actually, the same relative performance, more or less. On iPadOS (with an iPad Pro) the uncontended case shows the same relative speeds; however, in the contended case, DispatchQueue is an additional 2x worse than unfair locks (a total of roughly 32x).

Update: if you use unfairLock.withLockUnchecked { } then the relative timings become 8x and 25x faster, respectively.
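A rough sketch of the uncontended harness (illustrative, not the exact code used above; NSLock is substituted for the unfair lock so it runs everywhere, and the actor variant would need an async driver):

```swift
import Foundation
import Dispatch

// Time a closure over many iterations and report ns per operation.
// Absolute numbers vary by machine; only relative ordering matters.
func nanosPerOp(iterations: Int, _ body: () -> Void) -> Double {
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0..<iterations { body() }
    let end = DispatchTime.now().uptimeNanoseconds
    return Double(end - start) / Double(iterations)
}

let lock = NSLock()
var lockCounter = 0
let queue = DispatchQueue(label: "counter")
var queueCounter = 0

let iterations = 100_000
let lockNs = nanosPerOp(iterations: iterations) {
    lock.lock(); lockCounter += 1; lock.unlock()
}
let queueNs = nanosPerOp(iterations: iterations) {
    queue.sync { queueCounter += 1 }
}
print("NSLock: \(lockNs) ns/op, DispatchQueue.sync: \(queueNs) ns/op")
```

For the contended numbers above, you would run two detached tasks hammering the same counter instead of a single loop.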

Is that DispatchQueue.sync or DispatchQueue.async? They should be very very different

DispatchQueue.sync { }

We're trying to block until the operation is complete, as unfair locks do.

Interesting. Good showing from the actor implementation there then!

Actually, you can't "just spawn an extra task so you can await" as I suggested earlier, since the whole point is you're non-async!

So in fact, if you want to live in a non-async world, your choice is DispatchQueue.sync { } or allocated unfair locks, and I now know which I obviously prefer.

I'm starting to feel a little bad for the people who came up with Actors. Yes, it's a great concept, and in a world where you're starting from scratch and can carefully annotate @MainActor from the beginning, the compile-time safety is going to be a great, great, great thing.

But now the band-wagon and over-hype is "actors solve all your locking issues!" which is obviously not true. It's a great tool in the right place, but there's no free lunch anywhere. I'm sad Actor is getting over-hyped and then criticized unfairly for situations where it can't (and shouldn't) be the right solution...


Unfortunately the internet is littered with this mistake and it keeps coming up, I wish this was recognized by the compiler and produced an error. Basically the ampersand is not an address-of operator like it is in C and it makes this code unreliable, you can learn more about this here, here and here. The solution is to use an unsafe pointer to the unfair lock or better yet OSAllocatedUnfairLock.
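For reference, a sketch of the corrected pattern on Apple platforms (class and method names here are illustrative): allocate the os_unfair_lock so it has a stable address, instead of passing `&` to a stored property.

```swift
#if canImport(os)
import Foundation
import os

// Broken pattern (do NOT do this):
//   var lock = os_unfair_lock()
//   os_unfair_lock_lock(&lock)   // `&` is not C's address-of; Swift may
//                                // hand the callee a temporary copy.

// Correct: give the lock a stable heap address for its whole lifetime.
final class UnfairLock: @unchecked Sendable {
    private let pointer: UnsafeMutablePointer<os_unfair_lock>

    init() {
        pointer = .allocate(capacity: 1)
        pointer.initialize(to: os_unfair_lock())
    }

    deinit {
        pointer.deinitialize(count: 1)
        pointer.deallocate()
    }

    func withLock<R>(_ body: () throws -> R) rethrows -> R {
        os_unfair_lock_lock(pointer)
        defer { os_unfair_lock_unlock(pointer) }
        return try body()
    }
}
#endif
```

Or, as suggested, just use OSAllocatedUnfairLock, which does this allocation for you and adds the scoped `withLock` API.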

Amen to that. I think fewer larger actors make much more sense than many small actors. Few large bubbles of concurrency with synchronous code inside each bubble. I'm sure a native lock is eventually coming to the standard library (or Swift Concurrency), it's important for cross-platform Swift but I can't wait for it because this would also be an important message to developers that, as you say, actors are not this "solve all your locking issues" thing.


The code used in the linked benchmark is so inefficient that I would ignore it completely. Every locking method is abstracted into an escaping async closure that is called in a loop. So even the os_unfair_lock is subject to async/await calling conventions.

Additionally, the code runs via a Task.init from SwiftUI code - so the task inherits MainActor isolation. This means the task always suspends when calling into the actor used for the actor benchmark. This is the worst case scenario for an actor - and not representative of Swift Concurrency used correctly.

Here are my own benchmark results, from memory:

(uncontended, lock & unlock)

x86_64 actor: ~70ns
arm64e actor: ~150ns

x86_64 os_unfair_lock: <10ns
arm64e os_unfair_lock: <5ns

os_unfair_lock is impressively fast, and Swift actors are mysteriously slow. Swift actors, in the fast path, perform relaxed cmpxchg atomics on a 16-byte state. When performing the same instructions directly via swift-atomics, I measure x86_64 at ~15ns and arm64e at ~10ns.