I was playing with measuring Actor performance

I have written this tiny test to compare actor-based synchronization with raw lock-based synchronization.

I have written these 2 similar objects:


final class StateHolderLock {
    var lock = os_unfair_lock_s()

    init() {}

    var sum = 0

    var onNewValueReceived: ((Int) -> Void)!

    func handleValueRecieved(_ val: Int) {
        os_unfair_lock_lock(&lock)
        sum += val
        os_unfair_lock_unlock(&lock)
        onNewValueReceived(val)
    }
}

final actor StateHolderActor {
    init() {}

    var sum = 0

    nonisolated(unsafe) var onNewValueReceived: ((Int) -> Void)!

    func handleValueRecieved(_ val: Int) {
        sum += val
        onNewValueReceived(val)
    }
}

And I have written a usage case, like this, which I run in a macOS CLI app target, in Release mode:

let iterations = 1000000
func testStateHolderActor() async {
    let e = measureSH()
    let actor = StateHolderActor()

    var sum = 0
    actor.onNewValueReceived = { val in
        sum += val
    }

    for _ in 0 ..< iterations {
        await actor.handleValueRecieved(1)
    }

    e()
}

func testStateHolderLocked() {
    let e = measureSH()
    let actor = StateHolderLock()

    var sum = 0
    actor.onNewValueReceived = { val in
        sum += val
    }

    for _ in 0 ..< iterations {
        actor.handleValueRecieved(1)
    }

    e()
}

Unfortunately, I am seeing that the lock version takes 2 ms to complete, while the actor-based version takes 58 ms to complete.

And I see in the trace of Time Profiler that testStateHolderActor() has a weight of 8.9%, while testStateHolderLocked() has a weight of 1.3%...

The code does a lot of stuff that's not directly related to what I have written.

Can somebody help me? Maybe I need to set up my project somehow so that the performance is comparable?

I am attaching a trace file too. ohh well... I can email it to you, I can't just upload a zip here...


This is expected to an extent: actors are much higher-level than simple locks and rely on the Swift Concurrency runtime to work, which adds scheduling logic on top of OS threads. In that respect, they are more comparable to GCD queues.

You should not really expect actors to demonstrate comparable performance in tasks such as simply incrementing an integer from a different isolation domain, as the runtime will have to constantly switch contexts, which is way more expensive than the operation itself.

Part of the problem is that testStateHolderActor() runs on the so-called generic executor, while the actor has its own executor, so the runtime has to switch executors on each iteration. You can modify your logic so that it hops onto the actor only once:

func testStateHolderActor() async {
    let e = measureSH()
    let actor = StateHolderActor()

    var sum = 0
    actor.onNewValueReceived = { val in
        sum += val
    }

    func run(actor: isolated StateHolderActor) async {
        for _ in 0 ..< iterations {
            actor.handleValueRecieved(1)
        }
    }

    await run(actor: actor)

    e()
}

You will find much more success with actors for tasks where you have to isolate larger stateful systems, where it becomes increasingly more cumbersome to set up locking properly and there's a need to support async operations by design, such as network I/O.


wow, thank you!

It helped: now the actor version takes 6 ms versus 2 ms for the locked version.


forgive me for the tangent, but i wanted to highlight that the pattern used here is not safe: an os_unfair_lock_s used in this manner is not guaranteed to have a stable address. this pitfall is highlighted in the documentation for OSAllocatedUnfairLock, which should be used instead (if available for your platform). it states:

However, it’s unsafe to use os_unfair_lock from Swift because it’s a value type and, therefore, doesn’t have a stable memory address. That means when you call os_unfair_lock_lock or os_unfair_lock_unlock and pass a lock object using the & operator, the system may lock or unlock the wrong object.
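As a rough sketch of what that replacement could look like for the class in the original post: the type name StateHolderAllocatedLock and the currentSum accessor are made up for illustration, and OSAllocatedUnfairLock is assumed to be available (macOS 13 / iOS 16 or later).

```swift
import os

// Hypothetical variant of StateHolderLock, for illustration only.
// Assumes macOS 13 / iOS 16 or later, where OSAllocatedUnfairLock exists.
final class StateHolderAllocatedLock {
    // The lock's storage is heap-allocated, so its address is stable,
    // and the protected state lives inside the lock itself.
    private let sum = OSAllocatedUnfairLock(initialState: 0)

    var onNewValueReceived: ((Int) -> Void)?

    func handleValueRecieved(_ val: Int) {
        // The lock is held only for the duration of the closure.
        sum.withLock { $0 += val }
        onNewValueReceived?(val)
    }

    // Reads the protected value under the lock.
    var currentSum: Int { sum.withLock { $0 } }
}
```

Keeping the state inside the lock also means there is no `&lock` expression left that could point at a temporary.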


Use Mutex instead of OSAllocatedUnfairLock when possible 🙂
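A minimal sketch of the same state holder on top of Mutex, assuming a Swift 6 toolchain and, on Apple platforms, macOS 15 / iOS 18 or later (Mutex lives in the Synchronization module). The names StateHolderMutex and currentSum are hypothetical.

```swift
import Synchronization

// Hypothetical variant of StateHolderLock, for illustration only.
// The availability annotation is an assumption about the deployment target.
@available(macOS 15.0, iOS 18.0, *)
final class StateHolderMutex {
    // Mutex owns the value it protects; there is no separate lock variable.
    private let sum = Mutex(0)

    var onNewValueReceived: ((Int) -> Void)?

    func handleValueRecieved(_ val: Int) {
        // The closure receives the protected value as inout.
        sum.withLock { $0 += val }
        onNewValueReceived?(val)
    }

    // Reads the protected value under the lock.
    var currentSum: Int { sum.withLock { $0 } }
}
```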


I'll just add that this is, of course, not an equivalent transformation: what happens in the new version is as if you'd called os_unfair_lock_lock and os_unfair_lock_unlock outside the loop, so the two cases are not strictly comparable anymore.
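To make that concrete, here is a sketch of the two lock-side counterparts. The function names are made up, and it uses OSAllocatedUnfairLock (macOS 13 / iOS 16 or later) rather than the raw os_unfair_lock_s from the original post.

```swift
import os

// Counterpart of the original actor test: acquire and release on every iteration.
func sumLockingPerIteration(iterations: Int) -> Int {
    let sum = OSAllocatedUnfairLock(initialState: 0)
    for _ in 0 ..< iterations {
        sum.withLock { $0 += 1 }
    }
    return sum.withLock { $0 }
}

// Counterpart of the `isolated` parameter version: acquire once,
// then run the whole loop while holding the lock.
func sumLockingOnce(iterations: Int) -> Int {
    let sum = OSAllocatedUnfairLock(initialState: 0)
    sum.withLock { state in
        for _ in 0 ..< iterations {
            state += 1
        }
    }
    return sum.withLock { $0 }
}
```

Both return the same total; only the number of lock acquisitions differs, which is the part the benchmark is actually measuring.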


Doesn't it have a stable memory address because it's allocated as a stored property of a reference-type object?


ohhh thanks for bringing my attention to this <3

Yes, but taking the address of a class property (1) does not guarantee that the pointer you get is the actual pointer to the property (we can make a temporary one if we want!), and (2) introduces runtime exclusivity checks, which synchronization primitives generally want to avoid because they implement their own synchronization.


It introduces runtime exclusivity checks? And OSAllocatedUnfairLock doesn't?

Mutex and OSAllocatedUnfairLock do not introduce exclusivity checks.


ASDF measure SH testStateHolderActor() 7866 microseconds
ASDF measure SH testStateHolderLocked() 318 microseconds

oh no...

An actor is not a lock, and you shouldn't try to use it as one. See this post from ktoso: Why do not actor-isolated properties support 'await' setter? - #33 by ktoso


in addition to what @Alejandro pointed out, this past thread has more discussion on the matter.

I have tried the above code, measuring the time with this:


func measure (_ prefix: String, _ f: () -> Void) {
    let d = ContinuousClock ().measure {
        f ()
    }
    print (prefix, d)
}

func measure (_ prefix: String, _ f: () async -> Void) async {
    let d = await ContinuousClock ().measure {
        await f ()
    }
    print (prefix, d)
}
Full program:
import Foundation

@main
enum Test {
    static func main () async  {
        testStateHolderLocked()
        await testStateHolderActor()
        await testStateHolderActor2()
    }
}

// [https://forums.swift.org/t/i-was-playing-with-measuring-actor-performance/75005]

final class StateHolderLock {
    var lock = os_unfair_lock_s()

    init() {}

    var sum = 0

    var onNewValueReceived: ((Int) -> Void)!

    func handleValueRecieved(_ val: Int) {
        os_unfair_lock_lock(&lock)
        sum += val
        os_unfair_lock_unlock(&lock)
        onNewValueReceived(val)
    }
}

final actor StateHolderActor {
    init() {}

    var sum = 0

    nonisolated(unsafe) var onNewValueReceived: ((Int) -> Void)!

    func handleValueRecieved(_ val: Int) {
        sum += val
        onNewValueReceived(val)
    }
}

let iterations = 1000000
func testStateHolderActor() async {
    await measure ("Actor:") {
        let actor = StateHolderActor()
        
        var sum = 0
        actor.onNewValueReceived = { val in
            sum += val
        }
        
        for _ in 0 ..< iterations {
            await actor.handleValueRecieved(1)
        }
        
    }
}

func testStateHolderLocked() {
    measure ("Locked:") {
        let actor = StateHolderLock()
        
        var sum = 0
        actor.onNewValueReceived = { val in
            sum += val
        }
        
        for _ in 0 ..< iterations {
            actor.handleValueRecieved(1)
        }
        
    }
}

// [https://forums.swift.org/t/i-was-playing-with-measuring-actor-performance/75005/2]

func testStateHolderActor2 () async {
    await measure ("Actor2 :") {
        let actor = StateHolderActor()
        
        var sum = 0
        actor.onNewValueReceived = { val in
            sum += val
        }
        
        func run (actor: isolated StateHolderActor) async {
            for _ in 0 ..< iterations {
                actor.handleValueRecieved(1)
            }
        }
        
        await run (actor: actor)
    }
}

func measure (_ prefix: String, _ f: () -> Void) {
    let d = ContinuousClock ().measure {
        f ()
    }
    print (prefix, d)
}

func measure (_ prefix: String, _ f: () async -> Void) async {
    let d = await ContinuousClock ().measure {
        await f ()
    }
    print (prefix, d)
}

And this is what I got (on macOS 14.5, 3.2 GHz 6-Core Intel Core i7, Xcode Version 15.4 (15F31d)):

Locked: 0.536446558 seconds
Actor: 0.65408858 seconds
Actor2 : 0.525278467 seconds


Another interestingly subtle thing to note here is that if the main actor is involved in your test at all (whether going to it or from it), that can change the performance characteristics significantly.

The reason for this is an optimization called "executor stealing". When switching between actors that use the cooperative thread pool [1], Swift can completely avoid the cost of actually switching threads by reusing the current thread for the actor it's switching to.

For the main actor, an actual thread switch has to happen since the main actor is required to run on the main thread and everything else is required not to.


[1] That is, all non-main actors, unless overridden by a custom executor.
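One way to observe this yourself is to time the same loop once from the main actor and once from a function that is not isolated to any actor. This is only a sketch: Counter and both function names are hypothetical, it assumes macOS 13 or later for ContinuousClock, and it assumes default compiler settings where a nonisolated async function runs on the cooperative pool. No timings are claimed here, since they depend on the machine and build mode.

```swift
// Hypothetical counter actor, standing in for StateHolderActor.
actor Counter {
    private(set) var sum = 0
    func add(_ value: Int) { sum += value }
}

// Isolated to the main actor, so every call to `add` has to leave
// the main thread and then come back to it.
@MainActor
func addFromMainActor(_ counter: Counter, iterations: Int) async -> Duration {
    let clock = ContinuousClock()
    let start = clock.now
    for _ in 0 ..< iterations {
        await counter.add(1)
    }
    return start.duration(to: clock.now)
}

// Not isolated to any actor, so the caller and the actor can share
// a cooperative pool thread.
func addFromGlobalExecutor(_ counter: Counter, iterations: Int) async -> Duration {
    let clock = ContinuousClock()
    let start = clock.now
    for _ in 0 ..< iterations {
        await counter.add(1)
    }
    return start.duration(to: clock.now)
}

let counter = Counter()
print("from main actor:", await addFromMainActor(counter, iterations: 100_000))
print("from global executor:", await addFromGlobalExecutor(counter, iterations: 100_000))
```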


On the topic of actor performance: there was a thread replicating Go's channels, and out of curiosity I made an actor-based implementation to compare performance: Async Channels for Swift concurrency - #44 by vns

In general, actors are quite good from a performance perspective. There are cases where strategically placed locks might perform better, especially if the locks let you significantly reduce the number of hops between executors (you can see in the thread that the syncRw version is the slowest in either implementation, exactly because there are a lot of hops back and forth), but for a majority of use cases actors are pretty performant on their own.

So if you have small chunks of work between which you expect to switch extensively, I'd prefer a lock over an actor: in that case you have zero hops between executors.

Also, I want to highlight that your lock-based code here is completely synchronous, while the actor version by nature introduces some asynchronous work. I think to be completely fair in the comparison, you would need to offload the lock version to a separate queue and probably introduce some level of parallelism, so that your state is actually being mutated from different threads (for both the lock and actor versions).
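A rough sketch of what such a comparison could look like, with several workers mutating the same state concurrently. All names here are hypothetical, NSLock is used only as a portable stand-in for whichever lock you prefer, and timing code is left out to keep it short.

```swift
import Foundation

// Hypothetical actor-based counter.
actor ActorCounter {
    private(set) var sum = 0
    func add(_ value: Int) { sum += value }
}

// Hypothetical lock-based counter; NSLock is a stand-in, not a recommendation.
final class LockedCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var sum = 0

    func add(_ value: Int) {
        lock.lock()
        defer { lock.unlock() }
        sum += value
    }

    var total: Int {
        lock.lock()
        defer { lock.unlock() }
        return sum
    }
}

// Lock version: `workers` blocks run in parallel and contend on the lock.
func contendOnLock(workers: Int, perWorker: Int) -> Int {
    let counter = LockedCounter()
    DispatchQueue.concurrentPerform(iterations: workers) { _ in
        for _ in 0 ..< perWorker { counter.add(1) }
    }
    return counter.total
}

// Actor version: `workers` child tasks all send work to the same actor.
func contendOnActor(workers: Int, perWorker: Int) async -> Int {
    let counter = ActorCounter()
    await withTaskGroup(of: Void.self) { group in
        for _ in 0 ..< workers {
            group.addTask {
                for _ in 0 ..< perWorker { await counter.add(1) }
            }
        }
    }
    return await counter.sum
}
```

Wrapping each call in the measure helpers from earlier in the thread would give numbers for the contended case rather than the single-threaded one.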
