Actor behavior seems to differ when compiling with Catalyst

I have a situation where a function (let's say foo()) is calling into an actor. In a test we try to verify behavior of this function foo(). To do so, in the test code we call into the same instance of the actor.

We use a Task.yield() to help things execute in the correct order. I understand that this isn't guaranteed or deterministic, and that's fine. Running this test on the iOS simulator never seems to fail.

However, when building for Mac Catalyst and running this unit test, it hardly ever passes. I put a print statement in foo() right before it calls into the actor. And I put a print statement in my test before it calls into the actor:

func foo() async {
   print("in foo")
   await actor.bar()
}
...
...

func test() async {
  ...
  ...
  print("in test")
  let result = await actor.baz()
  XCTAssertTrue(result)
}

Even when the test fails, the print statements in foo execute before the test code print statement:

in foo
in test

This should show that foo is trying to access the actor before the test does.

However, the result is not expected, showing that the call to actor.bar() executed after actor.baz().

It seems to me that the order of calling into an actor should be preserved. Is it not? Is this a bug with Mac Catalyst?

What macOS host and iOS version are you comparing?

Good question.

macOS 12.5.1, iOS 16 (via Xcode 14 beta 6), and iOS 15.5 (via Xcode 13.4)

This may be a difference between the simulator, which may still use a thread pool of size 1, and macOS, which uses a thread pool the size of your logical processor count. So on iOS the order of operations is effectively serialized, where it won't be in real life. In general, you can't rely on the order of two separate suspendible operations on an actor, as they could run in parallel. There is no first in, first out guarantee like a serial DispatchQueue.

Can you please clarify this? I thought isolated functions on an actor could not run concurrently.

So actors do not preserve the order in which an isolated method is called? I.e., if there are multiple callers to an isolated method, they can run in any order, and it doesn't matter which one tried first?

As I understand it, not only can the methods run concurrently, but they can run reentrantly as well. Actors only guarantee the protection of mutable state, not the order of operations.

Yep, any order, which was a great surprise to many of us when we tried to replace our previous solutions.

Personally I find actors pretty useless right now. The one big advantage is that the compiler checks their safety and enforces safe access on all consumers. But I have yet to find it appropriate to use them to replace any of my previous thread-safety constructs.

How does an Actor protect mutable state if it can run isolated methods concurrently? Let's assume my isolated method has no suspension points (not using the await keyword inside of it).

This is shocking to me.

I'll have to think through this more, but at the moment given this information, I'm failing to see the benefit of them as well. Maybe I need to rewatch some wwdc videos and re-read some proposals...

As I understand it, barring reentrancy, an isolated method that never suspends will run effectively exclusively. However, access to that isolated method has no order guarantee, as other waiters may be resumed in arbitrary order. As far as I know, a method that doesn't suspend is the only way to get guaranteed atomicity with the actor, so it's the recommended and really only approach. Otherwise the actor ensures safety by guaranteeing exclusive access to its isolation domain, whether for method or property access.
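
To make the point about non-suspending methods concrete, here is a minimal sketch. The `Counter` actor and `runIncrements()` helper are hypothetical names invented for illustration; they are not from the code discussed in this thread. `increment()` has no suspension point, so each read-modify-write runs exclusively. `incrementAfterPause()` suspends between the read and the write, so other calls may interleave and updates may be lost.

```swift
// Hypothetical example actor, not part of the original code in this thread.
actor Counter {
    private(set) var value = 0

    // No suspension point: the read-modify-write cannot be interleaved
    // with any other call into this actor.
    func increment() {
        value += 1
    }

    // Suspension point between the read and the write: while this call is
    // suspended, the actor may run other calls, so updates may be lost.
    func incrementAfterPause() async {
        let snapshot = value
        await Task.yield()
        value = snapshot + 1
    }
}

// Calls increment() from 1000 concurrent child tasks. The order in which
// the calls run is unspecified, but none of them can overlap.
func runIncrements() async -> Int {
    let counter = Counter()
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<1000 {
            group.addTask { await counter.increment() }
        }
    }
    return await counter.value
}
```

Because the task group waits for all of its child tasks before returning, `runIncrements()` should always return 1000, even though the order of the individual increments is not defined. Swapping in `incrementAfterPause()` removes that guarantee.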

@Jon_Shier

I see this in the actors proposal:

The second form of permissible cross-actor reference is one that is performed with an asynchronous function invocation. Such asynchronous function invocations are turned into "messages" requesting that the actor execute the corresponding task when it can safely do so. These messages are stored in the actor's "mailbox", and the caller initiating the asynchronous function invocation may be suspended until the actor is able to process the corresponding message in its mailbox. An actor processes the messages in its mailbox sequentially, so that a given actor will never have two concurrently-executing tasks running actor-isolated code. This ensures that there are no data races on actor-isolated mutable state, because there is no concurrency in any code that can access actor-isolated state. For example, if we wanted to make a deposit to a given bank account account, we could make a call to a method deposit(amount:) on another actor, and that call would become a message placed in the actor's mailbox and the caller would suspend. When that actor processes messages, it will eventually process the message corresponding to the deposit, executing that call within the actor's isolation domain when no other code is executing in that actor's isolation domain.

This makes it seem like the actor processes messages in its mailbox in the order that the messages were received. Is there documentation somewhere that says that is not the case?

Also I see this:

When that actor processes messages, it will eventually process the message corresponding to the deposit, executing that call within the actor's isolation domain when no other code is executing in that actor's isolation domain.

Is there a way to force the actor to process any queued messages and wait for them to complete? Because that would make it easier to write test code against actors.

For example, say I want to test this function:

extension BankAccount {
  func transfer(amount: Double, to other: BankAccount) async throws {
    assert(amount > 0)

    if amount > balance {
      throw BankError.insufficientFunds
    }

    print("Transferring \(amount) from \(accountNumber) to \(other.accountNumber)")

    // Safe: this operation is the only one that has access to the actor's isolated
    // state right now, and there have not been any suspension points between
    // the place where we checked for sufficient funds and here.
    balance = balance - amount
    
    // Safe: the deposit operation is placed in the `other` actor's mailbox; when
    // that actor retrieves the operation from its mailbox to execute it, the
    // other account's balance will get updated.
    await other.deposit(amount: amount)
  }
}

I can write this test and it passes 100/100 times on both iOS sim and macOS catalyst:

    func testDeposit1() async throws {
        let checking = BankAccount(accountNumber: 1, initialDeposit: 100)
        let savings = BankAccount(accountNumber: 2, initialDeposit: 0)
        
        try await checking.transfer(amount: 50, to: savings)

        let balance = await savings.balance
        XCTAssertEqual(balance, 50)
    }

However, my situation is a bit more like this, where the transfer happens in a separate task:

    func testDeposit2() async throws {
        let checking = BankAccount(accountNumber: 1, initialDeposit: 100)
        let savings = BankAccount(accountNumber: 2, initialDeposit: 0)
        
        Task {
            print("-- about to transfer")
            try await checking.transfer(amount: 50, to: savings)
        }
        
        await Task.yield()
        
        print("-- about to get balance")
        let balance = await savings.balance
        
        XCTAssertEqual(balance, 50)
    }

testDeposit2 passes 100/100 times on iOS sim. But on macOS Catalyst it passes only about 60/100 times. But even when it fails I see:

-- about to transfer
-- about to get balance

I interpreted this as the transfer is queued up before the balance check. But maybe we cannot assume that? Maybe actors do sequentially process messages in order received, but the print statement in this case isn't telling the whole story?

The proposal is missing the fact that messages arrive at the mailbox concurrently, regardless of the original temporal order in which their callers suspended. So await actor.doSomething() from two different points, close enough together, can execute in any order.

As for testing, it doesn't seem to have been a consideration in designing the feature, so I'm not sure there are any recommended patterns other than trying to structure tests so they don't care about specific ordering. But what you've developed is close to what I've seen in the community: yield the async work so all work can complete before the test assertions.

Personally I haven't gone further than seeing actors don't work for what I'm trying to do, so I don't have much hands on experience. Perhaps others can help with actual approaches.
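
One way to structure such a test so that it does not depend on ordering is to keep the `Task` handle and await its `value` before asserting, instead of relying on `Task.yield()`. The sketch below assumes the test can get hold of that handle, which may not be true if the task is created inside the code under test. The `BankAccount` here is a cut-down reconstruction based on the snippets above, and `transferThenReadBalance()` is a hypothetical helper standing in for the body of the test.

```swift
enum BankError: Error {
    case insufficientFunds
}

// Minimal stand-in for the BankAccount actor quoted earlier in the thread;
// the stored properties and initializer are assumptions.
actor BankAccount {
    let accountNumber: Int
    private(set) var balance: Double

    init(accountNumber: Int, initialDeposit: Double) {
        self.accountNumber = accountNumber
        self.balance = initialDeposit
    }

    func deposit(amount: Double) {
        balance += amount
    }

    func transfer(amount: Double, to other: BankAccount) async throws {
        if amount > balance {
            throw BankError.insufficientFunds
        }
        balance -= amount
        // Hops to the other actor; this is a suspension point.
        await other.deposit(amount: amount)
    }
}

// Hypothetical helper mirroring testDeposit2, without XCTest.
func transferThenReadBalance() async throws -> Double {
    let checking = BankAccount(accountNumber: 1, initialDeposit: 100)
    let savings = BankAccount(accountNumber: 2, initialDeposit: 0)

    let transferTask = Task {
        try await checking.transfer(amount: 50, to: savings)
    }

    // Wait for the whole transfer, including the deposit on `savings`,
    // to finish before reading the balance.
    try await transferTask.value

    return await savings.balance
}
```

With the explicit `try await transferTask.value`, the balance read cannot begin until the deposit has completed, so the result does not depend on how many threads the platform's executor uses.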

That helps me understand. Thank you.