Actors 101

rayx · August 13, 2024, 1:30pm

I think I understand what @crontab meant. Let me try to express it in a different way:

It's meaningless trying to distinguish concurrency from parallelism in Swift, because it's handled by the language runtime, based on hardware configuration, and is transparent to user and developer.

Having said that, since parallelism is one form of concurrency, I think we should use concurrency in general.

I don't think that's the right way to think about it. In my opinion, an actor's code should be understood to run in sequence. There are no concepts like parallelism or concurrency involved at all.

ktraunmueller · August 13, 2024, 1:31pm

This (flag) could be very useful for debugging / understanding issues in concurrent code, similar to the challenge Sean Parent talks about here.

nkbelov · August 13, 2024, 1:55pm

I'm not sure that "there's no [...] concurrency involved", since actor reentrancy essentially means that another function may start running on an actor before the first one has finished (...so isn't this concurrency...?). But to be fully fair, after looking it up, the original actor proposal calls this behaviour "interleaving", perhaps to underline that it can only occur at await points and not arbitrarily.
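To make the "interleaving at await points" idea concrete, here is a minimal sketch. The Account actor, its audit() helper and the amounts are all made up for illustration; audit() simply stands in for any async call. The point is that the check and the mutation sit on opposite sides of an await, so a second call can run in between.

```swift
actor Account {
    private(set) var balance = 100

    // Hypothetical suspension point standing in for any async call.
    private func audit() async {
        try? await Task.sleep(for: .milliseconds(50))
    }

    func withdraw(_ amount: Int) async -> Bool {
        guard balance >= amount else { return false }  // check
        await audit()                                  // the actor may run other calls here
        balance -= amount                              // act on a possibly stale check
        return true
    }
}

// Starts two withdrawals concurrently and returns the final balance.
func raceTwoWithdrawals() async -> Int {
    let account = Account()
    async let first = account.withdraw(80)
    async let second = account.withdraw(80)
    _ = await (first, second)
    return await account.balance
}
```

If the second call enters while the first is suspended in audit(), both guards pass and the balance goes negative, even though no two lines of the actor ever ran at the same time. Whether that happens on a given run depends on scheduling, so treat it as likely rather than guaranteed.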

crontab · August 13, 2024, 2:01pm

Yes but my problem with this terminology is that it kind of makes you dismiss the fact that there's real multithreading involved underneath Swift's structured concurrency. Imagine we are back to pre-concurrency times in Swift, you don't create threads left and right for no good reason, do you?

Similarly actors (to bring us back to the original topic) should not be thought of as something lightweight. In fact actors do create true parallelism and you need to consider whether you actually need it, almost the same way you'd think twice before creating a new thread (or a global GCD queue) in the old code.

Yes, exactly!

rayx · August 13, 2024, 2:35pm

Reentry doesn't require concurrent code. It can happen in a single thread. I think recursion is a special form of reentry, for example. You can find more information on Wikipedia.

My guess is there is only one "logical" thread representing an actor (I guess the underlying physical thread isn't fixed). I don't think the same actor's code can run on different threads simultaneously. So, yes, I believe "interleaving" can only occur at await.

Of course, coroutines running in a single thread can give an illusion of concurrency, but that's from the general perspective of how coroutines work. In the special case of actors, however, I'd understand it from the perspective that they are in a single thread and say actor code runs in sequence.
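A small sketch of the "runs in sequence" view, under the assumption that the actor method contains no await: many tasks call the same synchronous method, yet no update is lost. Tally and runTally are hypothetical names used only for this illustration.

```swift
actor Tally {
    private(set) var count = 0

    // No await inside, so each call runs to completion before the next one starts.
    func increment() { count += 1 }
}

// Fires `tasks` child tasks at one actor and returns the final count.
func runTally(tasks: Int) async -> Int {
    let tally = Tally()
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<tasks {
            group.addTask { await tally.increment() }
        }
        // The group waits for all child tasks before the closure returns.
    }
    return await tally.count
}
```

The child tasks may well run on several threads, but the calls into the actor are serialized, so the result equals the number of tasks.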

jaleel · August 13, 2024, 8:02pm

Think OOP is a "disaster" only because of enterprise Java and all this Clean code/SOLID stuff. Will again quote Alan Kay:

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them.

Personally would prefer functional + imperative programming for apps, but representing those apps, or part of them, as objects (or actors) to communicate—still feels like an interesting idea.

Actors are lightweight though. This:

actor Counter {
  var value: Int = 0
  
  func increase(from counter: Counter) async {
    self.value = await counter.value + 1
  }
}

let actors = (0..<1_000_000).map { _ in Counter() }

for i in 1..<actors.count {
  await actors[i].increase(from: actors[i-1])
}

await print(actors.last?.value)

try await Task.sleep(for: .seconds(10))

will consume 135 MB and 1 core with 3 threads, at least for me in debug mode, and will finish everything quickly.
I think it's better not to compare building in Swift Structured Concurrency with GCD, tbh.

Wiki states differently, though

John_McCall · August 13, 2024, 9:51pm

Folks, if you find yourself getting exasperated at a conversation, please just step away from the thread for a bit. You can decide to come back later if you like.

ibex10 · August 14, 2024, 12:41am

Something to lower the temperature.

I am trying @jaleel's example code above, but I am observing a strange problem.

The execution seems to get stuck around the for loop in the following code if the first Task.sleep is disabled.

@main
enum Driver {
    static func main () async  {
        actor Counter {
          var value: Int = 0
          
          func increase(from counter: Counter) async {
            self.value = await counter.value + 1
          }
        }

        let N = 5
        print ("--> N", N)
        
        let actors = (0..<N).map {
            print ($0)
            return Counter()
        }

        print ("--> spawning \(actors.count) actors...")
        #if true
        try! await Task.sleep (until: .now + .seconds (5))
        #endif
        
        for i in 1..<actors.count {
          await actors[i].increase (from: actors[i-1])
        }
        print ("--> awaiting the last actor's value...")
        
        if let value = await actors.last?.value {
            print ("--> value", value)
        }

        try! await Task.sleep (until: .now + .seconds (5))
        print ("finished")
    }
}

With the first Task.sleep enabled:

--> N 5
0
1
2
3
4
--> spawning 5 actors...
--> awaiting the last actor's value...
--> value 4
finished

Now, with the first Task.sleep disabled:

--> N 5
0
1
2
3
4
...
// 10 minutes later still nothing

Any ideas why this might be happening?

PS: I am on a macMini 3.2 GHz 6-Core Intel Core i7; Xcode Version 15.4.

vns · August 14, 2024, 6:31am

That's interesting behaviour. I have no idea why it gets stuck; in my understanding it should not. I have your and @jaleel's examples working fine on an M1 Pro with and without sleeps.

jaleel · August 14, 2024, 7:48am

Yeah, that's surprising, both work for me, also on an M1 MacBook.
If the problem is reproducible, I'd suggest filing an issue.

ibex10 · August 15, 2024, 11:46pm

Here is something that might be useful, detecting reentrancy (interleaving) in actors, posted by @aetherealtech.

actor World {
  ...

  private var occupancy = 0 {
    didSet { if occupancy > 1 { print("Re-entrance detected") } }
  }

  func doSomeWork(with entity: Entity) async {
    // Some synchronous work
    occupancy += 1
    await doSomethingAsync()
    occupancy -= 1
    // More synchronous work
  }
}

ibex10 · August 15, 2024, 11:55pm

Here is something enlightening to read. Thank you, @aetherealtech, for writing this Concurrency 101 material.

A system of concurrency guaranteeing order is a contradiction. Concurrent means in parallel, which implies no guarantee of order. Guaranteeing order just reintroduces seriality. What are people asking for when they ask for a concurrency system to make order guarantees?

The answer usually seems to be something about the order tasks are "started", in contrast to full serialization which implies one task is started and finished before the second task is started. First of all, this isn't something new to Swift concurrency. In this code:
  Thread.detachNewThread { print("Hello 1") }
  Thread.detachNewThread { print("Hello 2") }
The order of the print statements is indeterminate. That's the whole point of spawning new threads. There's no guarantee that the first thread "starts" before the second thread "starts".
...

crontab · August 16, 2024, 6:24am

Wow, good to know I'm not alone. And yes, a great comment by Dan.

ibex10 · August 24, 2024, 12:36am

Here is more good-to-know stuff.

A non-isolated async function runs on a special non-actor asynchronous context,

async functions always define where they run (which is the opposite of how systems like libdispatch work). nonisolated doesn't opt out of that, it just defines it as "the place where I run is not on any actor".

Which isn't true anymore because of isolated parameters with #isolation
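A short sketch of the distinction in these quotes, assuming the semantics they describe (a nonisolated async function runs off any actor; newer toolchains may offer ways to change that, as the last quote hints). Cache and checksum(of:) are hypothetical names.

```swift
actor Cache {
    private var storage: [String: Int] = [:]

    // Actor-isolated: runs on the actor and may touch `storage`.
    func store(_ value: Int, for key: String) { storage[key] = value }
    func value(for key: String) -> Int? { storage[key] }

    // Not isolated to the actor: it cannot touch `storage`, and under the
    // semantics quoted above it does not run on the actor's executor.
    nonisolated func checksum(of text: String) async -> Int {
        text.utf8.reduce(0) { $0 &* 31 &+ Int($1) }
    }
}
```

Uncommenting a line such as `storage[text] = 0` inside checksum(of:) would be rejected by the compiler, which is a quick way to see that the function really is outside the actor's isolation.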

crontab · August 24, 2024, 6:33am

Controversial opinion (I like controversy!)

The async let proposal says something that is quoted often:

Task.detached most of the time should not be used at all, because it does not propagate task priority, task-local values or the execution context of the caller. Not only that but a detached task is inherently not structured and thus may out-live its defining scope.

I think "should not be used" is a pretty strong word for an instrument that creates true parallelism and does it in a safe manner albeit not "structured" as this quote says.

My (controversial) take on it is: do use Task.detached whenever the compiler allows you to, possibly tied to some global actor (most likely not MainActor), or not. In more general terms: if something can be done in parallel, it should be done in parallel. In the era of multicore CPUs everywhere, this allows your code to use the available cores more efficiently.

In fact sometimes Task.detached can replace a whole actor. If you can reduce your actor to a function by moving the state onto the stack, then Task.detached is preferred to having an actor.

One example of where Task.detached would be appropriate is downloading and uncompressing a media file (video, image, or audio) before playing/displaying it. You don't need an actor for that since the execution state can be kept on the stack and therefore the entire sequence can be reduced to an async function that can (and should) be called on a detached task.

(Although synchronizing with the UI as well as possibly cancelling such tasks can be tricky anyway, whether they are implemented in actors or detached tasks.)
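As a rough sketch of the "reduce the actor to a function" idea: all intermediate state lives in locals, and the whole sequence is one async function called from a detached task. The download(_:) and decompress(_:) helpers are hypothetical stand-ins (a fake download and a toy run-length decoder), not real APIs.

```swift
// Hypothetical stand-in for a network download: returns run-length-encoded bytes.
func download(_ name: String) async -> [UInt8] {
    try? await Task.sleep(for: .milliseconds(10))
    return [3, 65, 2, 66]   // (count, byte) pairs
}

// Hypothetical stand-in for decompression: expands (count, byte) pairs.
func decompress(_ packed: [UInt8]) -> [UInt8] {
    var out: [UInt8] = []
    for i in stride(from: 0, to: packed.count - 1, by: 2) {
        out.append(contentsOf: repeatElement(packed[i + 1], count: Int(packed[i])))
    }
    return out
}

// The whole pipeline keeps its state in locals, so no actor is involved.
func loadMedia(named name: String) async -> [UInt8] {
    let packed = await download(name)
    return decompress(packed)
}

// Runs the pipeline on a detached task and waits for its result.
func loadOffActor(named name: String) async -> [UInt8] {
    await Task.detached(priority: .utility) {
        await loadMedia(named: name)
    }.value
}
```

Note that a detached task does not inherit cancellation from the caller, so a real version would need to keep the Task handle and cancel it explicitly.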

Jon_Shier · August 24, 2024, 8:15am

Actually, Swift concurrency isn't generally suitable for long-running blocking tasks like decompression, as it shares a single fixed-width thread pool. Technically you could create a custom executor that is backed by other threads or queues, but there's no way for us to replicate the fixed-width queue underlying the concurrency system, except manually. If you want guaranteed parallelism where you also control the execution width, your best bet is DispatchQueue.concurrentPerform with your desired width, where each parallel bit of work is contained within its own continuation. I really wish we had more tools here.
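One possible shape of that approach, as a sketch only: the blocking work runs on a GCD queue via DispatchQueue.concurrentPerform, and a single continuation bridges the result back to async code. This variant wraps the whole batch in one continuation rather than one per chunk, and uses the iteration count as the upper bound on width; parallelSum and its parameters are hypothetical.

```swift
import Dispatch

// Sums `values` in at most `width` parallel chunks outside the Swift Concurrency pool.
func parallelSum(of values: [Int], width: Int) async -> Int {
    precondition(width > 0, "width must be positive")
    return await withCheckedContinuation { continuation in
        // Hop to GCD so the cooperative thread pool is not blocked.
        DispatchQueue.global(qos: .utility).async {
            let chunk = (values.count + width - 1) / width
            var partials = [Int](repeating: 0, count: width)
            partials.withUnsafeMutableBufferPointer { slots in
                // Each iteration writes only to its own slot, so there is no data race.
                DispatchQueue.concurrentPerform(iterations: width) { w in
                    let start = min(w * chunk, values.count)
                    let end = min(start + chunk, values.count)
                    slots[w] = values[start..<end].reduce(0, +)
                }
            }
            continuation.resume(returning: partials.reduce(0, +))
        }
    }
}
```

The calling task is suspended, not blocked, while GCD does the work. GCD may still use fewer threads than the requested width.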

crontab · August 24, 2024, 8:57am

I'm not sure I understand why. Firstly a good implementation would split something like a decompression of a long media file into multiple chunks.

But even if run continuously and as a whole, imagine say a 4-core CPU where Swift's concurrency system runs 4 threads vs. Swift's 4 threads plus another thread that you started via DispatchQueue. It seems to me using Swift's thread pool (i.e. Task.detached) is more beneficial.

florianpircher · August 24, 2024, 9:42am

Swift Concurrency is not designed for long-running/CPU-bound tasks such as decoding large files. That would block one of the threads in Swift Concurrency’s thread pool. As long as only one thread is blocked, you would not notice the issue too much on a multicore system. But as soon as multiple subsystems start blocking threads, the model of Swift Concurrency starts to break down.

Tasks create an async context, but they do not dictate where all of their code runs. For example, a task can initiate a data decoder that is internally implemented to not use Swift Concurrency’s thread pool. You do not need to use Task.detached vs. Task.init since the only code affected by that choice will be the code calling the decode method and the code receiving the decoded result, but not the decoding function itself. (At least in terms of isolation; priority and other task attributes also affect the functions you call from a task.)

Tasks should only do the orchestration; the actual long-running code must be implemented separately from Swift Concurrency.

I agree. Swift Concurrency should have a story for performing work that blocks a thread. DispatchQueue.concurrentPerform and company are usable tools, but they feel disconnected. People (somewhat justifiably) expect they need to stay within the tool set of Swift Concurrency to do it the right way™, but currently that is not the case.

mattie · August 24, 2024, 10:22am

I just want to add that this is confusing the concept of "non-isolated" with Task.detached. It is true that using Task.detached stops any actor inheritance. But, in all situations where that is the only goal, a nonisolated function with a regular, non-detached Task is a better choice.
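A minimal sketch of that alternative, assuming the nonisolated semantics discussed earlier in the thread. ThumbnailModel and countPixels are hypothetical names; the regular Task keeps the caller's isolation, priority and task-locals, while the heavy function opts out of the actor by being nonisolated.

```swift
@MainActor
final class ThumbnailModel {
    private(set) var pixelCount = 0

    // Returns the task so callers can await or cancel it.
    func refresh(width: Int, height: Int) -> Task<Void, Never> {
        // A regular Task: inherits MainActor isolation from this method.
        Task {
            // The nonisolated async function is not bound to the main actor.
            let count = await Self.countPixels(width: width, height: height)
            // Back in the MainActor-isolated closure, so this mutation is safe.
            self.pixelCount = count
        }
    }

    // Stand-in for expensive work that has no reason to be actor-isolated.
    nonisolated static func countPixels(width: Int, height: Int) async -> Int {
        width * height
    }
}
```

Compared with Task.detached, the state update needs no extra hop back to the main actor, and priority and task-local values still propagate.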

mattie · August 24, 2024, 10:35am

I'm going to offer my own controversial opinion here.

I don't think it is so bad to run CPU-bound work on nonisolated contexts. The MainActor cannot be blocked by this. As long as your threads are making forward progress, I think you are probably ok.

Sure, the system now doesn't have room to take on yet more work. But, it is already maximally busy. I believe if you need to do more work, at this point you might have a user-interface problem instead.