[Pitch #6] Actors

After pondering on this more, I am pretty supportive of the proposal's model for protocol conformance. I appreciate its simplicity. Also, while I think it will have a pretty big impact on the library design community, it is also true that there are no actors currently in the system, and thus we will be building all new patterns and abstractions anyway, so it will probably work out ok.

To reiterate Doug's points, unifying across actors, classes, and other types seems like a useful thing, but it is "nice to have". The most important case of this is the client side problem, and this can be handled by letting actors conform to "will always be async" protocols:

@i_will_never_introduce_a_sync_requirement_in_the_future
protocol P {
  func f() async
  func g() async
}
actor MyActor : P {...} // ok

and maybe we can talk ourselves into not needing the attribute for protocols that currently only have async requirements. In any case, this is a refinement above and beyond the base actors proposal.

Thank you for pushing this forward Doug,

-Chris

1 Like

The key thing about distributed actors is that they may be remote, but there is also a pretty important use case for local instances of potentially distributed actors. Once you have that, people will want to write special optimizations for the "it is actually local" case, and it will be useful to be able to model that with a dynamically checked cast sort of thing.

-Chris

1 Like

Just getting around to reading this proposal (looks good so far), this isn't related to anything important in the proposal, but I noticed one of my favorite traps in one of the examples:

Unnecessary blocking with non-reentrant actors

Consider an actor that handles the download of various images and maintains a cache of what it has downloaded to make subsequent accesses faster:

// assume non-reentrant
actor ImageDownloader { 
  var cache: [URL: Image] = [:]

  func getImage(_ url: URL) async -> Image {
    if let cachedImage = cache[url] {
      return cachedImage
    }
    
    let data = await download(url)
    let image = await Image(decoding: data)
    return cache[url, default: image]
  }
}

The serialized execution of partial tasks on the actor ensures that the cache itself can never get corrupted.

That is quite the understatement, since the cache will actually never be modified in this example!

The Trap: Dictionary subscript with default value does not insert the default value into the dictionary.

6 Likes

I've been reading the actor proposals since the first and I've experimented with them on my own as well. I've come to a simple conclusion: the model pitched here is too complex and too limited for it to be understandable, teachable, or useful enough to most Swift users. I believe this for several reasons.

  1. Exposure of the isolation model directly to creators and users of actors significantly increases the learning curve for this feature. Removal of the explicit isolated and nonisolated keywords in this latest proposal simplifies the proposal but not the model itself.
  2. Using async for safety is unnatural, leading to the overuse of async functions, infecting everything that tries to use actors.
  3. The intersection of actors and the Sendable types of SE-0302 increases the learning curve even further.

Compiler-verified and powered thread-safety would be a huge boon to Swift. However, the value is severely diminished if using the feature leads to unnatural APIs, difficult to understand code, or a model that actually increases the learning curve over the current manual safety patterns. That is, actors should enable simpler development of thread-safe types and shouldn't require different usage than the thread-safe types we currently have.

We can see this in practice by looking at Alamofire's Request type. It protects itself in two ways: the use of an internal serial queue (targeted on another) and a state lock (the @Protected wrapper) around a MutableState struct which stores state which may be mutated between threads.

public class Request {
    @Protected
    fileprivate var mutableState = MutableState()
}

Part of the Request's job is to manage updates to an internal State enum as well as the currently active URLSessionTask. It exposes a cancel() method which mirror's the task's cancel() while also performing other internal bookkeeping. It's critically important all of this work is done synchronously and in a blocking manner. Therefore we use a closure API provided by Protected to do it atomically.

@discardableResult
public func cancel() -> Self {
    $mutableState.write { mutableState in
        guard mutableState.state.canTransitionTo(.cancelled) else { return }

        mutableState.state = .cancelled
        underlyingQueue.async { self.didCancel() }

        guard let task = mutableState.tasks.last, task.state != .completed else {
            underlyingQueue.async { self.finish() }
            return
        }

        // Resume to ensure metrics are gathered.
        task.resume()
        task.cancel()
        underlyingQueue.async { self.didCancelTask(task) }
    }

    return self
}

An actor implementation of this functionality would have several deficiencies, at least as I understand the proposal.

public actor Request {
    fileprivate var mutableState = MutableState()

    @discardableResult
    public func cancel() -> Self {
        guard mutableState.state.canTransitionTo(.cancelled) else { return }

        mutableState.state = .cancelled

        // No way to enqueue work?
        // underlyingQueue.async { self.didCancel() }

        guard let task = mutableState.tasks.last, task.state != .completed else {
            // underlyingQueue.async { self.finish() }
            return
        }

        // Resume to ensure metrics are gathered.
        task.resume()
        task.cancel()
        // underlyingQueue.async { self.didCancelTask(task) }

        return self
    }
}

There are several issues here.

  1. This method would only be useful in an async context with await due to the actor's isolation. This is an unnatural pattern (there's no actual async work from the user's perspective) that shouldn't be exposed to users of the API. At the very least I should be able to do the work necessary to hide that complexity from the user. Ideally the actor itself would treat my synchronous functions atomically automatically.
  2. This usage exposes me to reentrancy issues where the original does not (and I had to fight hard to fix them in the initial implementation). Again, I should be able to manually replicate this behavior or, better yet, have the actor do it for me. The proposal suggests there could be a solution but provides no example:

Generally speaking, the easiest way to avoid breaking invariants across an await is to encapsulate state updates in synchronous actor functions. Effectively, synchronous code in an actor provides a critical section, whereas an await interrupts a critical section. For our example above, we could effect this change by separating "opinion formation" from "telling a friend your opinion". Indeed, telling your friend your opinion might reasonably cause you to change your opinion!

  1. There doesn't seem to be way to enqueue work off the hot path. I don't want to runDetached here, I want to enqueue on the internal queue to run after my mutations are complete so the state isn't locked too long and the user's call of cancel() returns quickly.
  2. I can't expose a synchronous property at all. Once the user calls cancel() they should be able to check isCancelled == true. That kind of API doesn't seem possible at all.

This is long enough so I'll stop here. Am I way off base here?

5 Likes

Sounds like your Request type is basically a future. I agree that implementing futures with actors doesn't seem like it would lead to a good design.

Not really, no, aside from the fact it can produce asynchronous events and eventually completes. Overall, Requests encapsulate a variety of work, much of its async, including setup, encoding, response behavior customization, retry, response decoding, and completion. To me, this sounds like an ideal use case for an actor: a variety of mutable state that needs to be protected from access on multiple queues while also performing work on still other queues. So if that’s not a good use for an actor, what is?

I believe this has come up already in the previous thread:

I see that there could be a large potential for misuse of this kind, because actors provide a very compelling guarantee, perhaps too compelling one: everything just magically becomes concurrency-safe and it will be very natural to jump into modeling all possible types as actors. But I think that fundamentally, actors are about having a "mailbox" — so would it make sense for a Request to have a mailbox with various pending cancels() and resumes() interleaved? Most likely not, since Request is a state machine in that regard, so it should be modeled as a Sendable type instead, especially that you say:

What I imagine to be a good case for an actor is something like URLSession: it can accept independednt requests for creating various upload/download tasks into its "mailbox" and process them one-by-one.

That doesn’t really make sense to me, as the primary selling point of actors seems to be the protection of mutable state, and the APIs that perform that mutation, from simultaneous access. That’s the example given throughout the proposal at least: BankAccounts, for whatever reason, need to be mutated from multiple threads, and often between each other. That’s almost exactly the same use case as Request: the user mutates the instance to change behaviors, it’s mutated as data is returned and response methods are called, as it retries tasks, and as it communicates with another actor (the session). So no, it is not a data type, and it is not a plain Sendable type (though it likely will be Sendable, since pretty much everything in Swift will need to be).

Although rare, it does make sense to make multiple resume() and suspend() calls, sometimes simultaneously and from separate queues, depending on the complexity of the user’s request handling. We test that exact scenario. Our Session type also makes sense as an actor, though it doesn’t really mutate. It’s more about facilitating safe communication between the in flight requests and their parent Session, so that aspect will be useful.

In any event, my exact example isn’t really relevant unless there are misunderstandings of the actual model evident. My general points still stand, so it would be more productive to address those unless you have something concrete to add to the Request example itself.

1 Like

As to (2) I sort of agree. I would actually love this reversed so async/await was inferred and you'd specify something like "sync" in the rare case where you want to guarantee that this will stay synchronous. E.g.:

sync func f() { ... } // Will stay "sync"
func g() { ... } // Currently "sync" and optimized that way
func h() { ... calls an async function ... } // Inferred async

sync g() // Call g(), but fail at compile time if g() is now async

All method invocations would be potential suspension points, except if prefixed with "sync".

I realize that ship has sailed, though :smiley:

  1. Couldn't you just make cancel() nonisolated (i.e. "sync"), and then use runDetached() in the implementation?
  2. I guess the solution is nonisolated / "sync" functions to guarantee no suspension points (and hence no re-entrancy)?
  3. Why don't you want to use runDetached()? As I see it you want Request to have a mix of an async and a sync interface. Actors are all about async interfaces, so perhaps you should have an Actor with a Sendable "MutableState"? cancel() could then be synchronous, update the mutableState synchronously and then use runDetached() to execute the remaining work asynchronously. And perhaps move as much as possible out of the MutableState.
  4. I think I've seen getter-only properties being pitched?

The ImageDownloader example in the proposal gives me similar vibes. The text says:

At worst, two clients might ask for the same image URL at the same time, in which there will be some redundant work.

To me that is a pretty fundamental issue with the image downloader example. If you have 10 identical images on a page and you ask for them upon loading the page, you don't want to issue 10 requests. This downloader actor is concurrency-safe and non-blocking, but not well-behaved.

The proposal says this is better than if the actor was non-reentrant (because then it'd serialize each request). I'd argue neither the reentrant or non-reentrant versions are good: they each choose a different and unnecessary tradeoff.

A proper implementation would store future-like task handles in that cache so you're able to await on an existing request already in flight.


I don't know if that means actors are deficient. It might just mean overenthusiastic people will try to implement things using actors (like that downloader in the proposal) only to end up having to use other means because it falls short of their expectations. It sounds quite possible to me that we're overhyping actors and their capabilities right now. They seem quite limited to me, which is not necessarily a bad thing as long as people realize that.

2 Likes

In a (model, view, controller) scenario; wouldn’t Request be more of a model than a controller?

My high level understanding is that Actors are more of the controller than the model. Session fits perfectly as an Actor.

I do think you have some good points about what is the high level scenarios where Actors are meant to help.

I would love to see an example from the authors for converting an mvc app to use Actors. I would imagine that View is not going to be an actor but ViewController would be an Actor.

1 Like

Interesting you have that impression, the first example the pitch gives is of model objects (a BankAccount) as actors.

Can anyone articulate the incompatibility between @Jon_Shier’s type or Future and actor concepts?

The pitch seems to give a definition of actors that seems compatible with those types but we have some of the issues @Jon_Shier raised and @John_McCall saying actors don’t suit Futures. What opinions about app architecture are actors subtly imposing that “a bag of state is held within a concurrency domain and then define multiple operations that act upon it” does not give?

1 Like

My main point was that actors shouldn't expose their internal details to consumers. That is, consumers of actors shouldn't have to care about how to call synchronous looking methods because they internally touch the actor's mutable state. Unless the author of the actor wants async APIs, they shouldn't be forced to have them just to take advantage of the protection actors offer.

No, as that isn't real synchrony. As soon as this method completes, isCancelled should be true. That's important both for the consumer as well as internal work already enqueued which needs to check the value in order to stop processing.

Yes, something to that affect. Ideally this would just be automatic with sync methods on actors, but having to internally demarcate such sections would also work. That's what I have to do today with the lock.

AIUI, runDetached isn't just asynchronous, it can also run in parallel on another queue. This violates the safety I need. Something similar that simply asynchronous on the actor's internal queue would work fine, as that's essentially the pattern I have now.

I've also seen effect-ful properties pitched, which would be better but doesn't actually solve my issue. As I understand it, a nonisolated property wouldn't have access to the internal state I need without providing my own lock. At that point I've lost much of the value actors provide. And my overall point is that I shouldn't have to offer an async API surface just because I'm an actor, properties included.

Yes, that example isn't very realistic. It would be better if the proposal was updated to use the Task.Handle storage example demonstrated to me in an earlier pitch thread.

If the model can't be demonstrated with realistic examples, then I'd consider it deficient. Protection of mutable state that's only useful in certain cases that aren't described in the proposal would seem to require further definition at least.

Again, no. In this case, a model would be something like URLRequest or any of the other various model's Alamofire uses internal to describe the Request's settings. As I've describe, Requests encapsulate several long running tasks (most notably, the actual URLSessionTasks being performed) with an eventual asynchronous completion. That is far more controller than model. Session will also benefit from automatic thread-safety but actually performs far less work on different queues.

That's actually use case I don't think is well suited to actors, as you really shouldn't be mutating view controller state from multiple queues. However, I do think integrating how UIViewController will change when Swift concurrency lands (if at all) would be very illuminating and should be part of a proposal or forum thread at some point.

1 Like

runDetached creates a new task that's independent of the current task (if there is one). It doesn't override isolation rules; if you use that task to, say, call an actor function, that part of the task will of course run on the actor.

1 Like

Interesting. So the body of the closure could (would?) run in parallel with other executors, but my calls to the actor itself are executed on that context? I think that would replicate the queue.async behavior I need. I still think it would be better to have actor API dedicated to enqueuing things internally without the need to create a separate Task that just calls back to the actor, but at least the behavior's there.

Further to this, if there is some common patterns the application of actors to which are not suitable, a section in the proposal should outline those (ideally, it should explain why). This may help explain how actors should be applied, just like studying fallacies helps explain clear thinking.

For example, if the concept of futures do not appropriately match with the concept of actors, the proposal could say, “a future should not be considered an actor because it represents a single value, where an actor is a box of mutable state which changes between values. Implementing futures as actors results in awkward, counter intuitive API design” (this isn’t a particularly good example for me because I don’t see how the implementation of a future whose state is modelling the existence of a value is disjoint from the concept of an actor).

It's the fundamental shape of an actor that its API is async. Especially for distributed actors. But the current proposal does allow for some synchronous API (at least last I checked).

So what I'm proposing is to have a synchronous cancel() that will update the state protected by the lock, so isCancelled is true, and then use runDetached() to notify listeners, update the non-sync state etc. So cancel() will be sync, exit without any suspension points, and isCancelled will be true immediately after.

So as @John_McCall already mentioned this does not violate the actor's safety.

Swift Concurrency is of course fundamentally different from using threads or libdispatch, although it looks almost the same. What previously would block (bad) will now cooperatively allow progression of other work (good), and where you would previously need to take care to specify the thread or queue to run something on, the actor now takes care of that and you can use it from any thread. And where you would previously need to be careful with how many threads you spawned, this is now automatically limited to the number of CPU cores.
So I guess you could say that queue.sync() is not needed anymore, and queue.async() is basically just runDetached().

But the most fundamental difference is the cooperative multitasking. Previously we would allocate a separate thread or queue for each distinct piece of work, pretending that it would run in parallel without affecting other work. But of course that doesn't scale; if you have enough work you need the scheduler to force "suspension points" in random places, and the "run in parallel" becomes an illusion. What's worse is that the system really has no idea of the structure of your work, so if a libdispatch worker thread is blocked it needs to spawn more worker threads to ensure the global progression of work.

As I see it, the achilles heel of cooperative multitasking is tasks not cooperating. I.e. long-running tasks with no suspension points, or tasks blocking using "old school" mutexes and the like.

So please move as much state as possible out of MutableState, and please only do the minimal required work while holding the mutex :pray:

1 Like

Even if that's true, that doesn't mean it's a good direction for Swift. I'd rather the feature fit the language than the language adapt to suit the feature. Given that Swift has existing async patterns in common use it seems better to adopt those patterns than build completely new ones. It certainly seems to me that a hybrid approach gives us the best of both worlds: compiler checked safety with an easy to use model. (And the current proposal has removed the isolation keywords, so there's no isolation control officially.)

I'm not sure why you wrote those paragraphs, as Alamofire has already adjusted to the limitations of GCD in version 5. A single serial queue is now shared between the Session and all Requests, will all mutations that aren't synchronous or from outside the framework being done on that queue. MutableState is locked to prevent reentrancy in certain scenarios and to ensure users see state changes immediately after calling particular APIs. Internally this isn't fundamentally different from the actor model proposed, I just don't expose the async nature of my state updates to my users.

Sorry, this was more of a general warning for others going to use a mutex in an actor, which is usually not the best design. I'm sure you have been shaving your MutableState down to the bare minimum :+1:

In Alamofire's current design, MutableState is fairly large, as it encapsulates all of the state that has public API allowing it to mutate from multiple queues (all user facing API). Accesses are engineered to ensure minimal work is done while holding the lock (a wrapper around os_unfair_lock), which is why the lifetime methods are actually in underlyQueue.async calls in the first place. This ensures the state updates are atomic and block other mutating methods from issuing their lifetime methods until the first mutations are done. There's really no other option here.

I was hoping actors would solve a lot of this complexity but without the ability to offer synchronous API that automatically integrates with the actor safety mechanisms, it seems I have no choice but to continue using a lock internally. That greatly reduces the value of actors altogether, which is why I think they need synchronous capabilities in addition to the async ones proposed. Additionally, when I raised this point in one of the first pitch threads, a lock was suggested a way to manage my own state outside of the isolation model. Whether that recommendation has changed, I don't know. We'll find out in the specific isolation API pitch.

2 Likes