[Concurrency] Actors & actor isolation

I agree that this is a critical problem and that solving it is a huge opportunity and will be a huge contribution for Swift concurrency. Unfortunately, "phase 1" of the plan here doesn't solve this. It introduces a memory-unsafe actor model akin to Akka actors. This is a very useful step in that it provides a design pattern to help structure concurrent code, but it does not go far enough IMO. Also, taking a half step here will introduce serious problems with getting to memory isolation and race safety.

As I mentioned in the roadmap thread, we can pretty easily fix this. I will try to get an outline of this together today or tomorrow to share with the community. UPDATE: it's in this thread.

Here is a detailed review of this draft of the proposal. I include a few large topics that need detailed discussion on their own, then a number of smaller points at the end. I'm really thrilled to see the progress in this area!


actor class vs actor

Much of the discussion upthread is about actor class vs actor. I tend to agree with people that actors are primal enough to be worth burning a keyword on. Here is some rationale:

  1. The documentation and diagnostics will inevitably all talk about "actors" and not "actor classes", so it makes sense to align the language with this.
  2. Actors can't subclass classes and vice versa; they are a "different thing".
  3. They are "another kind" of reference type in Swift (along with classes, functions, unsafe pointers, etc).

At the very least, I would recommend capturing some of the tradeoffs in the alternatives considered section. On the flip side, calling them actors means we would have to survey all of the places we use classes and reconsider them, e.g. class methods, how to rationalize actors subclassing NSObject (see below), etc. I think it is reasonable for actors to not have static members and class methods though, as the whole idea is to get rid of global state.


Separating access control from async for cross-actor reference validity checks

"Synchronous functions in Swift are not amenable to being placed on a queue to be executed later. Therefore, synchronous instance methods of actor classes are actor-isolated and, therefore, not available from outside the actor instance." ... "It should be noted that actor isolation adds a new dimension, separate from access control, to the decision making process whether or not one is allowed to invoke a specific function on an actor. " <== Please let's not do this! :slight_smile:

I mentioned this to John previously, but it seems better to keep access control orthogonal to cross-actor reference issues. The proposed design will end up producing a lot of async wrappers for sync functions just to allow those sync functions to be called across actor boundaries. There is no need for this boilerplate:

actor class BankAccount {
   .. state..

   // This method is useful both within and from outside the actor.
   public func computeThing() -> Int {
      ...
   }

   // I need to manually write a wrapper, and now I have a naming problem.  :-(
   public func computeThingForOthersToUse() async -> Int {
     return computeThing()
   }
}

Instead, I'd recommend making the model be that cross-actor calls are defended by access control like normal, and a cross-actor call to a sync function is implicitly async (thus requiring an await at the call site):

   // some other actor can call the sync function, because it is public!
   await use(myBankAccount.computeThing())

The compiler would synthesize the thunk just like it does reabstraction thunks. This provides a more consistent programming model (not making our access control situation more complicated) and eliminates a significant source of boilerplate. Similarly (as part of the base async proposal), it should be possible to fulfill an async requirement in a protocol with a normal sync method implementation.

This realigns the async modifier on actor methods to be about the behavior of the method, not about whether it can be called by other actors, which is what access control is about.
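
To make the call-site model concrete, here is a minimal sketch. It assumes the `actor` spelling and implicit cross-actor `await` that Swift 5.5 later shipped, not the pitch's `actor class` syntax; the `balance` storage, `depositTwice(amount:)`, and `report(on:)` are hypothetical names added for illustration.

```swift
actor BankAccount {
    private var balance: Int = 0

    // Synchronous: nothing in the body needs to suspend.
    func deposit(amount: Int) {
        balance += amount
    }

    // Synchronous, and useful both within and from outside the actor.
    func computeThing() -> Int {
        balance * 2
    }

    func depositTwice(amount: Int) -> Int {
        // Intra-actor calls stay synchronous: no await is required here.
        deposit(amount: amount)
        deposit(amount: amount)
        return computeThing()
    }
}

// From outside the actor the same synchronous methods are implicitly async,
// so the call site needs `await` but no hand-written async wrapper.
func report(on account: BankAccount) async -> Int {
    await account.deposit(amount: 10)
    return await account.computeThing()
}
```

Under this model the `async` keyword on a method says something about the method's own behavior, while visibility across actors is left to access control.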

Your deposit(amount:) example is a great illustration of the problem here: there is nothing about its behavior or implementation that requires it to be internally suspendable. Declaring it as async means that any intra-actor calls will have to await it for no reason.

Furthermore, doing this removes a significant amount of complexity elsewhere in the proposal: access to cross-actor state (whether it be let or var) is gated simply by access control. Any cross-actor access would be correctly async, and synchronization in the most trivial cases allowed by the proposal would be optimized out by the compiler using the as-if rule. This keeps the programmer model simple and consistent.

More related points in the "let" section next:


Cross actor let property access

I am very concerned about allowing direct cross-actor access to let properties, because we don't have the ability to support computed let properties. Allowing this will harm our API evolution of properties: we currently allow things to freely move from let properties to vars with public getters, but this will break that. I don't think that "let-ness" is resilient across library boundaries at all right now (for good reason).

Furthermore, as you mention, reference types completely break the actor memory safety guarantees here, the entire stated purpose of this proposal. :-) You don't want cross-actor uses of this thing to have access to data you're mutating within the reference type. You need something like the reference type proposal (which I'm hoping to work on) to gate this.

I feel like you're trying to walk an awkward line here, and I don't think it will work well: actors are supposed to be islands that can only be "talked to" asynchronously. The "let's and @actorIndependent things can be talked to synchronously" breaks the contract and muddies the water.

Overall, I would recommend subsetting this out of the initial proposal and discussing it as a later extension. It isn't core to the programming model, and introduces a lot of issues.


Global actors

On global actors in the detailed design section, I don't understand the writing and what is being conveyed here. There are both small and large examples of this. Some larger questions:

  • What does "The custom attribute type may be generic. " mean? Does this mean that @globalActor struct X<T> { is allowed? If so, the semantics are that there is one instance of the actor for each dynamic instantiation of the type T, right? I think that this is required because shared will be instanced multiple times.

    This is a very powerful capability: is there a use case for it? If not, I'd recommend subsetting it out of the initial version of the proposal, it can always be added later.

  • I don't understand what this means: "Two global actor attributes identify the same global actor if they identify the same type." Don't they have to be lexically identical attributes if that is the case?

  • There are some implied semantics of a declaration being marked as a global actor, but I'm not sure what they are.

  • The whole discussion of "propagation" of the global actor attribute is vague and I find it to be confusing.

I would recommend splitting this whole discussion of global actors out to its own sub-proposal. The issues involved are complicated and could use their own motivation, examples, and exploration to develop them, and this is additive on top of the base actor model. To be clear, I'm not saying that we should adopt actors without solving this problem, I just think that it would be easiest to review and discuss it as a separate thing.


Other

Some more minor comments and questions:

  • Writing/framing nitpick: "The primary difference is that actor classes protect their state from data races." --> I don't think this is the primary difference between actors and classes. The primary difference is that actors have a task/queue associated with them, and they are used as a design pattern in concurrent programs. Actors are not guaranteed to protect state, e.g. in the face of unsafe pointers.

  • The behavior with escaping closures and actor self makes sense to me.

  • The "Escaping reference types" section is really troubling as I mentioned at the top. I don't think that this proposal can stand alone without a solution to this problem.

  • Actor isolation also needs a solution for global state like global variables and static members of classes. I don't think the "proposed solution" section or "detailed design" touches on this at all.

  • Another writing issue: The discussion of @globalActor and @UIActor in the "proposed solution" section is too vague for me to understand it.

  • "As a special exception described in the complementary proposal Concurrency Interoperability with Objective-C, an actor class may inherit from NSObject." --> It isn't clear to me why this is needed. Isn't it enough to mark the actor as @objc? I thought all @objc things already inherit from NSObject?

  • As I mentioned above, I think that the way you are conflating access control with cross-actor references is confusing and problematic. This @actorIndependent attribute is another example of this. I think this whole topic needs further consideration. Flipping the behavior as mentioned above seems like it would simplify the proposal significantly, by relying on our (already overly powerful) existing access control mechanisms.

  • Shouldn't the closure parameter to run be @escaping? If not, you can trivially violate the actor safety properties due to the self capture rules described earlier in the proposal.

  • On enqueue(partialTask:): I love that this is user definable. Why can't it be marked final? This seems like it should only be defined on root actors though. I'd love to see a longer exploration of this topic on its own, because just this single method has a huge set of tradeoffs that are worth exploring.

  • "Non-actor classes can conform to the Actor protocol, and are not subject to the restrictions above. This allows existing classes to work with some Actor-specific APIs, but does not bring any of the advantages of actor classes (e.g., actor isolation) to them." Ok, out of curiosity, why is this important? I can see the utility of having an actor protocol that unifies all the actors, but I don't see why it is useful for normal classes to conform. I also don't see any harm, just curious what the utility is.

  • As I mentioned a couple times above, I would rather not have @actorIndependent at all, I'd rather that cross-actor accesses be gated by normal access control, and any cross-actor reference just being async. This seems like it will lead to a simpler model, less boilerplate, and less language complexity.

  • I also don't think there is any great need to have actors be able to provide non-async protocol requirements. This seems directly counter to the approach of actors. Such a need can be handled with simple struct wrappers, which seems like it would factor the language complexity better.

  • I don't understand what is being conveyed in the "Overrides" section. An example would be very helpful.

Overall, I'm very very excited to see the progress on this. This is going to transform the face of Swift programming at large!

-Chris

22 Likes

My first thought was that developers can manage this interleaving just fine. And it could actually be a nice challenge :slight_smile: But I fear bugs due to interleaving can be harder to detect and debug than deadlocks.

func f() async {
  guard ...something depending on the actor's state
  ...work relying on the guard
  g() // Call to ordinary sync function
  ...work relying on the guard
}

then some future maintainer makes g() async and fixes all call sites in a big sweep, so this becomes

func f() async {
  guard ...something depending on the actor's state
  ...work relying on the guard
  await g() // Call to async function
  ...work relying on the guard **ASYNC FAILURE**
}

This potential bug can be elusive, and potentially hard to debug.

6 Likes

I don't think this is worse than a completion callback though. Consider this:

func f() async {
  guard ...something depending on the actor's state
  ...work relying on the guard
  g() { // Call to function doing async work, calling closure on completion
    ...work relying on the guard **ASYNC FAILURE**
  }
}

With the callback you don't really know from looking at the code whether it is synchronous or not. With await it's clear there's a suspension point there. So in theory, if you can get accustomed to await, things become easier to read than what we have now.

I agree this potential bug can be elusive and potentially hard to debug. It's not really a new thing however. The function should recheck its invariants after each await, or each callback, if it still depends on them.
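
As a rough illustration of rechecking invariants after a suspension point, the sketch below uses the `actor` syntax that later shipped in Swift 5.5. The `Inventory` actor, its `stock` counter, and `confirmPayment()` are hypothetical stand-ins for "something depending on the actor's state" and for `g()`.

```swift
actor Inventory {
    private var stock = 1

    // Hypothetical async dependency; any await is a potential suspension point.
    private func confirmPayment() async {
        await Task.yield()
    }

    // Returns true only when an item was actually removed from stock.
    func purchase() async -> Bool {
        guard stock > 0 else { return false }
        await confirmPayment()
        // Other calls on this actor may have run while this one was suspended,
        // so the earlier guard no longer proves anything: check again.
        guard stock > 0 else { return false }
        stock -= 1
        return true
    }

    func remaining() -> Int { stock }
}
```

Without the second guard, two overlapping calls to `purchase()` could both pass the first check and drive `stock` negative.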

There's no solution to this if you want things to stay asynchronous. If you block other tasks from running on the current actor, g() simply becomes a synchronous call.

1 Like

Yes, g() becomes a synchronous call as seen from the actor; not otherwise. And that's the whole point of async/await in my opinion. That you map concurrent code to a simplified sequential view where it's easy for humans to reason about what happens. The actor isolation proposed here is sort of the missing link that makes async/await work as it should :slight_smile:

It would be awesome if both modes were supported initially, though. Either by having a more verbose way to introduce a suspension point where reentrancy is allowed, or by somehow marking an atomic block of code where async calls will not allow reentrancy.

1 Like

FYI class constraints were merged with AnyObject in SE-0156, and AnyObject is now the preferred spelling.

3 Likes

Thinking about this interleaving issue at suspension points: maybe an await could throw, or somehow indicate to the caller that an interleaved partial task changed shared state, so that the caller can refetch state or change its computation?

That sounds complicated. If we need to check for mutation after every partial task, we might as well assume that they always mutate. And there's no distinction between an incomplete mutation (that needs guarding) and a completed one, is there?

It may be useful to have an explicit barrier, but that is no different from synchronously waiting for the async task, which should already be provided in some form.


+1. The touchEnded example has a similar problem:

@UIActor func touchEnded(...) { ... }

@UIActor func touchEndedAsync(...) async {
  touchEnded()
}

Having both versions is of little use. You have essentially only one choice in any scenario (different actor, same sync actor, etc). It's all busy work here even if we use the same name* for both sync and async functions. While touchEnded is an event handler and probably isn't called directly, UI-related functions like updateUserUI are also good cases.

* Overloading doesn't help here. The compiler could interpret the inner call as a call to the async version, which it currently does, causing infinite recursion.

(hopped from the other thread, if you know what I mean :smirk:)

The code in question:

@MainActor
func foo() async {
  let a = await task1()
  let b = await task2()
}

Is there a case where it wouldn't be optimized? Since we could hop directly from

Task1Actor -> Task2Actor

instead of

Task1Actor -> MainActor -> Task2Actor

There will definitely be some changes in ordering, but I don't yet see where that could be a problem.

That example should be reliably optimized because there’s no significant code between two calls with known actor-independence. We may have to impose some high-level rules about when we can reasonably assume actor-independence, or else we’ll find ourselves completely blocked by theoretical actor dependencies like, say, a global function reading from actor-protected state somehow.

2 Likes

I don't know of a way to express it in this pitch's design so I'll modify the design a bit to demonstrate:

/// Instead of applying a global actor attribute such as `@UIActor` an actor conforms to this protocol
/// and specifies the global actor context using an associated type
protocol GlobalActor: Actor {
    associatedtype ActorContext
}

This change in the design allows me to express this:

final actor class Store<Value, Action, ActorContext>: GlobalActor { ... }

Instances of this type are bound to a global actor without knowing which global actor. As far as I can tell this is not possible with the current pitch for two reasons: you cannot constrain ActorContext to be an @globalActor and you cannot apply an @ActorContext to the Store class.

With this modification, we are also able to apply constraints on the ActorContext of a GlobalActor. For example, you could write code that is generic over another actor, but must have the same execution context as Self, or a concrete known execution context such as UIActorContext. The compiler could take advantage of this to allow synchronous access to synchronous API of actors from other actors in generic code (as long as they are constrained to share the same ActorContext). I don't see any way to support this in the pitch as written. Here's an example:

struct SomeGenericView<A: ObservableObject>: View 
  where A: GlobalActor, A.ActorContext == UIActorContext 
{ 
   @ObservedObject let actor: A

   // The compiler would need to know this can only be called on main / UIActorContext.
   // I'm not sure how to express that...
   var body: some View {
      // sync access to members of the actor made available via additional constraints
   }
}
3 Likes

I think if we were going to allow this sort of actor-generics, it would need to support actor classes as well; can you figure out a way to make that work?

The Store in my example is an actor class so I don’t understand the question. Can you elaborate?

Another example from my library is:

final actor class TupleStore2<Store0: GlobalActor, Store1: GlobalActor>: GlobalActor
  where Store0.ActorContext == Store1.ActorContext
{
    typealias ActorContext = Store0.ActorContext
}

In my library TupleStore2 is actually a struct. Ideally a heap allocation could be avoided but I'm not sure how that could be expressed in terms of the current pitch.

If I understand correctly, you are trying to write generics that are generic over an actor. But it looks to me that in general you’re trying to just carry the actor as a type, which will only work for global actors.

I had an idea that didn’t make it into these pitches of an “actor accessory” type, which could have a let actor property that would dynamically specify the right actor. That approach seems to still allow the sort of static reasoning that you want while also being theoretically extensible to generics.

I don’t understand why Store itself is an actor in your example.

If we wanted to extend my design to support a unique queue per-instance like the default actor class I think we could do that like this:


/// Only `Never` conforms
protocol _ActorContext {}

/// User-defined global actor context types conform to this protocol:
protocol GlobalActorContext: _ActorContext {}

// If necessary, this protocol could get special treatment allowing for existentials
// until generalized existentials are a feature
protocol Actor {
    // If users ever specify an explicit ActorContext without conforming to `GlobalActor` 
    // they get a warning or error
    associatedtype ActorContext: _ActorContext = Never
}
protocol GlobalActor: Actor {
    associatedtype ActorContext: GlobalActorContext
}

With this design in hand, my Store class would use a conditional conformance:

final actor class Store<Value, Action, Context> { ... }
extension Store: GlobalActor where Context: GlobalActorContext {}

API that needs to rely on actors sharing a serialization context would be written to constrain the context of both actors to be identical and to conform to the GlobalActorContext protocol (and therefore only available on some instances). Any API whose implementation does not rely on that constraint would be available on all instances, even those with a unique serialization context.

It is a class that serializes access to state. I don't think the details are very relevant to this discussion.

Can you explain what an ActorContext is that’s different from an actor? Feels like there’s just a lot going on here that isn’t clicking for me. You just want a type that serializes work but sits parasitically on another actor?

First, I'm amazed that Swift gets to have actors. Congrats to the team. IMHO this could single-handedly renew people's interest in the language for server-side environments. Now for my question:

What's the target number of actors expected to run in a given system (let's say, order of magnitude per CPU)? From my understanding, great care has been put in the proposal to separate the actor definition system from the thing actually running the functions (the task executor, if I understand correctly). However, I suppose that the authors of the proposal had at least some ideas of the types of usage, and the acceptable performance tradeoffs they're ready to make. For example:

  • In a video game server, are we expected to spawn one actor per "arena / map / game", one actor per "connection / player", or one actor per map item (every object in the world is its own actor)? This could be the difference between having 10 actors per CPU, 1000, or a million.
  • In a social network mobile app, are we expected to run one actor for UI (default) and one actor for background tasks (let's say, server-side communication)? Or one actor per friend? Or one actor per message (dealing with its own status, likes, replies, etc.)? Here again, the orders of magnitude could be widely different.
3 Likes

An ActorContext is modeling what your pitch calls a "global actor" in the type system. My design would have UIActorContext instead of @UIActor. An "actor" would be a class instance that lives within the rules of the actor system. An "actor context" would be the serialization context in which actor instances can run.

This distinction is already latent in your design where actors that are annotated with @UIActor live in a shared serialization context. I think it's useful to make it explicit.

I recall hearing of or reading about an actor system that had a similar notion. They called it a "vat" - a vat could have many actors in it and it provided a serialization context for all of those actors, allowing code to take advantage of the knowledge that this context is shared. I haven't been able to dig up any references on this, but have used the idea in my own code and found it very useful. This allows you to have some control over (and often reduce) the granularity of serialization boundaries.

The reason I introduced the actor context protocols is to allow Never to be used as an ActorContext that means “no shared context, each instance has its own queue”. If we didn’t need to do that (or if we had != constraints) we wouldn’t need the context protocols.

The programming model is not that different from yours in terms of conceptual or syntactic weight:

@globalActor
struct MyGlobalActor { }

@MyGlobalActor
actor class MyActor {}

vs

struct MyGlobalActorContext: GlobalActorContext { }

actor class MyActor: GlobalActor {
    typealias ActorContext = MyGlobalActorContext
}

It’s slightly more verbose, but has all of the advantages that come with working within the existing language and type system. I think magic attributes make sense in some cases, but not when a protocol-based design is both possible and useful without imposing a meaningful burden on less experienced programmers (i.e. without violating the principle of progressive disclosure).

2 Likes

I find this idea to be very interesting. We certainly do need to allow async functions on actors, because you need to be able to interact with the asynchronous world. However, letting synchronous actor functions be called as async on non-self instances lets us maintain the model's consistency without introducing lots of sync/async overloading or pushing code to async that doesn't need it.

Doug

9 Likes

@Lantua mentioned this as well. I like the idea.

You won't quite get here. There are no async properties, so you can't asynchronously access a var within an actor.

You can define @actorIndependent computed properties that can be used from outside. They're restricted to not touch mutable state in the actor, but it's a reasonable evolution path if you started with a let.

No need to be dramatic. The proposal is completely up-front about the tradeoffs being made by the design and its authors are open to discussion on the right design and tradeoffs.

Any solution to the reference-type problem would have to deal with this. The actorlocal notion mentioned in the road map is an aspect of the type; a cross-actor reference to a let whose type is actorlocal would be ill-formed (because, you know, it's local to that actor).

I can see why this is a problem for your ActorSendable design, because let access is synchronous and there's no point where you can safely do the implicit copy.

Whether the potential for trouble in the ActorSendable design makes let access a bad idea or not, I'm not sure: I'd like to see how more of the design shakes out.

You've conflated the two features here to draw a fairly strong conclusion. We've talked about let above; if you have concerns about @actorIndependent, it would be best for you to convey those directly and also consider the use cases that @actorIndependent fulfills.

I'm not opposed to this. We're trying to break things up into digestible proposals without having so many little proposals running around that folks can't keep track.

This is what global actors are for, but as you noted earlier...

As noted above, you can't simply "flip the behavior" here. I recommend you consider how to conform an actor class to CustomStringConvertible once you've taken away the let behavior and @actorIndependent.
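
For illustration only, here is a sketch of what such a conformance can look like. It assumes the spelling that later shipped in Swift 5.5, where the role of `@actorIndependent` is played by the `nonisolated` modifier and actors are declared with `actor` instead of `actor class`; the `Account` type and its members are hypothetical.

```swift
actor Account: CustomStringConvertible {
    // Immutable state of a Sendable type; safe to read without hopping to the actor.
    let id: Int
    private var balance = 0

    init(id: Int) {
        self.id = id
    }

    // Satisfies the synchronous protocol requirement. Reading `balance` here
    // would be rejected by the compiler, because it is actor-isolated mutable state.
    nonisolated var description: String {
        "Account #\(id)"
    }

    func deposit(_ amount: Int) {
        balance += amount
    }
}
```

A caller can then use `String(describing:)` or `print` on an `Account` synchronously, while `deposit(_:)` still requires `await` from outside the actor.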

Inheriting from NSObject is currently the only way to get conformance to NSObjectProtocol. You also need to inherit from NSObject (directly or indirectly) to mark a class as @objc. I looked at other options, and accepting this NSObject-shaped inheritance wart seems like the least bad option.

Yes, or we should add the notion of a "concurrent, non-escaping" function type to the mix. We had a thread on this somewhere, but I can't find it at the moment. Short version: we could add "concurrent" to function types and it would provide some benefits here over always falling back to "escaping", but it also complicates the type system. Tradeoffs.

My thinking here was that one could take a class that might be hard to turn into an actor, e.g., because it's enmeshed with a non-actor class hierarchy, and conform to the Actor protocol. We could then say that calls to async functions on such a class would get scheduled on the actor, so you're getting the hop-to-queue behavior for async calls (== less boilerplate) for free, but not the advantages of enforced data isolation.

That said, this is all very fuzzy and I'm not at all convinced that this kind of half-actor class is going to be useful in practice.

I had not expected this viewpoint at all, but this explains your comments about @actorIndependent. Not being able to allow actors to conform to existing, synchronous protocols at all seems like it would make actors hard to integrate into existing Swift programs. I mentioned CustomStringConvertible, but Identifiable also comes to mind as being important, and a distributed actor system would sure like to make Codable work.

Thanks for the feedback.

Doug

12 Likes

You mean the ones started here?

+1 (as in, it doesn't sound very useful in practice).

1 Like