[Concurrency] Actors & actor isolation

Chris_Lattner3 · November 5, 2020, 7:18pm

Hi Doug,

I think this is a fairly fundamental difference in world view. I'll lay out what I'm suggesting:

As proposed: Actors can have sync and async methods, and properties/subscripts (really, their getters and setters) are always sync. Intra actor calls can use both directly. Also, inout uses across actor boundaries are not allowed.
Unlike what is proposed: cross actor references obey access control like normal, both for methods and properties. All cross actor references are gated by ActorSendable check on parameters and results. No special case for let.
Unlike what is proposed: Any reference to a cross-actor sync function is allowed (subject to standard access control and ActorSendable check), but is implicitly an async reference and needs to be awaited.
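A minimal sketch of these rules, assuming the `actor` syntax that later shipped in Swift 5.5 rather than the `actor class` spelling used in this thread. `Counter`, `increment`, `incrementTwice`, and `useCounter` are hypothetical names chosen for illustration, and the sketch only reads a property across the boundary; it does not show cross-actor setters:

```swift
// Hypothetical example type; assumes a Swift 5.5+ toolchain.
actor Counter {
    private(set) var value = 0

    // An ordinary synchronous method.
    func increment() -> Int {
        value += 1
        return value
    }

    // Intra-actor call: sync members are used directly, no `await`.
    func incrementTwice() -> Int {
        _ = increment()
        return increment()
    }
}

// Cross-actor reference: the same sync members are implicitly async
// from the outside, so every use is awaited.
func useCounter(_ counter: Counter) async -> Int {
    _ = await counter.increment()       // sync method, awaited across the boundary
    _ = await counter.incrementTwice()  // likewise
    return await counter.value          // property getter, also awaited
}
```

No wrapper methods are needed here: the caller reaches the existing sync members directly and the `await` marks the boundary.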

The payoff of this is several fold:

Our complicated access control model doesn't get any more complicated.
You don't need wrappers for sync functions, which introduce the naming problems mentioned above.
All properties/subscripts are directly usable across actor boundaries, since their getter/setter is just a sync function that returns/takes a T and is treated like any other sync function.

This is a simple and consistent model. It doesn't require special cases for lets and doesn't complicate access control. It doesn't change our resilience model with lets. It defines away most of the need for @actorIndependent.

Incidentally, I find @actorIndependent to be highly dubious - why would these things be defined on the actor instances if they are independent of it? Isn't this just a static actor member?

-Chris

Joe_Groff · November 5, 2020, 7:32pm

@actorIndependent operations can still read immutable state. Actor instances can have let properties.

Chris_Lattner3 · November 5, 2020, 8:13pm

That's my point, with the model shift I describe above, I don't see why this is very important. Recall that public let properties will be directly accessible with an await.

If the goal is to achieve a performance optimization (avoiding a queue hop) then that can be handled according to the as-if rule at the SIL level. If the goal is to syntax optimize away the 'await', then I would recommend punting this attribute out of the basic model as "not critical" syntactic sugar and reconsider it later when the model as a whole settles and we have experience to know that it is justified.

-Chris

Joe_Groff · November 5, 2020, 10:37pm

Maybe it's less important. The difference between async or not could be significant if the member is trying to satisfy a synchronous protocol requirement, or be captured in a synchronous closure.

ktoso · November 6, 2020, 1:51am

To make this a bit more specific: what specific alternative semantics are being proposed that will allow us to implement an actor's description without going async onto the actor's queue only to print its name (or address, or anything that we can use to uniquely identify it "for humans")?

actor class Greeter: CustomStringConvertible { 
  let name: String
  init(name: String) { self.name = name }

  @actorIndependent // can't access non-constant or actorlocal state; good.
  var description: String { "Greeter(name: \(name))" }
}

to allow:

// what actor do I even have in hand here?
print("I have: \(greeter)")

// which is really equal to 
// greeter.name
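For comparison, a sketch of the same shape in the syntax that later shipped in Swift 5.5, where the `nonisolated` modifier covers the role discussed here for @actorIndependent. The `describe` helper and the sample name are hypothetical:

```swift
// Assumes a Swift 5.5+ toolchain.
actor Greeter: CustomStringConvertible {
    let name: String
    init(name: String) { self.name = name }

    // `nonisolated` members may only touch immutable, sendable state such as
    // `name`, so they can satisfy a synchronous protocol requirement.
    nonisolated var description: String { "Greeter(name: \(name))" }
}

// Hypothetical helper: a synchronous caller can describe the actor
// without hopping onto it.
func describe(_ greeter: Greeter) -> String {
    "I have: \(greeter)"
}
```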

I have worked with "let's pretend Codable solves everything" prototypes for a long time and this solves a few of the issues, but not the big pain points. So I see the Sendable discussion more as a "1.5" world, rather than really an alternative proposal, to be honest... (and this may be viable, but I don't think it's a full alternative).

I'm on board with the "allow sync functions to be called on an actor, yet they become async implicitly" -- this is fine and not too worrying. Doing the same for properties is quite weird to be honest -- an actor's purpose in life is to isolate state, not make it trivial to call into it. The let special case in the current proposal is weird, but it makes a bit more sense; I'd be nervous about random vars being allowed to be exposed like that, because it incentivizes the wrong programming model. Actors are not just "slap an actor around it and keep mutating stuff on it"; they benefit from expressing tasks as "do these things please" and sending this to the actor, rather than making it easy to keep mutating the 10 vars the actor may have internally.

Lantua · November 6, 2020, 2:07am

I did have a similar sentiment. If you're updating the user's profile card, you'd never want to do this:

self.name = async reloadName()
self.image = async reloadImage()

It is particularly problematic because it's very natural to reach for, but the updating of name and image are on different partial tasks, which is not what we want. I did say in an old thread that async function should always be actor-independent.

That said, it's not great either to have async always be actor-independent. We'd need to have an actor-guarantee block to combine sync functions from the same actor together

func foo() async {
  await asyncFunction1()

  await(ActorX) {
    syncFunction1()
    syncFunction2()
  }

  await asyncFunction2()
}

which limits how variables are declared. You can't declare a shared variable inside the await block (or else it won't be visible for subsequent blocks).

This pitch seems to acknowledge this problem of unintentional partial-task splitting and suggest that having await should be enough of a signal for that. I'm not sure yet what I think about that.
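To make the partial-task splitting concrete, here is a rough sketch in the syntax that later shipped in Swift 5.5, assuming reentrant actors. `ProfileCard`, the stub loaders, and both method names are hypothetical:

```swift
// Hypothetical stand-ins for real loads.
func reloadName() async -> String { "Ada" }
func reloadImage() async -> String { "ada.png" }

actor ProfileCard {
    private(set) var name = ""
    private(set) var image = ""

    // Two partial tasks: at the second `await` the actor may run other work,
    // which can observe the new name paired with the old image.
    func reloadSplit() async {
        name = await reloadName()
        image = await reloadImage()
    }

    // Gather first, then commit both fields with no suspension point between.
    func reloadTogether() async {
        let newName = await reloadName()
        let newImage = await reloadImage()
        name = newName
        image = newImage
    }

    func snapshot() -> (name: String, image: String) { (name, image) }
}
```

In the second method the two assignments sit in the same partial task, so no other call on the actor can see a half-updated card.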

tclementdev · November 6, 2020, 10:13am

Right. Actors encourage and make it easy to have a lot of suspension points in a program, which seems really dangerous because it has the potential to introduce a lot of subtle ordering bugs. Won't it make programs more unpredictable and difficult to reason about?

There are a lot of unanswered questions. Is the developer able to realize whether partial-task splitting would happen? Wouldn't it be hard to say for sure that it won't? If the developer thinks it's a problem, how do they go about fixing it?

frameworklabs · November 6, 2020, 11:10am

To be fair, you have the same issues resulting from interleaved “tasks” with the continuation passing style today. One advantage of async/await is that it makes this fact more obvious.

Lantua · November 6, 2020, 11:10am

To be clear, I'm not talking about actor specifically, but top-level code in async function in general.

async is the only place one can split a task, similar to how try is the only place that can throw. So I don't think it's unexpected of Swift to notify the reader with just that. I'm sure anyone could learn quickly that await splits the partial tasks if we only go so far as to say "well, that's how it is".

What's uncertain for me is whether they would internalize it, and naturally reach for the right syntax when writing, including splitting the partial tasks properly.

TBH, I'm not completely comfortable with the part that we have actor-isolated async functions, but I fail to come up with any better rule.

tclementdev · November 6, 2020, 11:28am

Yes. But my point is all of this makes it easier than ever before to go massively async and we should probably wonder about the consequences when developers start adopting actors and going that way.

tclementdev · November 6, 2020, 11:31am

I agree, this is a general problem, not strictly related to actors. But actors are making a big push in that direction. My point is that we might want to be careful about introducing features that make it more likely for developers to encounter those problems.

Chris_Lattner3 · November 7, 2020, 6:19am

In the base proposal (that I'm advocating that we start with) you'd have to do an await.

I'm inferring from your question that the point of @actorIndependent is to allow actors (who are by design intended to be async creatures to clients) to conform to sync protocols. Why is this a good thing? This can only work in a terribly narrow case: again, let properties are not enough - you need let + ValueSemantic, + thread safe copyable or something. Even then, they can only work for toy examples like this where the only state they access is computed at init time.

Why add complexity to the language to solve such a trivial case? There are library level ways to handle this, e.g.:

have a struct projection that is queried async that conforms to sync protocols
introduce CustomStringConvertibleAsync protocol for this specific case
Don't allow this at all and force clients to deal with it.
Provide an unsafe thing that reaches across actor boundaries synchronously without await'ing.

All of these work without language extensions. The @actorIndependent concept is a ton of complexity, and I don't know what justifies it.
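The first option, a struct projection that is queried async and itself conforms to the sync protocol, might look roughly like this in the syntax that later shipped in Swift 5.5. `Snapshot`, `snapshot()`, `greet()`, and `greetingsSent` are hypothetical names:

```swift
actor Greeter {
    let name: String
    private var greetingsSent = 0
    init(name: String) { self.name = name }

    func greet() { greetingsSent += 1 }

    // A plain value type carries the sync conformance instead of the actor.
    struct Snapshot: CustomStringConvertible, Sendable {
        let name: String
        let greetingsSent: Int
        var description: String {
            "Greeter(name: \(name), greetingsSent: \(greetingsSent))"
        }
    }

    // Queried with `await` from outside; may read mutable state safely
    // because it runs on the actor.
    func snapshot() -> Snapshot {
        Snapshot(name: name, greetingsSent: greetingsSent)
    }
}
```

Unlike the @actorIndependent version, the projection can include mutable state, at the cost of one `await` to obtain it.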

-Chris

Chris_Lattner3 · November 7, 2020, 5:02pm

ktoso:

actor class Greeter: CustomStringConvertible { 
  let name: String
  init(name: String) { self.name = name }

  @actorIndependent // can't access non-constant or actorlocal state; good.
  var description: String { "Greeter(name: \(name))" }
}

I was thinking about this last night, and protocol conformance needs to be examined much more closely. I don't understand how the proposed semantics will work in practice, and I think it would be great to expand on the description more. I don't think that solving conformance to sync methods in a way that can't touch mutable state within the actor is particularly useful.

It really isn't clear to me what the best option is here. It is important for actors to be able to conform to protocols, since protocols are the primary abstraction method over which generics and existentials work. One simple approach (probably the base model) is to say that actors can never fulfill sync requirements: allowing them to would allow existentials and generics to "escape" an internal unsynchronized pointer to the actor internals, which breaks actor isolation.

Consider self-type constraints for example: how do you implement Equatable between two actors? There is no way in-model to have two actors synchronously accessible at the same time.

If we went with a model where actors could never fulfill sync requirements at the base model, then we could always extend the model and consider something like @actorIndependent on its own merit. I just don't see how it is useful enough to be worth the complexity cost.

BTW, actors are classes, so you should be able to use === and we should have some library defined helpers for working with actor handles. That handle could provide a description method.

-Chris

frameworklabs · November 7, 2020, 6:14pm

What are the author’s thoughts on behaviors for actors?
With these one could define and dynamically enable subsets of the actor API.
This could e.g. help to define the allowed interleaving at suspension points.

ktoso · November 9, 2020, 1:28pm

It is not the primary use case, but it is one of the applications/implications of @actorIndependent; it is there to address what you have raised as a concern in the overall roadmap thread:

Thus, @actorIndependent and its friend @actorIndependent(unsafe) exist to address this concern and allow developers to step out of the rigid rules and use external synchronization whenever necessary.

As such, the @actorIndependent is the weaker form of this, and one might even say the less "intended use" one -- though both make sense really.

As for the typesystem aspects of it, I'll leave it up to Doug or John to comment more.

We did talk about allowing an @instantaneous hint for async functions which would not force the "await" on call sites -- this shows up a lot with the Task APIs where we just have them async because they must be called from an async context, not because they'll ever suspend -- but the conformance to sync protocols is a very different beast. It is not instantaneous; rather, it is either unsafe ("break through the model", i.e. @actorIndependent(unsafe)) or known to be safe for some reason (@actorIndependent), which, yeah, just touching "thread safe state" is -- and how we define that is indeed an open hole in the model which is waiting for the "2.0"...

Personally I'd very much like a ValueSemantics protocol, it would have solved many issues, but not all of them. Again, I'll leave it to the compiler folks to argue about the viability of one model over the other though.

Trying to stick to well defined terms only; we don't have "Actor handles", I'll assume you mean a "reference to an actor".

I guess there might have to be some helpers, but coming from other implementations there really isn't all that much an actor reference needs to do – they need to provide equality (potentially serialization of references if we'd like to jump ahead a little bit) but that's about it.

We are not modeling parent/child actor hierarchies in the core model (it can be built on top), nor are we enforcing naming schemes or "props"/"settings" of actors (the "configure an actor" really being handled by overriding pieces of the Actor protocol rather than any new types to do this).

In other runtimes this is fairly logical and well understood and we should follow the same precedent imho. The current shape of the APIs makes it non obvious though because the thinking that it's "just a class" (it isn't just that, imho) clouds the perception.

Having an actor reference means that we point at some actor, somewhere, maybe it's not even on the same node, that's where the actor model actually shines, allowing modeling and understanding of distribution using the same mental model as any other communication.

As such, equality of "actors" is never the same as equality of classes or structs, or anything you "actually really have in your hands", but it really means comparing actor addresses (in a high level meaning, not pointers/addresses). Equality of actor references is then simple: does it point to the same actor or not. Perhaps it is a remote actor, yet it is possible to uniquely identify it. And such equality on actor references is tremendously useful – consider "I have a set of actors I take care of, and I get notified to let one of them go away", so I need to remove it from my observed set; thus we need equality of actors in that meaning.

I'm jumping ahead here a little bit, but equality of actors makes a lot of sense and is defined exactly like this in all other runtimes I'm aware of (Akka (actor ref equality), Erlang (PID equality), and Orleans (via grain identifier equality)).
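A rough sketch of this kind of identity-based equality, written in the syntax that later shipped in Swift 5.5. The `Worker` actor, its `id` field (standing in for an actor address), and `stopWatching` are all hypothetical:

```swift
actor Worker: Hashable {
    // Hypothetical constant identifier standing in for an actor address.
    let id: Int
    init(id: Int) { self.id = id }

    // Static members are not isolated to an instance, and `id` is an immutable
    // sendable value, so no `await` is needed to compare two workers.
    static func == (lhs: Worker, rhs: Worker) -> Bool { lhs.id == rhs.id }

    nonisolated func hash(into hasher: inout Hasher) { hasher.combine(id) }
}

// "I get notified to let one of them go away": remove it from the observed set.
func stopWatching(_ worker: Worker, in watched: inout Set<Worker>) {
    watched.remove(worker)
}
```

Equality here never looks at mutable actor state; two references are equal when they name the same worker, even if they are distinct objects.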

In other runtimes it is more explicit, because we talk about ActorRef<Behavior> or Grain<Behavior> or just PID rather than say "it's just like a class", which brings me to:

It may be counter productive to call them actor class in my opinion, it keeps muddying the water rather than helping clarify one's understanding that they're different beasts once one really gets to use them.

The similarities end at being "a reference rather than a value", however the semantic capabilities the two expose are very different. Beginning from what one can call on them, how they execute, and where they even are located (i.e. not necessarily in the same memory space).

They also cannot inherit from non-actors; so an actor cannot inherit from a class (it could easily violate the substitution principle and allow incorrect synchronous calls to an actor), so they really aren't all that much like classes after all, very similar, but different enough.

This is yet another reason to move towards spelling actors as actor rather than actor class, because the similarities dwindle quite quickly the more one leans into the real world usage of them and restrictions they impose. Also, actors cannot extend classes, but they could ex

I'm not so sure about usefully defining / guaranteeing ===; it really "depends" on where the actor lives. I don't think it is valuable to === actor references other than as the odd trick to avoid a "full" ==.

ktoso · November 9, 2020, 1:59pm

I think you mean like in the good ol' actor model where "actor == mailbox + behavior" right? You are right that "becoming other behaviors" is helpful and can help avoiding code like this:

actor class Waiter {
  var waiting = false

  func hello() { 
    if waiting {  // repeated in every function :<
      helloWaiting()
    } else {
      helloReady() 
    }
  }
}

I guess the specific shape of a solution to this was popularized by Akka with its "become(behavior)" feature. This is the core of Akka and Akka Typed; however, there it is easy to model this because:

a) Akka classic is untyped; as such, you can at any point in time easily become any other "receive block" and things just work. I.e. you can easily do:

receive { context, message in
  handleThing(context.myself)
  context.become(ignoreOtherMessagesUntilWeGetAReplyFromHandleThing)
}

ignoreOtherMessagesUntilWeGetAReplyFromHandleThing = receive { ... }

so "receive" is referred to as a behavior, and they can easily swap in and out. It indeed is a very useful pattern.

b) it gets more difficult with types though; the new behavior has to become a behavior that can handle the exact same set of messages as the current one.

This exists in Akka typed and one can easily do:

// pseudo code
let initial: Behavior<MyMessage> = receive { ... in 
  switch message {
    case hello: // "ready handling"
      return waiting // return behavior to handle next message with
      // ...
    }
}

let waiting: Behavior<MyMessage> = receive { ... in 
  switch message {
    case hello: // "waiting handling"
      return waiting // return behavior to handle next message with
      // ...
    }
}

So with this style, there does not have to be an "if waiting" because the entire behavior is the "waiting".

You'll notice though... this works well when behaviors are values; implementation wise it simply means the actor has a var behavior which it runs whenever it gets a message.
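As a runnable illustration of "behaviors are values", here is a small sketch in the syntax that later shipped in Swift 5.5. It is not Akka's API: `Message`, `Behavior`, `Behaviors`, and `send` are hypothetical names, and the `name` field exists only so the result can be observed:

```swift
enum Message: Sendable { case hello, done }

// A behavior handles one message and returns the behavior for the next one.
struct Behavior: Sendable {
    let name: String
    let receive: @Sendable (Message) -> Behavior
}

enum Behaviors {
    static func ready() -> Behavior {
        Behavior(name: "ready") { message in
            switch message {
            case .hello: return Behaviors.waiting()  // "ready handling", then become waiting
            case .done:  return Behaviors.ready()
            }
        }
    }

    static func waiting() -> Behavior {
        Behavior(name: "waiting") { message in
            switch message {
            case .hello: return Behaviors.waiting()  // "waiting handling", no `if waiting`
            case .done:  return Behaviors.ready()
            }
        }
    }
}

actor Waiter {
    // The actor just holds the current behavior and runs it per message.
    private var behavior = Behaviors.ready()

    func send(_ message: Message) -> String {
        behavior = behavior.receive(message)
        return behavior.name
    }
}
```

This only works because every message goes through one `send` entry point, which is exactly what the "messages look like methods" style gives up.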

How does this translate to language-embedded actors that are similar to classes, though, which are definitely much nicer to use but are harder to make flexible on the inside...? For what it's worth, this is a model more similar to Orleans, which to my knowledge does not offer such a "become"-like feature (because it is hard to fit into the "looks like a class" model).

It is a bit difficult to express this because there isn't a "Message", it is "all my functions which are async and public, because those can be called". One could force expressing all these messages first as a protocol, and then implementing it -- then one can force implementations "you can become(behavior:)" but only the same protocol you initially had... In practice though, this ends up very weird very fast... I have played around with expressing this for a while but have not found a satisfying way to do this in the "messages look like methods" style for Actors which we are embracing.

Another way to do this though, is to push the state machine one layer below and have an

enum State { 
  case waiting
  case ready(SomeValue) 

  func hello() async -> String {
    switch self { 
    case .ready(let value): return readyHello(value)
    // ... 
    }
  }
}

actor class Greeter: GreeterProtocol { 
  var state: State = ... 
  func hello() async -> String {
    // maybe change `state` here?
    return await state.hello()
  }
}

and have all logic handling the calls inside it, forcing the switch over self each time -- it's verbose, but very easy to model explicit state machines this way. We've used a similar style in our SWIM membership implementation -- the entire logic resides in an "instance" and can be invoked by an actor (or something else, like in that repo -- just NIO handlers).

It hides away the state management a bit, and makes the state machine very explicit which is great.
But it is sadly a bit verbose, and it's not quite the same as "become".

Some style of this pattern though can be very useful if what you're modeling is nicely expressed as a state machine – indeed then handling interleaved calls is easier, we simply notice we are in the "wrong state" and handle such a call appropriately.
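A complete, runnable variant of the explicit state machine, as a sketch in the syntax that later shipped in Swift 5.5. The `String` payload, `becomeReady(with:)`, and the returned messages are hypothetical stand-ins for the elided parts above:

```swift
enum State {
    case waiting
    case ready(String)

    // All per-state logic lives here, forcing a switch over `self` each time.
    func hello() -> String {
        switch self {
        case .waiting:
            return "not ready yet"          // the "wrong state" is handled explicitly
        case .ready(let value):
            return "hello, \(value)"
        }
    }
}

actor Greeter {
    private var state: State = .waiting

    func becomeReady(with value: String) { state = .ready(value) }

    // The actor only owns the state and delegates the decision to it.
    func hello() -> String { state.hello() }
}
```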

It helps a little bit... but I'm not sure it's a real solution if we'd never allow non-reentrant actors at all. Non-reentrant actors are important and useful IMHO, but we still need to discuss this in depth (I've been working on trade-off examples and a writeup for this).

frameworklabs · November 9, 2020, 9:39pm

Thanks for your insights on this (sub) topic of actors.
Just a raw idea: maybe something similar to extensions could help to provide a scope to overload methods per behavior.
But maybe your explicit state pattern is doing the job equally well, especially given that it’s unclear how many devs would use such a behavior feature anyhow.

Regarding reentrancy into an actor - yes, that’s needed to prevent deadlocks but maybe only those methods which are called back on me when I call another actor need to be so.

phlebotinum · November 10, 2020, 1:09pm

@ktoso Just quickly wanted to say Thank You for all these detailed and well written posts/replies. Top!

Chris_Lattner3 · November 11, 2020, 12:58am

Again, I don't really understand this - it seems like a very strong violation of the actor model, which is all about async communication. Allowing sync communication which does internal local locking is not how the actor model works.

I agree that it is important to integrate types that have internal synchronization (as mentioned in the ActorSendable discussion), but they are not themselves actors. Many of these types won't want the overhead (e.g. a queue), design patterns, or APIs associated with actors. Conflating the two of these seems like a pretty strong design problem with the proposal.

Furthermore, the rationale in the proposal doesn't address this use case at all.

I think we're talking across each other. I understand you can do reference identity comparisons between actors, I meant a classical Equatable conformance (which requires looking into two actors) isn't in model, because you can't force two different actors to be synchronized at the same time (at least without going out of model).

I completely agree as I mentioned in my longer post upthread -- this should be an actor declaration. Swift already has functions and classes which have reference semantics. Clarifying this as its own kind of decl would be much cleaner.

-Chris

ktoso · November 12, 2020, 9:07am

I am very familiar with the actor model and its implementations.

Sure, all this does not mean people should do this in general, but there are very narrow, specific valid use cases for it.

I think where the confusion or disagreement really originates is in our perspectives/backgrounds and what we mean with the same words, not an actual disagreement. Note also that we're, by design, not going for complete memory isolation like e.g. Dart's actors/isolates do.

Your reply here made me realize this:

Sure, we're obviously agreeing here.

I'm not sure the "classical Equatable" phrasing makes sense at all here so I'm confused why it comes up. There is no "member-wise equality" for actors and there must not be.

There can however, for good reasons, be equality implementations that do not use "reference" (as in "pointer") equality, but need to implement the equality by means of some constant value that the actor contains. This means answering the question of "Are you 'representing' the same 'resource' as that other actor or not?". To implement this, either it has to be built into the language, or enough escape hatches must exist, which is why @actorIndependent inside an actor makes total sense to me.

To explain this even more...

This is a result of Swift's specific implementation and take on the actors: it's both a "limitation"/"feature" I guess? There is ONE type, "the actor class." There is no "Reference<Actor>", onto which implementations are able to put these extra constant identifiers. One example that needs this form of equality is "proxies", in all shapes and forms. In other words, other implementations are able to have Reference<SomeActor>.someID while we don't have that type, so it must be on the actor class.

But coming back to the actual discussion point:

Sure. And perhaps this is where our discussion went off the rails? And we fixated on the weird super edge case I explained above, rather than the usage of this attribute not in actors:

@actorIndependent can also be applied to top level declarations after all -- it simply means it is "safe to access from any actor":

@actorIndependent(unsafe)
let concurrentHashMap = ...

actor Hello { 
  func hi() { concurrentHashMap.put(...) } // OK, no executor hop involved
}

This also is the same style as one would annotate something with a globalActor:

@someGlobalActor // typical example maybe "Main/UIActor" etc
let thingy = ...

actor Hello { 
  func hi() { thingy.hello(...) } // OK, potentially executor hop involved
}

Does this perhaps clarify the intended use?
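Both cases can be approximated in the syntax that later shipped, as a sketch under stated assumptions: an externally synchronized type marked `@unchecked Sendable` stands in for the @actorIndependent(unsafe) global, and a `@globalActor` type stands in for @someGlobalActor. `ConcurrentHashMap`, `Shared`, `SomeGlobalActor`, and `Thingy` are hypothetical, and the shared values live in static properties rather than top-level lets to keep the example self-contained:

```swift
import Foundation

// Externally synchronized: a lock, not an actor, protects the storage.
final class ConcurrentHashMap: @unchecked Sendable {
    private let lock = NSLock()
    private var storage: [String: Int] = [:]

    func put(_ key: String, _ value: Int) {
        lock.lock(); defer { lock.unlock() }
        storage[key] = value
    }

    func get(_ key: String) -> Int? {
        lock.lock(); defer { lock.unlock() }
        return storage[key]
    }
}

enum Shared {
    static let concurrentHashMap = ConcurrentHashMap()
}

@globalActor
actor SomeGlobalActor {
    static let shared = SomeGlobalActor()
}

// State isolated to the global actor instead of protected by a lock.
@SomeGlobalActor
final class Thingy {
    static let shared = Thingy()
    private var greetings = 0
    func hello() -> Int { greetings += 1; return greetings }
}

actor Hello {
    func hi() -> Int? {
        Shared.concurrentHashMap.put("greeting", 1)   // OK, no executor hop involved
        return Shared.concurrentHashMap.get("greeting")
    }

    func hiThingy() async -> Int {
        await Thingy.shared.hello()                   // OK, potentially an executor hop
    }
}
```

The difference shows up at the call site: the lock-protected map is used synchronously, while the global-actor state requires `await`.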