Pitch: Protocol-based Actor Isolation

George · November 2, 2020, 5:38pm

One thing that I'd like to dig deeper on is the signature of unsafeSendToActor. The way I see it, sending a value from one actor to another is not exactly the same thing as sending it to a different thread. Indeed, if we do the right thing from a performance perspective, many or most "sends" should end up scheduled on the same thread. The way this protocol is designed, we would have to be defensive and treat any value sent across actor contexts as though it will be accessed from a different thread, and I'm not sure whether or not this is something that can be optimized away in the compiler.

I'm not sure what a good solution to this problem is, but something like unsafeSendToActor(in context: Context) which would allow types to make a decision about how defensive to be may be preferable. I haven't yet come up with a concrete idea of what exactly Context is and how it would be used.

tali · November 2, 2020, 5:46pm

In Adoption of ActorSendable by Standard Library Types, the following example is given:

extension Array : ActorSendable where Element : ActorSendable {
  func unsafeSendToActor() -> Self { self }
}

Is this correct? Isn't reusing the complete array only safe when its elements have value semantics?
For ActorSendable types with a custom unsafeSendToActor() implementation, we'd have to call that for every element, right?

extension Array : ValueSemantic where Element : ValueSemantic {}
extension Array : ActorSendable where Element : ActorSendable {
  func unsafeSendToActor() -> Self { self.map { $0.unsafeSendToActor() } }
}
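
To make the difference concrete, here is a minimal sketch of the element-wise version. ActorSendable is the strawman protocol from the pitch, redeclared here so the snippet stands alone; Counter is a hypothetical reference type that sends itself by deep copy.

```swift
// Strawman protocol from the pitch, redeclared so the sketch is self-contained.
protocol ActorSendable {
    func unsafeSendToActor() -> Self
}

// Hypothetical reference type: a "send" produces an independent copy.
final class Counter: ActorSendable {
    var count: Int
    init(count: Int) { self.count = count }
    func unsafeSendToActor() -> Counter { Counter(count: count) }
}

// Element-wise send: each element decides how it crosses the actor boundary.
extension Array: ActorSendable where Element: ActorSendable {
    func unsafeSendToActor() -> [Element] { map { $0.unsafeSendToActor() } }
}

let original = [Counter(count: 1), Counter(count: 2)]
let sent = original.unsafeSendToActor()
sent[0].count = 99
// original[0].count is still 1. Had the extension returned self,
// both arrays would hold the same Counter objects.
```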

Chris_Lattner3 · November 2, 2020, 5:49pm

Hi George,

I'm sorry, but I don't know what you're implying here. Actors and "threads" are a related-but-different abstraction. The way this works is that the "send" is just part of the caller side responsibility. I can make this more clear in the writing.

This is a very good catch, and you're absolutely right. I will update the proposal, thank you for pointing this out!

-Chris

michelf · November 2, 2020, 6:00pm

I think @George’s point is that if two actors are bound to the same serial queue, there might be no need for defensive copying. Knowing about this somehow in unsafeSendToActor would allow it to skip the copy.

AlexanderM · November 2, 2020, 6:12pm

I think what he's getting at (correct me if I'm wrong, George) is that potentially lots of "value sends" will occur between two actors with a shared synchronization context, e.g. on the same thread. In such a context, it might be possible to safely elide the copying operations performed by a send.

I don’t know what component should be responsible for making that decision, whether it be the actor, the executor, statically decided by the compiler, or some other component.

Imagine if each send was associated with a Context, which knows about the source and destination actor. If they’re executed in different synchronization contexts (e.g. different threads), then the full “send” operation will be done, whatever that entails (potentially an expensive copy). If they’re the same, the send could be replaced with an alternative code path which does a cheap passing of the value.
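
A rough sketch of what that could look like. Everything here is hypothetical: SendContext, its executor IDs, ContextualActorSendable, and Buffer are illustrative names, not part of the pitch. It also assumes the two actors are guaranteed to stay in the same synchronization context, which the pitch does not promise.

```swift
// Hypothetical: describes where a value is coming from and going to.
struct SendContext {
    let sourceExecutorID: Int
    let destinationExecutorID: Int

    // True when both actors run in the same synchronization context.
    var sharesSynchronizationContext: Bool {
        sourceExecutorID == destinationExecutorID
    }
}

// Hypothetical variant of the pitch's protocol that receives the context.
protocol ContextualActorSendable {
    func unsafeSendToActor(in context: SendContext) -> Self
}

// Hypothetical reference type that only copies when it has to.
final class Buffer: ContextualActorSendable {
    var bytes: [UInt8]
    init(bytes: [UInt8]) { self.bytes = bytes }

    func unsafeSendToActor(in context: SendContext) -> Buffer {
        // Cheap path: same context, hand over the same object.
        // Defensive path: different context, make an independent copy.
        context.sharesSynchronizationContext ? self : Buffer(bytes: bytes)
    }
}

let buffer = Buffer(bytes: [1, 2, 3])
let sameContext = SendContext(sourceExecutorID: 1, destinationExecutorID: 1)
let crossContext = SendContext(sourceExecutorID: 1, destinationExecutorID: 2)

let passedThrough = buffer.unsafeSendToActor(in: sameContext)  // same object
let copied = buffer.unsafeSendToActor(in: crossContext)        // new object
```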

anandabits · November 2, 2020, 6:16pm

It depends on the definition of “pure”. If NSMutableString’s deep copying implementation was considered “pure” (because heap allocation is allowed in a “pure” operation) then this definition would not be robust enough. And if that wasn’t allowed then operations like Array.append which may need to allocate a new buffer would not be considered “pure”. So I suspect this definition isn’t robust enough to capture the notion most people have in mind when thinking of what a “value semantic type” is.

Joe_Groff · November 2, 2020, 6:22pm

NSMutableString's implementation isn't pure, because it accesses shared mutable state. Array.append is pure, because it only modifies its value in-place, and the memory allocation inside Array's implementation is mostly behind the value type abstraction. By "pure" I'm talking about an operation-level notion of value semantics; it sounds like the concept we're reaching for here is that concept applied specifically to the actor send operation.

anandabits · November 2, 2020, 6:26pm

Ahh, of course. The deep copy is read-only access but it still reads shared mutable state. Maybe this is a reasonable way to formalize the notion within Swift. If so, we would want to be able to refine the requirement like this (using strawman syntax for purity):

protocol ValueSemantic: ActorSendable {
   func unsafeSendToActor() pure -> Self
}

Karl · November 2, 2020, 6:38pm

I recently had a need for a custom copy constructor and destructor for a struct, and by happy coincidence, it seems to me that ActorSendable really is just all about allowing a type to decide how it is copied to memory isolated by a different actor. Maybe this could be part of a more general API which gives us more control over what happens when values/objects are copied and destroyed. It's worth noting that C++ interop will bring structs with user-provided copy constructors and destructors anyway.

As for ValueSemantics - I'm broadly in favour, depending on exactly how it is defined. Most generic code just assumes that everything has value semantics, and the community has been asking for some kind of generic constraint to formalise that literally for years. The idea that Swift allows both value- and object-oriented programming is something of a lie in that respect; lots of "generic" code is trivial to break once classes get thrown into the mix.

Joe_Groff · November 2, 2020, 6:47pm

I think that ActorSendable is still something we want to be separable from value copying and destroying, since it is useful to be able to send object graphs without forcing those objects to be value types, which isn't always convenient or possible today.

A protocol can't fix that, because the properties of value semantics are not inherent to types, but to operations. A ValueSemantic protocol would only give a false sense of security. Value semantics needs to be modeled at the function level to get the properties people want when they use value types.

Chris_Lattner3 · November 2, 2020, 7:12pm

Ok, I think I see what you mean.

I think this discussion is a much larger issue though - you need a "value send" to happen between any two domains you need isolated from each other. Two actors running on the same thread now doesn't mean they will be on the same thread later - this abstraction over the kernel thread interface is a key aspect of actor designs.

If we were to do something like this, it would be something like an "actor group", where some group of actors shared the same queue under the covers. Such a thing is possible to do, but would vastly complicate the design, and I'm not sure what good it would provide. We can already have one actor have multiple objects within its domain, so why not just use an actor as the 'actor group'?

Agreed.

I don't follow what you mean here - what does it mean for value semantics to be modeled at the function level?

-Chris

AlexanderM · November 2, 2020, 7:20pm

I think part of what's muddying the water is that I (and a few others I've seen) haven't gotten a clear sense of the scale of an actor. That would inform how often actor-to-actor messaging happens.

Someone asked this (but I can't find it now): how many actors do we expect in a typical system? What kinds of things do they model?

Are we talking many thousands of actors like on an Erlang/Elixir system, or just a few, modelling key concurrent business processes?

E.g. is there one actor per potentially-contested variable? Is there one actor for every player connected to a multiplayer game server? One actor per sharable game entity?

anandabits · November 2, 2020, 7:23pm

How is an "actor group" different from the "global actor" support in the proposed design? Isn't this exactly a way to have a group of actors share the same queue? I gave some examples of use cases in the actor thread.

One reason I can imagine multiple objects is not necessarily sufficient is that you might have a layered library design where a lower layer defines an actor and you also want to provide a higher layer that defines an actor implemented in terms of the actor in the lower layer without requiring all interactions to be async. For example, maybe this is a persistence library and you only need one serialization context.

As I understand it, this is roughly equivalent to the notion of a pure function, but with allowance for in-place mutation of uniquely referenced data.

JJJ · November 2, 2020, 8:16pm

As I understand this, it's all implemented by passing values around. It's just a matter of how we use those values. E.g. a pointer is a value, but it may be used to reference a range of other values. And how we use those values is decided at the function level, not the type level. E.g. I suspect it's possible to add a single method to Array that makes it a reference type instead of a value type.

Joe_Groff · November 2, 2020, 10:17pm

As food for thought, what if Codable were the ActorSendable protocol? It already exists, it can be used today to clone an independent object graph from a conforming type by doing a decode(encode(x)) round trip, and it could be leveraged in the future to support distributed actors. The performance of an encoding round trip would of course not be great for local shared-memory actor communication, but we could conceivably add unsafeSendToActor as a new requirement to Codable, with default implementations that return self for value types where nothing outside the value is encoded, use decode(encode(x)) for existing binaries, and teach the compiler how to generate a more efficient object graph copy for other types.
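
The decode(encode(x)) round trip can be tried today. A minimal sketch, assuming JSON as the intermediate format; Node and cloneByCoding are hypothetical names used only for illustration.

```swift
import Foundation

// Hypothetical object graph: a class (reference type) with child references.
final class Node: Codable {
    var label: String
    var children: [Node]

    init(label: String, children: [Node] = []) {
        self.label = label
        self.children = children
    }
}

// Clones any Codable value by encoding it and decoding the result.
// The copy shares no objects with the original graph.
func cloneByCoding<T: Codable>(_ value: T) throws -> T {
    let data = try JSONEncoder().encode(value)
    return try JSONDecoder().decode(T.self, from: data)
}

let root = Node(label: "root", children: [Node(label: "leaf")])
let copy = try! cloneByCoding(root)  // force-try only to keep the sketch short

copy.children[0].label = "changed"
// root.children[0].label is still "leaf": the graphs are independent.
```

This shows why the round trip is safe but slow: every send pays for a full serialization pass, even between actors in the same process.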

The way I see it, whether something is a "value type" is ultimately up to how you use it. Adding two Ints has value semantics, but indexing a global mutable Array with an Int does not, even though there are only "value types" involved. I'm suspicious of any attempts to model value semantics as a type-by-type protocol because types alone aren't sufficient to get the properties we associate with value semantics.
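
A small sketch of that distinction, using only Int and Array. The names sharedTable, add, and lookup are made up for the example.

```swift
// Global mutable state: anything that reads it depends on more than its inputs.
var sharedTable = [10, 20, 30]

// Value semantics: the result depends only on the arguments.
func add(_ a: Int, _ b: Int) -> Int { a + b }

// No value semantics: the result depends on shared state that can change
// between calls, even though only "value types" appear in the signature.
func lookup(_ index: Int) -> Int { sharedTable[index] }

let sum = add(2, 3)       // always 5
let before = lookup(1)    // 20
sharedTable[1] = 99
let after = lookup(1)     // 99: same argument, different result
```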

Jumhyn · November 2, 2020, 10:31pm

This was on my mind as well, but I think the semantic guarantees of Codable are not quite strong enough for what we'd want ActorSendable to do. All Codable really tells us is that a type "can convert itself into and out of an external representation," but the behavior of that type after conversion to/from the external representation is left completely unspecified.

It would be perfectly valid to conform to Codable a class which is intended for use as a singleton and reads/writes lots of global data, on the assumption that Codable is only used for persistence between program launches. However, such a conformance is obviously unsuitable for ActorSendable, since two separate instances would be referencing the same global data.

ktoso · November 3, 2020, 1:23am

I don’t think it’s necessary or desirable to add a group concept like this.

The need itself is valid and real. It can be addressed by special executors. Specifically: each actor is bound to an executor, and there can be various executors. We can build a “SingleThreadExecutor” or something like this, and pass it to multiple actors on construction. It is then known that they all share the same “real” thread, and this can be checked on dispatch with recipient.executor.shouldEnqueueOrExecuteDirectly(from: self.executor) (silly pseudo code, you get the idea).
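
Fleshing out that pseudo code a little, purely as a sketch: SketchExecutor, its two requirements, drain(), and dispatch(from:to:_:) are hypothetical names, and this toy SingleThreadExecutor only queues closures and never touches a real thread.

```swift
// Hypothetical executor abstraction; not the pitch's or the standard library's API.
protocol SketchExecutor: AnyObject {
    // Whether work sent from `sender` may run inline instead of being enqueued.
    func shouldExecuteDirectly(from sender: SketchExecutor) -> Bool
    func enqueue(_ job: @escaping () -> Void)
}

// Toy executor: actors sharing one instance are assumed to share one thread.
final class SingleThreadExecutor: SketchExecutor {
    private var pending: [() -> Void] = []

    func shouldExecuteDirectly(from sender: SketchExecutor) -> Bool {
        sender === self  // same executor means no hop is needed
    }

    func enqueue(_ job: @escaping () -> Void) { pending.append(job) }

    // Runs everything that was enqueued so far, in order.
    func drain() {
        let jobs = pending
        pending.removeAll()
        jobs.forEach { $0() }
    }
}

// The check made on dispatch: run inline or hop to the recipient's queue.
func dispatch(from sender: SketchExecutor, to recipient: SketchExecutor,
              _ job: @escaping () -> Void) {
    if recipient.shouldExecuteDirectly(from: sender) {
        job()
    } else {
        recipient.enqueue(job)
    }
}

let eventLoop = SingleThreadExecutor()
let otherLoop = SingleThreadExecutor()
var log: [String] = []

dispatch(from: eventLoop, to: eventLoop) { log.append("direct") }  // runs now
dispatch(from: otherLoop, to: eventLoop) { log.append("queued") }  // deferred
eventLoop.drain()                                                   // runs "queued"
```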

It’s an extension of the idea of: if we’re calling an async function on “the same actor as I am” there’s no need to enqueue but we can keep running that function — but extending it to awareness of “do I need to hop or not”.

Such a mechanism would also enable Swift NIO to express itself in the form of actors, I believe, since there are many ChannelHandlers which may want to be actors; they often invoke each other, but they are all guaranteed to be on the same event loop (which is exactly 1 thread).

So I would not recommend grouping actors together based on how they execute; we should rather attempt to solve this with executor semantics, I believe.

--

Small side-note: it is not unheard of to break the "actor can run anywhere" illusion and actually peek into them for optimization's sake, so I would not be nervous about such things.

xwu · November 3, 2020, 3:08am

But could we achieve something similar but more semantically sound by having protocol Sendable : Codable, with a default implementation?

Jumhyn · November 3, 2020, 3:15am

Could you clarify exactly what you're suggesting? As far as I can see it would be sound to have some sort of ActorSendableByCodable: Codable with an appropriate default implementation for unsafeSendToActor, but I don't think it makes sense to have ActorSendable be a refinement of Codable itself. I don't see any reason why actor-sendability implies serializability.

xwu · November 3, 2020, 3:43am

Well, you mention that the semantic guarantees of Codable aren't strong enough; I guess you're saying here that those guarantees are actually neither sufficient nor necessary. Hmm.