Pitch: Protocol-based Actor Isolation

Ahh, of course. The deep copy is read-only access but it still reads shared mutable state. Maybe this is a reasonable way to formalize the notion within Swift. If so, we would want to be able to refine the requirement like this (using strawman syntax for purity):

protocol ValueSemantic: ActorSendable {
   func unsafeSendToActor() pure -> Self
}

I recently had a need for a custom copy constructor and destructor for a struct, and by happy coincidence, it seems to me that ActorSendable really is just all about allowing a type to decide how it is copied to memory isolated by a different actor. Maybe this could be part of a more general API which gives us more control over what happens when values/objects are copied and destroyed. It's worth noting that C++ interop will bring structs with user-provided copy constructors and destructors anyway.

As for ValueSemantics - I'm broadly in favour, depending on exactly how it is defined. Most generic code just assumes that everything has value semantics, and the community has been asking for some kind of generic constraint to formalise that literally for years. The idea that Swift allows both value- and object-oriented programming is something of a lie in that respect; lots of "generic" code is trivial to break once classes get thrown into the mix.

I think that ActorSendable is still something we want to be separable from value copying and destroying, since it is useful to be able to send object graphs without forcing those objects to be value types, which isn't always convenient or possible today.
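To make the "send an object graph without making it a value type" idea concrete, here is a minimal sketch. It assumes the strawman `ActorSendable` protocol with a single `unsafeSendToActor()` requirement as discussed in this thread; the `Node` class is purely hypothetical, and the real pitch may shape the requirement differently.

```swift
// Strawman protocol from the pitch; the requirement name follows the thread.
protocol ActorSendable {
    /// Returns a value that is safe to hand to another actor.
    func unsafeSendToActor() -> Self
}

// A plain value type has no shared mutable state, so it can send itself.
extension Int: ActorSendable {
    func unsafeSendToActor() -> Int { self }
}

// Hypothetical reference type: a singly linked list node.
final class Node: ActorSendable {
    var value: Int
    var next: Node?

    init(value: Int, next: Node? = nil) {
        self.value = value
        self.next = next
    }

    // Deep-copies the whole chain so the receiver shares nothing with the sender.
    func unsafeSendToActor() -> Node {
        Node(value: value, next: next?.unsafeSendToActor())
    }
}

let original = Node(value: 1, next: Node(value: 2))
let sent = original.unsafeSendToActor()
sent.next?.value = 99  // Mutating the copy leaves the original untouched.
```

The class stays a class; only the act of sending produces an independent graph.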

A protocol can't fix that, because the properties of value semantics are not inherent to types, but to operations. A ValueSemantic protocol would only give a false sense of security. Value semantics needs to be modeled at the function level to get the properties people want when they use value types.

Ok, I think I see what you mean.

I think this discussion is a much larger issue though - you need a "value send" to happen between any two domains you need isolated from each other. Two actors running on the same thread now doesn't mean they will be on the same thread later - this abstraction over the kernel thread interface is a key aspect of actor designs.

If we were to do something like this, it would be something like an "actor group", where some group of actors shared the same queue under the covers. Such a thing is possible to do, but would vastly complicate the design, and I'm not sure what good it would provide. We can already have one actor have multiple objects within its domain, so why not just use an actor as the 'actor group'?

Agreed.

I don't follow what you mean here - what does it mean for value semantics to be modeled at the function level?

-Chris

I think part of what's muddying the water is that I (and a few others I've seen) haven't gotten a clear sense of the scale of an actor. That would inform how often actor-to-actor messaging happens.

Someone asked this (but I can't find it now): how many actors do we expect in a typical system? What kinds of things do they model?

Are we talking many thousands of actors like on an Erlang/Elixir system, or just a few, modelling key concurrent business processes?

E.g., is there one actor per potentially-contested variable? Is there one actor for every player connected to a multiplayer game server? One actor per sharable game entity?

How is an "actor group" different from the "global actor" support in the proposed design? Isn't this exactly a way to have a group of actors share the same queue? I gave some examples of use cases in the actor thread.

One reason I can imagine multiple objects is not necessarily sufficient is that you might have a layered library design where a lower layer defines an actor and you also want to provide a higher layer that defines an actor implemented in terms of the actor in the lower layer without requiring all interactions to be async. For example, maybe this is a persistence library and you only need one serialization context.

As I understand it, this is roughly equivalent to the notion of a pure function, but with allowance for in-place mutation of uniquely referenced data.

As I understand this, it's all implemented by passing values around. It's just a matter of how we use those values. E.g. a pointer is a value, but it may be used to reference a range of other values. And how we use those values is decided at the function level, not the type level. E.g. I suspect it's possible to add a single method to Array that makes it a reference type instead of a value type.

As food for thought, what if Codable were the ActorSendable protocol? It already exists, it can be used today to clone an independent object graph from a conforming type by doing a decode(encode(x)) round trip, and it could be leveraged in the future to support distributed actors. The performance of an encoding round trip would of course not be great for local shared-memory actor communication, but we could conceivably add unsafeSendToActor as a new requirement to Codable, with default implementations that return self for value types where nothing outside the value is encoded, use decode(encode(x)) for existing binaries, and teach the compiler how to generate a more efficient object graph copy for other types.
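For reference, the decode(encode(x)) round trip mentioned above can be sketched like this today. The `Document` class and the `codableCopy` helper are hypothetical names for illustration, and JSON is just one arbitrary choice of intermediate format; nothing here is part of the pitch.

```swift
import Foundation

// Hypothetical reference type used only for illustration.
final class Document: Codable {
    var title: String
    var tags: [String]

    init(title: String, tags: [String]) {
        self.title = title
        self.tags = tags
    }
}

// Clones any Codable value by encoding it and decoding the result.
// JSON is an arbitrary choice of intermediate format here.
func codableCopy<T: Codable>(_ value: T) throws -> T {
    let data = try JSONEncoder().encode(value)
    return try JSONDecoder().decode(T.self, from: data)
}

let draft = Document(title: "Pitch", tags: ["actors"])
// try! keeps the demo short; real code would handle the error.
let copy = try! codableCopy(draft)
copy.tags.append("codable")  // Only the clone changes.
```

The clone is a distinct object, which is the property an actor send would rely on, at the cost of a full serialization pass.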

The way I see it, whether something is a "value type" is ultimately up to how you use it. Adding two Ints has value semantics, but indexing a global mutable Array with an Int does not, even though there are only "value types" involved. I'm suspicious of any attempts to model value semantics as a type-by-type protocol because types alone aren't sufficient to get the properties we associate with value semantics.
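A small sketch of that distinction, using made-up names (`sharedScores`, `sum`, `score(at:)`). Both functions traffic only in "value types", yet only the first behaves the way people expect value semantics to behave.

```swift
// Shared mutable state, visible to every caller.
var sharedScores = [10, 20, 30]

// Depends only on its arguments: same inputs always give the same result.
func sum(_ a: Int, _ b: Int) -> Int {
    a + b
}

// Takes only an Int, yet its result depends on the global array.
func score(at index: Int) -> Int {
    sharedScores[index]
}

let before = score(at: 1)
sharedScores[1] = 99          // Someone else mutates the global.
let after = score(at: 1)      // Same argument, different answer.
```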

This was on my mind as well, but I think the semantic guarantees of Codable are not quite strong enough for what we'd want ActorSendable to do. All Codable really tells us is that a type "can convert itself into and out of an external representation," but the behavior of that type after conversion to/from the external representation is left completely unspecified.

It would be perfectly valid to conform to Codable a class which is intended for use as a singleton and reads/writes lots of global data, on the assumption that Codable is only used for persistence between program launches. However, such a conformance is obviously unsuitable for ActorSendable, since two separate instances would be referencing the same global data.
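Here is a rough illustration of that failure mode. `SettingsStore` and `globalSettings` are hypothetical; the point is only that a perfectly valid Codable conformance can produce a "copy" that still shares state with the original.

```swift
import Foundation

// Global state that lives outside any encoded value (hypothetical).
var globalSettings: [String: String] = [:]

// Codable is satisfied, but every instance reads and writes the same global.
final class SettingsStore: Codable {
    let namespace: String

    init(namespace: String) { self.namespace = namespace }

    func set(_ value: String, for key: String) {
        globalSettings["\(namespace).\(key)"] = value
    }

    func value(for key: String) -> String? {
        globalSettings["\(namespace).\(key)"]
    }
}

let store = SettingsStore(namespace: "app")
// try! keeps the demo short; real code would handle the errors.
let data = try! JSONEncoder().encode(store)
let clone = try! JSONDecoder().decode(SettingsStore.self, from: data)

clone.set("dark", for: "theme")
// The write through the clone is visible through the original.
let seenByOriginal = store.value(for: "theme")
```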

I don't think it's necessary or desirable to add a group concept like this.

The need itself is valid and real. It can be addressed with special executors: each actor is bound to an executor, and there can be various executors. We could build a “SingleThreadExecutor” or something like it and pass it to multiple actors on construction. It is then known that they all share the same “real” thread, and this can be checked on dispatch with something like recipient.executor.shouldEnqueueOrExecuteDirectly(from: self.executor) (silly pseudo code, you get the idea).

It extends an existing idea: if we’re calling an async function on “the same actor as I am”, there’s no need to enqueue and we can keep running that function. Here that is widened to a general awareness of “do I need to hop or not”.
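A toy model of the "hop or not" check might look like the following. Everything here is hypothetical (the `Executor` protocol, `SingleThreadExecutor`, and `enqueueOrExecuteDirectly(from:_:)` are illustrative names, not proposed API), and a plain array stands in for a real thread's job queue.

```swift
// Hypothetical executor abstraction; none of these names are proposed API.
protocol Executor: AnyObject {
    func enqueue(_ job: @escaping () -> Void)
}

// Stands in for an executor backed by exactly one thread.
final class SingleThreadExecutor: Executor {
    private(set) var pending: [() -> Void] = []

    func enqueue(_ job: @escaping () -> Void) {
        pending.append(job)
    }

    // Runs everything queued so far, in order.
    func drain() {
        while !pending.isEmpty {
            pending.removeFirst()()
        }
    }
}

extension Executor {
    // Run inline when caller and recipient share an executor; otherwise hop.
    func enqueueOrExecuteDirectly(from caller: Executor, _ job: @escaping () -> Void) {
        if caller === self {
            job()
        } else {
            enqueue(job)
        }
    }
}

let loop = SingleThreadExecutor()
let other = SingleThreadExecutor()
var log: [String] = []

loop.enqueueOrExecuteDirectly(from: loop) { log.append("inline") }
loop.enqueueOrExecuteDirectly(from: other) { log.append("hopped") }
let logBeforeDrain = log  // Only the same-executor job has run so far.
loop.drain()              // The cross-executor job runs once the loop gets to it.
```

Two actors constructed with the same executor instance would take the inline path when calling each other, which is the grouping effect without a separate "group" concept.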

Such a mechanism would also enable Swift NIO to express itself in the form of actors, I believe, since there are many ChannelHandlers which may want to be actors; they often invoke each other, but they are all guaranteed to be on the same event loop (which is exactly 1 thread).

So I would not recommend grouping actors together based on how they execute; we should rather attempt to solve this with executor semantics, I believe.

--

Small side-note: it is not unheard of to break the "actor can run anywhere" illusion and actually peek into them for optimization's sake, so I would not be nervous about such things.

But could we achieve something similar but more semantically sound by having protocol Sendable : Codable, with a default implementation?

Could you clarify exactly what you're suggesting? As far as I can see it would be sound to have some sort of ActorSendableByCodable: Codable with an appropriate default implementation for unsafeSendToActor, but I don't think it makes sense to have ActorSendable be a refinement of Codable itself. I don't see any reason why actor-sendability implies serializability.

Well, you mention that the semantic guarantees of Codable aren't strong enough; I guess you're saying here that those guarantees are actually neither sufficient nor necessary. Hmm.

I thought about this some more, and while having some indication that a value is "safe" to send across actor contexts is desirable, I'm not convinced there is a single solution that will work for the great heterogeny of actors I expect for Swift.

One only-slightly-contrived example that comes to mind is actor A sending a reference to actor B that is unsafe for actor B to use except to provide that reference back to actor A. Would this reference or the type containing it be ActorSendable? Additionally, some actors will have stricter requirements as to what kind of values can be sent to them. For instance, an actor meant to run on a TPU may only accept values representable in that context. All this leads me to believe that it is the role of the actor to make sure the arguments and return values in its API can safely be sent to or from its actor context. An ActorSendable protocol may thus create a false sense of security, leading to confusion.

That said, there may be many patterns which occur quite frequently in practice that require some operation to be performed on all values being sent to or from an actor context. We may be able to do something clever here to minimize the amount of boilerplate. A simple solution that comes to mind is to create a "function wrapper" type with which actors can process arguments and return values, either across the entire actor or just for particular functions. You could imagine such wrappers being created that roundtrip through Codable, or perform deep copies of Objective-C data structures.
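One possible reading of the "function wrapper" idea, sketched with ordinary higher-order functions. The names `roundTrip`, `isolated`, and `Basket` are made up for illustration, and a real design would presumably hook into actor declarations rather than wrap closures by hand.

```swift
import Foundation

// Copies a value by encoding and decoding it (JSON chosen arbitrarily).
func roundTrip<T: Codable>(_ value: T) throws -> T {
    try JSONDecoder().decode(T.self, from: JSONEncoder().encode(value))
}

// Hypothetical "function wrapper": wraps a function so that its argument and
// its result are both copied at the boundary.
func isolated<A: Codable, R: Codable>(
    _ body: @escaping (A) throws -> R
) -> (A) throws -> R {
    return { argument in
        let result = try body(roundTrip(argument))
        return try roundTrip(result)
    }
}

// Hypothetical mutable reference type passed across the boundary.
final class Basket: Codable {
    var items: [String]
    init(items: [String]) { self.items = items }
}

// The wrapped function mutates its argument and returns it.
let addApple = isolated { (basket: Basket) -> Basket in
    basket.items.append("apple")
    return basket
}

let mine = Basket(items: ["pear"])
// try! keeps the demo short; real code would handle the error.
let returned = try! addApple(mine)
```

The caller's `mine` is never touched, because the wrapped function only ever sees a copy, and the caller only ever receives a copy of the result.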

In summary, for me it feels like the goal of defining what it means to be "safe to send across actor contexts in general" is intractable, but there are things we can do to make the creation of specific classes of actors easier and less error prone.

A principled approach to this would be to wrap the reference in a struct, where the wrapped value is not accessible to actor B. I believe this would be ActorSendable and meet most of the various proposed definitions of ValueSemantic. This encapsulation also frees B from knowing about the representation of the token, and should have zero runtime overhead.
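Roughly what I have in mind, with hypothetical names throughout (`Connection`, `ConnectionToken`, `ServiceA`). Access control does the hiding here; in a real project B would live in another file or module, so `fileprivate` would actually keep it out.

```swift
// Hypothetical resource owned by "actor A"; B must never touch it directly.
final class Connection {
    var bytesSent = 0
}

// Opaque token: the reference is private, so holders can only pass it around.
struct ConnectionToken {
    private let connection: Connection

    fileprivate init(_ connection: Connection) {
        self.connection = connection
    }

    // Only code in this file (standing in for A) can unwrap the token.
    fileprivate func unwrap() -> Connection { connection }
}

// Stand-in for actor A's API.
struct ServiceA {
    private let connection = Connection()

    init() {}

    func makeToken() -> ConnectionToken {
        ConnectionToken(connection)
    }

    func send(_ byteCount: Int, using token: ConnectionToken) -> Int {
        let connection = token.unwrap()
        connection.bytesSent += byteCount
        return connection.bytesSent
    }
}

let a = ServiceA()
let token = a.makeToken()      // B receives this and can only hand it back.
let total = a.send(128, using: token)
```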

If you expect this pattern to be common you might consider that too much ceremony, but I don’t think it’s obviously onerous enough to make ActorSendable not worth it.

I don’t see how this is a problem at all; actor methods will still be typed.

Thanks for the feedback, a couple of random updates and comments:

Just to clarify the discussion about "actor groups" above, I am not in favor of introducing another abstraction here. I think that actors by themselves are enough.

I still think it makes sense for ActorSendable to be distinct from Codable, because certain things are actor sendable but may not be codable, e.g. an internally synchronized reference type. However, it could make sense to make ActorSendable, Codable, and ValueSemantic refine each other. I think it would be best to discuss this in the ValueSemantic thread.
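As a sketch of the "sendable but not codable" case: the class below guards its state with a lock, so sharing the same reference is fine, but `NSLock` is not Codable and the compiler could not synthesize a conformance. `ActorSendable` is the strawman protocol from this thread and `SharedCounter` is a hypothetical example type.

```swift
import Foundation

// Strawman marker standing in for the pitched protocol.
protocol ActorSendable {
    func unsafeSendToActor() -> Self
}

// Internally synchronized counter: safe to share, so "sending" it is just
// handing over the same reference.
final class SharedCounter: ActorSendable {
    private let lock = NSLock()
    private var count = 0

    // Increments under the lock and returns the new total.
    func increment() -> Int {
        lock.lock()
        defer { lock.unlock() }
        count += 1
        return count
    }

    func unsafeSendToActor() -> SharedCounter { self }
}

let counter = SharedCounter()
let received = counter.unsafeSendToActor()
_ = counter.increment()
let value = received.increment()  // Both names refer to the same counter.
```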


Also, I forgot to mention how closures / function types fit into this. They are a bit weird because they aren't nominal types, but we should be able to support them. I added a section to future work to describe the approach I recommend. I welcome comments and thoughts as always.

-Chris

For what it's worth, I have wanted something like your design for Equatable and Hashable in the past as well. If we're going to start blessing function types with protocol conformances based on their captures, it would be great if this approach could be extended to cover those use cases (even if as a future enhancement).

Equality and hashing are a bit trickier for function types, because you have to decide what "equality" means for code. Is it pointer equality of the code address? What about thunks? What about duplicated code, what about merged code that happens to be identical, etc. This is one of the reasons that we didn't want closures to be comparable in general.

-Chris

I almost commented on that. The behavior I have always wanted is identity based on source code. Thunks should be transparent to the programming model. If a programmer happens to copy code and have two source-identical functions I would not expect those to compare equal despite having identical behavior.

There was some previous discussion of this in the thread about closures as literals for anonymous structs. For things like SwiftUI, we'd want some closures to be equatable/hashable by code identity plus the equality of their contexts. This is consistent with how key path equality works too, which is based on declaration identity; every closure literal in source is effectively a different declaration.
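To illustrate: key path equality already works by declaration today, and "code identity plus context equality" for closures can be approximated by hand. `IdentifiedAction` and its `sourceID` are hypothetical stand-ins; the compiler would presumably derive the identity from the closure literal itself rather than from a string.

```swift
struct Player {
    var name: String
    var nickname: String
}

// Key paths compare equal when they refer to the same declaration.
let sameDeclaration = (\Player.name == \Player.name)
let differentDeclaration = (\Player.name == \Player.nickname)

// Hypothetical model of an equatable closure: a stable identifier for the
// closure literal plus the values it captured.
struct IdentifiedAction<Context: Hashable>: Hashable {
    let sourceID: String          // stands in for "which literal in source"
    let context: Context          // stands in for the captured values
    let body: (Context) -> String

    func callAsFunction() -> String { body(context) }

    // Equality ignores the function value and compares identity plus context.
    static func == (lhs: Self, rhs: Self) -> Bool {
        lhs.sourceID == rhs.sourceID && lhs.context == rhs.context
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(sourceID)
        hasher.combine(context)
    }
}

let greetA = IdentifiedAction(sourceID: "greet", context: "Ada") { "Hello, \($0)" }
let greetB = IdentifiedAction(sourceID: "greet", context: "Ada") { "Hello, \($0)" }
let greetC = IdentifiedAction(sourceID: "greet", context: "Bob") { "Hello, \($0)" }
```

Here `greetA` and `greetB` compare equal because identity and context match, while `greetC` differs only in its captured context.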
