Pitch: Protocol-based Actor Isolation

Dave, the problem with you is that when you explain things clearly, you make me realize that I'm hopelessly confused. 🙂 You're right again, and I'll incorporate this into the proposal, thanks!

-Chris

8 Likes

I think precisely specifying the semantics expected of a manual conformance to ValueSemantic is the most important part of this protocol. Should discussion of that topic be considered in scope for this thread?

2 Likes

I've started a new thread for that.

This includes generic structs, as well as its core collections: for example, Dictionary<Int, String> can be directly shared across actor boundaries.

I was skeptical of this, because it looks like the implementation of CoW within Dictionary uses an unsynchronized check-then-act pattern. I tried to craft a race-condition scenario, but failed to do so:

  1. Actor A has a public let dict: Dictionary<Int, String>, uniquely owns its backing storage (ref_count = 1)
  2. Actor A begins a mutating operation on the dict.
  3. Actor A calls isUniquelyReferenced (somewhere inside of the implementation of the mutating method). A determination is made that isUnique = true.
  4. Actor B accesses a.dict
    1. This increases the backing storage's ref count to 2
  5. All mutation operations B attempts will lead to a copy because the ref count is 2.
  6. Actor A continues down its code-path for in-place modification (since isUnique was true)
  7. Everything works out fine?

I suppose that this lock-less check-then-act is safe, because by the time you have the ability to call isUniquelyReferenced, you already have a newly-retained owning reference which pulls you out of the contended ref_count = 1 scenario.

Hypothetically, this can race in a situation where you have two references to the same backing storage, but a retain count of 1. Obviously this should never happen.

Is this correct?
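
For reference, the check-then-act pattern in question looks roughly like the sketch below. This is only a minimal illustration, not the standard library's actual Dictionary implementation: `Storage` and `CowBox` are hypothetical names, and the only real API used is `isKnownUniquelyReferenced(_:)`. The demo is single-threaded; it just shows that a second owner always observes a non-unique reference and copies.

```swift
// Hypothetical backing storage, standing in for a collection's heap buffer.
final class Storage {
    var items: [Int: String]
    init(items: [Int: String]) { self.items = items }
}

// Hypothetical copy-on-write wrapper around Storage.
struct CowBox {
    private var storage = Storage(items: [:])

    init() {}

    var items: [Int: String] { storage.items }

    mutating func set(_ value: String, for key: Int) {
        // Check: is our reference the only one to this storage?
        if !isKnownUniquelyReferenced(&storage) {
            // Someone else also holds the storage, so copy before writing.
            storage = Storage(items: storage.items)
        }
        // Act: by now the storage is uniquely ours, so mutate in place.
        storage.items[key] = value
    }
}

var a = CowBox()
a.set("one", for: 1)   // unique reference: mutated in place

var b = a              // b retains the same storage (ref count is now 2)
b.set("two", for: 2)   // b sees a shared reference and copies first
```

After this runs, `a` still holds only the first entry, while `b` holds both.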

I don't think such a protocol exists, because the properties we associate with "value semantics" generally arise from operations, not the types themselves. But I don't think that such a protocol is necessary for the subject at hand: if we're talking about inter-actor communication, there is a specific operation whose properties we're interested in, sending a value between actors. The question is similar to Codable in that it comes down to how much data comes along for the ride; for what we think of as "value types", it's just the value itself, but classes can also conform to Codable and serialize related object graphs, so that you can create a fresh orthogonal object graph somewhere else.

1 Like

It's fine with me if we use different terminology, but as Chris pointed out a refining protocol with semantics that support a trivial default implementation of unsafeSendToActor is worthwhile. I posted more detailed thoughts in the ValueSemantic thread.

1 Like

One way to think about it might be that a "value type" is one where Codable/Sendable is a pure operation.

That’s exactly right, yes. That is why checking for a unique reference doesn’t have to be part of some complicated atomic scheme: to get a race, you either have to have two threads mutating the same reference (and thus a higher-level race that you can’t “fix”) or two different references (in which case at least one of them should observe a reference count greater than one).

The lack of an atomic sequence does mean that it’s unreliable whether any thread observes a unique reference. In an ideal world, you’d like one thread to see a unique reference and just modify the object in-place. But to do that, you’d need these checks to be much more expensive, and you’d have to serialize all make-unique operations; it’s not the right trade-off to make.

1 Like

One thing I'd like to dig deeper on is the signature of unsafeSendToActor. The way I see it, sending a value from one actor to another is not exactly the same thing as sending it to a different thread. Indeed, if we do the right thing from a performance perspective, many or most "sends" should end up scheduled on the same thread. The way this protocol is designed, we would have to be defensive and treat any value sent across actor contexts as though it will be accessed from a different thread, and I'm not sure whether this is something the compiler can optimize away.

I'm not sure what a good solution to this problem is, but something like unsafeSendToActor(in context: Context) which would allow types to make a decision about how defensive to be may be preferable. I haven't yet come up with a concrete idea of what exactly Context is and how it would be used.

3 Likes

In Adoption of ActorSendable by Standard Library Types, the following example is given:

extension Array : ActorSendable where Element : ActorSendable {
  func unsafeSendToActor() -> Self { self }
}

Is this correct? Isn't reusing the complete array only safe when its elements have value semantics?
For ActorSendable types with a custom unsafeSendToActor() implementation, we'd have to call that for every element, right?

extension Array : ValueSemantic where Element : ValueSemantic {}
extension Array : ActorSendable where Element : ActorSendable {
  func unsafeSendToActor() -> Self { self.map { $0.unsafeSendToActor() } }
}
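
To make the difference concrete, here is a small sketch. `ActorSendable` is declared locally as a stand-in for the pitched protocol (it is not in the standard library), and `Counter` is a hypothetical reference type whose send is assumed to be a deep copy.

```swift
// Local stand-in for the pitched protocol; not a standard library type.
protocol ActorSendable {
    func unsafeSendToActor() -> Self
}

// Hypothetical reference type that is sent by making a fresh copy.
final class Counter: ActorSendable {
    var count: Int
    init(count: Int) { self.count = count }

    // A final class may satisfy the Self requirement with its own type.
    func unsafeSendToActor() -> Counter { Counter(count: count) }
}

// The array forwards the send to every element instead of returning self.
extension Array: ActorSendable where Element: ActorSendable {
    func unsafeSendToActor() -> [Element] {
        map { $0.unsafeSendToActor() }
    }
}

let original = [Counter(count: 1), Counter(count: 2)]
let sent = original.unsafeSendToActor()

// Mutating the sent copy must not be visible through the original array.
sent[0].count = 99
```

With `{ self }` as the implementation, `sent[0]` and `original[0]` would be the same object, and the write above would leak back to the sender.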
1 Like

Hi George,

I'm sorry, but I don't know what you're implying here. Actors and "threads" are a related-but-different abstraction. The way this works is that the "send" is just part of the caller side responsibility. I can make this more clear in the writing.

This is a very good catch, and you're absolutely right. I will update the proposal, thank you for pointing this out!

-Chris

1 Like

I think @George’s point is that if two actors are bound to the same serial queue, there might be no need for defensive copying. Knowing about this somehow in unsafeSendToActor would allow it to skip the copy.

I think what he's getting at (correct me if I'm wrong, George) is that potentially lots of "value sends" will occur between two actors with a shared synchronization context, e.g. on the same thread. In such a context, it might be possible to safely elide the copying operations performed by a send.

I don’t know what component should be responsible for making that decision, whether it be the actor, the executor, statically decided by the compiler, or some other component.

Imagine if each send was associated with a Context, which knows about the source and destination actor. If they’re executed in different synchronization contexts (e.g. different threads), then the full “send” operation will be done, whatever that entails (potentially an expensive copy). If they’re the same, the send could be replaced with an alternative code path which does a cheap passing of the value.
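
As a rough sketch of that idea (purely hypothetical, nothing here is part of the pitch): `SendContext`, its executor IDs, and `MutableBuffer` are all invented names, and "same executor" is assumed to be knowable at the point of the send.

```swift
// Hypothetical description of a single send: where it starts and where it lands.
struct SendContext {
    let sourceExecutorID: Int
    let destinationExecutorID: Int

    // Assumption: equal IDs mean both actors share one synchronization context.
    var sharesExecutor: Bool { sourceExecutorID == destinationExecutorID }
}

// Hypothetical reference type that normally needs a defensive copy when sent.
final class MutableBuffer {
    var bytes: [UInt8]
    init(bytes: [UInt8]) { self.bytes = bytes }

    func unsafeSendToActor(in context: SendContext) -> MutableBuffer {
        if context.sharesExecutor {
            // Cheap path: no concurrent access is possible, pass the object through.
            return self
        }
        // Full send: hand the destination its own independent copy.
        return MutableBuffer(bytes: bytes)
    }
}

let buffer = MutableBuffer(bytes: [1, 2, 3])
let sameQueue = buffer.unsafeSendToActor(
    in: SendContext(sourceExecutorID: 1, destinationExecutorID: 1))
let otherQueue = buffer.unsafeSendToActor(
    in: SendContext(sourceExecutorID: 1, destinationExecutorID: 2))
```

Note that the cheap path leaves both actors holding the same mutable object, so it is only sound if they are guaranteed to stay on the same executor for as long as either of them uses it.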

It depends on the definition of “pure”. If NSMutableString’s deep copying implementation was considered “pure” (because heap allocation is allowed in a “pure” operation) then this definition would not be robust enough. And if that wasn’t allowed then operations like Array.append which may need to allocate a new buffer would not be considered “pure”. So I suspect this definition isn’t robust enough to capture the notion most people have in mind when thinking of what a “value semantic type” is.

NSMutableString's implementation isn't pure, because it accesses shared mutable state. Array.append is pure, because it only modifies its value in-place, and the memory allocation inside Array's implementation is mostly behind the value type abstraction. By "pure" I'm talking about an operation-level notion of value semantics; it sounds like the concept we're reaching for here is that concept applied specifically to the actor send operation.

Ahh, of course. The deep copy is read-only access but it still reads shared mutable state. Maybe this is a reasonable way to formalize the notion within Swift. If so, we would want to be able to refine the requirement like this (using strawman syntax for purity):

protocol ValueSemantic: ActorSendable {
   func unsafeSendToActor() pure -> Self
}
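
Since `pure` is strawman syntax and doesn't compile today, here is a sketch of the same refinement in current Swift, with the trivial default implementation mentioned earlier in the thread. Both protocols are declared locally as stand-ins for the pitched ones, `Point` is a hypothetical conforming type, and the purity requirement itself can only be stated as a comment.

```swift
// Local stand-in for the pitched protocol; not a standard library type.
protocol ActorSendable {
    func unsafeSendToActor() -> Self
}

// Refinement: conforming types promise (by convention only, nothing checks it)
// that sending never touches shared mutable state.
protocol ValueSemantic: ActorSendable {}

extension ValueSemantic {
    // Trivial default: a value-semantic type can simply hand over a copy of itself.
    func unsafeSendToActor() -> Self { self }
}

// Hypothetical value type; it gets unsafeSendToActor() for free.
struct Point: ValueSemantic, Equatable {
    var x: Int
    var y: Int
}

let p = Point(x: 1, y: 2)
var q = p.unsafeSendToActor()
q.x = 10   // mutating the sent value leaves the original untouched
```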
3 Likes

I recently had a need for a custom copy constructor and destructor for a struct, and by happy coincidence, it seems to me that ActorSendable really is just all about allowing a type to decide how it is copied to memory isolated by a different actor. Maybe this could be part of a more general API which gives us more control over what happens when values/objects are copied and destroyed. It's worth noting that C++ interop will bring structs with user-provided copy constructors and destructors anyway.

As for ValueSemantics - I'm broadly in favour, depending on exactly how it is defined. Most generic code just assumes that everything has value semantics, and the community has been asking for some kind of generic constraint to formalise that literally for years. The idea that Swift allows both value- and object-orientated programming is something of a lie in that respect; lots of "generic" code is trivial to break once classes get thrown into the mix.
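
A small illustration of the kind of breakage I mean. All names here (`Resettable`, `snapshotThenReset`, and the two counter types) are made up for the example.

```swift
// Hypothetical protocol that both a struct and a class can conform to.
protocol Resettable {
    var value: Int { get }
    mutating func reset()
}

// Generic code that silently assumes `let snapshot = item` is an independent copy.
func snapshotThenReset<T: Resettable>(_ item: inout T) -> T {
    let snapshot = item
    item.reset()
    return snapshot
}

struct StructCounter: Resettable {
    var value: Int
    mutating func reset() { value = 0 }
}

final class ClassCounter: Resettable {
    var value: Int
    init(value: Int) { self.value = value }
    func reset() { value = 0 }
}

var s = StructCounter(value: 5)
let structSnapshot = snapshotThenReset(&s)   // independent copy, keeps the old value

var c = ClassCounter(value: 5)
let classSnapshot = snapshotThenReset(&c)    // same object as c, so it was reset too
```

The struct snapshot still reads 5, but the class "snapshot" reads 0, even though the generic function is identical in both cases.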

3 Likes

I think that ActorSendable is still something we want to be separable from value copying and destroying, since it is useful to be able to send object graphs without forcing those objects to be value types, which isn't always convenient or possible today.

A protocol can't fix that, because the properties of value semantics are not inherent to types, but to operations. A ValueSemantic protocol would only give a false sense of security. Value semantics needs to be modeled at the function level to get the properties people want when they use value types.

4 Likes

Ok, I think I see what you mean.

I think this discussion is a much larger issue though - you need a "value send" to happen between any two domains you need isolated from each other. Two actors running on the same thread now doesn't mean they will be on the same thread later - this abstraction over the kernel thread interface is a key aspect of actor designs.

If we were to do something like this, it would be something like an "actor group", where some group of actors shared the same queue under the covers. Such a thing is possible to do, but would vastly complicate the design, and I'm not sure what good it would provide. We can already have one actor have multiple objects within its domain, so why not just use an actor as the 'actor group'?

Agreed.

I don't follow what you mean here - what does it mean for value semantics to be modeled at the function level?

-Chris

1 Like

I think part of what's muddying the water is that I (and a few others I've seen) haven't gotten a clear sense of the scale of an actor. That would inform how often actor-to-actor messaging happens.

Someone asked this (but I can't find it now): how many actors do we expect in a typical system? What kinds of things do they model?

Are we talking many thousands of actors like on an Erlang/Elixir system, or just a few, modelling key concurrent business processes?

E.g., is there one actor per potentially-contested variable? Is there one actor for every player connected to a multiplayer game server? One actor per sharable game entity?

1 Like