Pitch: Protocol-based Actor Isolation

I thought about this some more, and while having some indication that a value is "safe" to send across actor contexts is desirable, I'm not convinced there is a single solution that will work for the great heterogeneity of actors I expect for Swift.

One only-slightly-contrived example that comes to mind is actor A sending a reference to actor B that is unsafe for actor B to use except to provide that reference back to actor A. Would this reference or the type containing it be ActorSendable? Additionally, some actors will have stricter requirements as to what kind of values can be sent to them. For instance, an actor meant to run on a TPU may only accept values representable in that context. All this leads me to believe that it is the role of the actor to make sure the arguments and return values in its API can safely be sent to or from its actor context. An ActorSendable protocol may thus create a false sense of security, leading to confusion.

That said, there may be many patterns which occur quite frequently in practice that require some operation to be performed on all values being sent to or from an actor context. We may be able to do something clever here to minimize the amount of boilerplate. A simple solution that comes to mind is to create a "function wrapper" type with which actors can process arguments and return values, either across the entire actor or just for particular functions. You could imagine such wrappers being created that round-trip through Codable, or perform deep copies of Objective-C data structures.
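As a rough illustration of the "function wrapper" idea, here is a minimal sketch that round-trips both the argument and the result through Codable. The names `roundTripped`, `CodableBoundary`, and `Job` are hypothetical and do not come from any pitch; JSON is just one possible encoding.

```swift
import Foundation

// Hypothetical helper: copies a value by encoding it and decoding the result,
// so the receiver never shares storage with the sender.
func roundTripped<Value: Codable>(_ value: Value) throws -> Value {
    let data = try JSONEncoder().encode(value)
    return try JSONDecoder().decode(Value.self, from: data)
}

// Hypothetical "function wrapper": applies the round trip to the argument
// and to the result of the wrapped function.
struct CodableBoundary<Input: Codable, Output: Codable> {
    let body: (Input) throws -> Output

    func callAsFunction(_ input: Input) throws -> Output {
        let isolatedInput = try roundTripped(input)
        let output = try body(isolatedInput)
        return try roundTripped(output)
    }
}

struct Job: Codable, Equatable {
    var name: String
    var retries: Int
}

let bumpRetries = CodableBoundary<Job, Job> { job in
    var copy = job
    copy.retries += 1
    return copy
}
let result = try bumpRetries(Job(name: "sync", retries: 0))
```

An actor could apply such a wrapper to every entry point or only to selected ones, at the cost of an encode/decode pass per call.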

In summary, for me it feels like the goal of defining what it means to be "safe to send across actor contexts in general" is intractable, but there are things we can do to make the creation of specific classes of actors easier and less error prone.

A principled approach to this would be to wrap the reference in a struct, where the wrapped value is not accessible to actor B. I believe this would be ActorSendable and meet most of the various proposed definitions of ValueSemantic. This encapsulation also frees B from knowing about the representation of the token, and should have zero runtime overhead.
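A minimal sketch of that token pattern follows. Plain structs stand in for the two actors so the example stays synchronous, and all names (`Session`, `SessionToken`, `SessionOwner`, `Relay`) are hypothetical. It assumes the receiving side lives in a different file, because `fileprivate` only hides the reference from other files.

```swift
// Reference type that only its owner (standing in for actor A) may touch.
final class Session {
    var requestCount = 0
}

// A struct with a single stored reference, so wrapping adds no storage.
// `fileprivate` hides the reference from code in other files; this sketch
// assumes the receiving side (actor B) is defined elsewhere.
struct SessionToken {
    fileprivate let session: Session
}

// Stands in for actor A: the only code that unwraps the token.
struct SessionOwner {
    func makeToken() -> SessionToken {
        SessionToken(session: Session())
    }

    func recordRequest(for token: SessionToken) -> Int {
        token.session.requestCount += 1
        return token.session.requestCount
    }
}

// Stands in for actor B: it can hold and return the token but never looks inside.
struct Relay {
    func passThrough(_ token: SessionToken) -> SessionToken { token }
}

let owner = SessionOwner()
let token = Relay().passThrough(owner.makeToken())
let count = owner.recordRequest(for: token)
```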

If you expect this pattern to be common you might consider that too much ceremony, but I don’t think it’s obviously onerous enough to make ActorSendable not worth it.

I don’t see how this is a problem at all; actor methods will still be typed.

Thanks for the feedback, a couple of random updates and comments:

Just to clarify the discussion about "actor groups" above, I am not in favor of introducing another abstraction here. I think that actors by themselves are enough.

I still think it makes sense for ActorSendable to be distinct from Codable, because certain things are actor sendable but may not be codable, e.g. an internally synchronized reference type. However, it could make sense to make ActorSendable, Codable, and ValueSemantic refine each other. I think it would be best to discuss this in the ValueSemantic thread.


Also, I forgot to mention how closures / function types fit into this. They are a bit weird because they aren't nominal types, but we should be able to support them. I added a section to future work to describe the approach I recommend. I welcome comments and thoughts as always.

-Chris

For what it's worth, I have wanted something like your design for Equatable and Hashable in the past as well. If we're going to start blessing function types with protocol conformances based on their captures, it would be great if this approach could be extended to cover those use cases (even if as a future enhancement).

Equality and hashability are a bit trickier for function types, because you have to decide what "equality" means for code. Is it pointer equality of the code address? What about thunks? What about duplicated code, or merged code that happens to be identical? This is one of the reasons that we didn't want closures to be comparable in general.

-Chris

I almost commented on that. The behavior I have always wanted is identity based on source code. Thunks should be transparent to the programming model. If a programmer happens to copy code and have two source-identical functions I would not expect those to compare equal despite having identical behavior.

There was some previous discussion of this in the thread about closures as literals for anonymous structs. For things like SwiftUI, we'd want some closures to be equatable/hashable by code identity plus the equality of their contexts. This is consistent with how key path equality works too, which is based on declaration identity; every closure literal in source is effectively a different declaration.
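To make "code identity plus the equality of their contexts" concrete, here is a sketch that writes a closure out by hand as the struct it would correspond to. The type stands in for the declaration identity and the stored property for the captured context. `AddStep` and `AddStepCopy` are hypothetical names, and nothing here is an existing language feature for closures.

```swift
// The closure literal `{ x in x + step }` written out as a struct.
// The type plays the role of "code identity"; `step` is the captured context.
struct AddStep: Hashable {
    let step: Int
    func callAsFunction(_ x: Int) -> Int { x + step }
}

// Same body, different declaration: a distinct type, so its values cannot
// even be compared with AddStep values using ==.
struct AddStepCopy: Hashable {
    let step: Int
    func callAsFunction(_ x: Int) -> Int { x + step }
}

let a = AddStep(step: 2)
let b = AddStep(step: 2)
let c = AddStep(step: 3)

// Same declaration and equal captures compare equal;
// same declaration with different captures does not.
let sameCodeSameContext = (a == b)
let sameCodeOtherContext = (a == c)
```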

Cool. While there is some relation, I think that equatability and hashability of functions is a bit of a tangent from this thread, because the issues there don't impact the ActorSendability of closures. That is a property strictly defined on the composition of the captured elements, so thunks and other things don't affect it.

To be clear, aside from the identity issue, I'd expect other derived conformances for closures to work elementwise on the captured elements as well.

Thanks for writing this up. This is certainly an important question:

Neither the actor proposal pitched nor your proposal actually "ensure" this level of safety. The actors pitch doesn't help at all with reference types causing shared mutable state. Your pitch adds a small barrier at the type level by requiring one to conform to a protocol and promise that the type behaves nicely for actors, but this promise is unchecked and easy to get wrong.

The tl;dr here is that I feel that conformance to ActorSendable is an anti-pattern for most reference types, and so we should not use a fundamentally unsafe protocol as the marker that enables cross-actor transfer of these types.

In the long term, I think we can do better than either pitch, making it easier to work with reference types in a way that doesn't violate actor isolation, and provide proper checking. That should be our goal (we've been calling that "phase 2"), even if it's not attainable with the first introduction of actors ("phase 1"). I think the right set of goals for phase 1 should be, e.g.:

  1. Provide a reasonable subset of types that can be used with actors to maintain actor isolation,
  2. Provide checking to notify the user when they have stepped outside of this safe subset, along with an "unsafe explicit" way to disable the checking for specific cases where the developer knows best,
  3. Minimize the amount of change to existing Swift code that isn't using actors, and
  4. Minimize the amount of disruption when moving a code base from phase 1 to phase 2.

The actors pitch fails #2, because it doesn't provide any checking for them. And it then fails #4, because "phase 2" will start complaining about code that "phase 1" allowed you to write. For me, that's the main failing of the actor pitch as it stands today: it allows you to write new actor code that is silently unsafe and we will later start rejecting.

Your proposal with ActorSendable addresses #2 by requiring types to conform to ActorSendable to be used from another actor. This works well for types that already have useful semantics for actor isolation: types that provide value semantics (discussed in depth in the ValueSemantic protocol thread), are immutable, or are internally synchronized, for example. Of course, those types will behave correctly regardless of your isolation model; the benefit of ActorSendable is in telling you that you've stepped out of that safe subset.

For types that don't have those useful semantics, the ActorSendable design encourages us to do something with them, or be excluded from the model entirely. Some subset of those types can be "deep copied" to give the illusion of value semantics:

class MyValueSemanticClassType : ActorSendable {
  func unsafeSendToActor() -> Self { self.clone() }
}

This means that passing any MyValueSemanticClassType instance across actors will involve a clone() call, which cannot be reasoned about or optimized away. However, this approximation of value semantics only occurs at actor boundaries: you can't rely on it uniformly in non-actor code, and with self. sometimes being within an actor and sometimes being cross-actor, you can't even rely on it within actor code without a deep understanding of the model.

A better answer in Swift would be to wrap up such types in a struct or enum that does copy-on-write, which is better for performance (fewer deep copies), understandability (we expect most structs/enums to have value semantics), and consistent behavior in actor vs. non-actor code (the wrapper always has value semantics).
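A minimal sketch of such a copy-on-write wrapper, using the standard library's `isKnownUniquelyReferenced`. The `GraphStorage` and `Graph` types are hypothetical examples, not part of either pitch.

```swift
// Reference-semantic storage that should never be exposed directly.
final class GraphStorage {
    var edges: [Int: [Int]]
    init(edges: [Int: [Int]] = [:]) { self.edges = edges }
    func clone() -> GraphStorage { GraphStorage(edges: edges) }
}

// Value-semantic wrapper: copies share storage until one of them mutates.
struct Graph {
    private var storage = GraphStorage()

    init() {}

    func neighbors(of node: Int) -> [Int] {
        storage.edges[node] ?? []
    }

    mutating func addEdge(from source: Int, to target: Int) {
        // Clone only when another copy of this struct still shares the storage.
        if !isKnownUniquelyReferenced(&storage) {
            storage = storage.clone()
        }
        storage.edges[source, default: []].append(target)
    }
}

var original = Graph()
original.addEdge(from: 1, to: 2)
var copy = original           // no deep copy yet
copy.addEdge(from: 1, to: 3)  // storage is cloned here, on first mutation
```

The deep copy happens at most once per mutation of a shared value, rather than on every transfer.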

The various categories of reference types discussed above, all together, probably don't comprise the majority of reference types. I suspect that most reference types fall into a category not really discussed: reference types that won't by themselves ever escape whatever actor they're in, but can form arbitrarily-interesting object graphs to represent (say) the data model of your program. Generally speaking, you don't want to share these across actors, but once in a while you might need to do something unsafe. This proposal suggests that you do so by conforming to the ActorSendable protocol:

class MyGraphNode : ActorSendable {
  func unsafeSendToActor() -> Self { self }
}

... but this is a trap. You've now disabled all of the checking benefits of ActorSendable (goal #2 above) and left yourself with something that will be hard to detangle when we get to phase 2 (goal #4 above).

What this suggests, to me, is that outside of the "easy" cases (value types that can be copy-on-write, immutable classes, internally-synchronized classes), declaring conformance to ActorSendable is an anti-pattern. It's the easy but wrong way to silence checking of cross-actor calls, and we'd be better off with annotations at specific entry points: rather than calling the MyGraphNode type always safe to use with actors, require any function that can be cross-actor and wants to traffic in MyGraphNode to explicitly mark the corresponding parameter as "unsafe".

actor class MyActor {
  func f(@UnsafelyShared _: MyGraphNode) { ... }
}

This puts the burden on the definition to be careful, and it gives us something searchable when "phase 2" comes along and makes most of these unsafely-shared values unnecessary. Whether @UnsafelyShared is just @actorIndependent(unsafe) from the actor pitch or some kind of property wrapper on a parameter, I don't yet know. Result types would need similar treatment.
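One way the property-wrapper variant could look is sketched below, assuming Swift 5.5 or later for property wrappers on parameters. The wrapper does no work and enables no checking by itself; it only marks the opt-out at the declaration. `UnsafelyShared` is defined locally as an assumption, and a plain struct stands in for the actor to keep the sketch synchronous.

```swift
// Hypothetical marker wrapper: it does no work, it only makes the opt-out
// visible (and searchable) at the declaration.
@propertyWrapper
struct UnsafelyShared<Value> {
    var wrappedValue: Value
    init(wrappedValue: Value) { self.wrappedValue = wrappedValue }
}

final class MyGraphNode {
    var children: [MyGraphNode] = []
}

// Stands in for an actor; a plain struct keeps the sketch synchronous.
struct GraphInspector {
    func countNodes(@UnsafelyShared _ root: MyGraphNode) -> Int {
        1 + root.children.reduce(0) { $0 + countNodes($1) }
    }
}

let root = MyGraphNode()
root.children = [MyGraphNode(), MyGraphNode()]
let nodeCount = GraphInspector().countNodes(root)
```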

To turn this into something concrete: an approach to "phase 1" would allow only value types (structs and enums) to be used cross-actor, with some way of annotating parameters and results that are "unsafely shared" to disable the checking for those declarations that need to opt out.

If the value semantics discussion were to provide a better definition of "value types", we could use that. But we don't have to, because "structs and enums" is already the proxy concept that the language depends on for value semantics, so we wouldn't be making it worse.

Doug

Should it be a ValueSemantic protocol, or a value keyword, like class currently is when used as a protocol constraint:

protocol HasToHaveReferenceSemantics: class {...}
protocol HasToHaveValueSemantics: value {...}

vs

protocol HasToHaveReferenceSemantics: class {...}
protocol HasToHaveValueSemantics: ValueSemantic {...}

class used in that way is sort of intended for future deprecation (maybe). See
https://github.com/apple/swift-evolution/blob/main/proposals/0156-subclass-existentials.md#class-and-anyobject
This being said, that should probably have happened for Swift 5.
Maybe it can still happen for Swift 6?

David: It’s an interesting idea, and more or less what this thread is grappling with (especially if : class becomes deprecated in favor of : AnyObject).

I don't think the way you're positioning this is productive at all, because it is not something that can be achieved in practice. Swift is "safe" (loose air quotes) because it uses a set of conventions and defaults that lead to the reduction of bugs in practice. It does not define away all bugs, and certainly doesn't prevent all memory safety issues -- even within a thread.

As I mentioned in the ActorSendable writeup, it is very common for Swift to use unsafe constructs (e.g. UnsafePointer) to build safe abstractions (e.g. Array).
This is exactly the approach taken by the ActorSendable proposal, and I think it does achieve what we consider to be "safety" according to the usual interpretation Swift uses.

This is an important point so I'll re-underscore the problem with your argument. If we applied it equally to all of Swift, we would not allow the use of unsafe pointers because they are "unchecked and easy to get wrong". In practice, we accept that risk because library authors can build safe abstractions out of unsafe components, and those safe components can be easy to use and lead to raising the abstraction level. ActorSendable is exactly the same thing. Similar fears of misuse came up in the discussion of SE-0195 and many other proposals as well.

Also, FWIW, I have seen no alternate proposals that are compelling. The ActorSendable pitch has an assessment of what will happen if actors ship without it. Your comments seem to indicate that you agree with the problem, am I mistaken about that? Do you disagree with anything in that section?

I find this characterization to be extremely concerning. The anti-pattern here is to use reference types in a concurrent system. Implementing ActorSendable for a reference type requires implementing an explicitly marked "unsafe" conformance. This is very consistent with Swift's general approach to these sorts of things: unsafe features are made available for power users, but we are careful to make it explicit that they are unsafe, and difficult to accidentally stumble upon (something we haven't discussed, but is easy to do with this feature). I don't think it is fair to say that any of the unsafe features in Swift (pointers, atomics, ...) are "easy to use" or recommended for novice programmers.

I agree that it is possible to extend Swift and introduce new type system features that capture more of the interesting design space in a memory-safe way. Such a move in the future will compose cleanly on top of this, and can roll out over time. Are you suggesting that we block actors until those features are available?

Yes, I completely agree with this assessment. :100: I consider this to be a showstopper for the actors proposal that needs to be addressed.

I agree with what you're writing here, but I'd suggest a different interpretation. When you contrast the ActorSendable pitch to "actors without ActorSendable", the value isn't what it allows -- it is what it does not allow. The main point of ActorSendable is that users will get a /compile time error/ when they try to pass unsafe reference types across actor boundaries. This is a really important feature that should be included in Actors 1.0, because otherwise users will write tons of bad code and Actors will exacerbate race conditions. It exacerbates them by introducing a new programming model that encourages more concurrent code, but without providing any of the guard rails or safety that people deserve with Actors.

ActorSendable is just an implementation approach that provides those guardrails. Again, it is safe per the usual Swift interpretation of that word.

Yes, emphasis on the latter half of that sentence. Both approaches enable the safe things - the point of ActorSendable is to reject the unsafe ones.

Sure, the concern about the cost of implicit deep copies is an important one, and (just to clarify) there is nothing in the proposal that makes deep copies implicit. The proposal observes that this is a relevant part of the design space that may be appropriate for some types. It allows the author of the type to make the assessment of whether an implicit deep copy is an appropriate thing to do or not. This can vary a lot across types, and I don't think the language needs to make a by-fiat decision for all reference types.

Since you prefer explicitness, I'll point out that there is a good design pattern that works naturally with ActorSendable that achieves your goal of explicitness. For example, instead of making NSMutableDictionary conform to ActorSendable, it would be reasonable to use a generic struct wrapper to make it an explicit part of the API so clients know there is a cost to it:

  func doThing(param: NSDeepCopy<NSMutableDictionary>) async

As mentioned in the proposal, this composes well with an active extension pitch about property wrappers, which would allow a very nice API of:

  func doThing(@NSDeepCopy param: NSMutableDictionary) async

Both approaches work with the ActorSendable design, but these are not the topic of the ActorSendable proposal itself, these should be explored as a follow-on.
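A minimal sketch of the generic struct form, with a locally defined `DeepCopyable` protocol and a small `Settings` class standing in for NSMutableDictionary. All of these names are hypothetical, and the real NSDeepCopy would presumably rely on Foundation's copying machinery instead.

```swift
// Hypothetical protocol for reference types that know how to copy their whole graph.
protocol DeepCopyable: AnyObject {
    func deepCopy() -> Self
}

// Hypothetical stand-in for NSDeepCopy: the copy happens once, when the wrapper is built.
struct DeepCopied<Wrapped: DeepCopyable> {
    let value: Wrapped
    init(_ original: Wrapped) { value = original.deepCopy() }
}

final class Settings: DeepCopyable {
    var entries: [String: String]
    init(entries: [String: String]) { self.entries = entries }
    func deepCopy() -> Settings { Settings(entries: entries) }
}

// The signature itself tells callers that a copy is made.
func apply(_ settings: DeepCopied<Settings>) -> Int {
    settings.value.entries["applied"] = "yes"
    return settings.value.entries.count
}

let shared = Settings(entries: ["theme": "dark"])
let appliedCount = apply(DeepCopied(shared))
```

The callee mutates only its private copy, so `shared` keeps its single entry.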

Sure, I'm a big fan of value types, but we also have to embrace the real world - even when it is unfortunate. As you know, your suggested way of handling this is also compatible with ActorSendable. The proposal explains that ActorSendable is good both for the "things we like" in Swift as well as the "things we regret but have to work with" in the ecosystem.

No, the proposal does not suggest that (if it does, tell me where and I'll fix it), and I agree with you that this is not what you want.

The right thing to do here is to standardize a generic UnsafeTransfer struct as mentioned in the proposal.
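For illustration, such a wrapper might look roughly like this. The `ActorSendable` protocol is declared locally with the shape used earlier in the thread, and `UnsafeTransfer`, `send`, and `MyGraphNode` are assumptions written for this sketch, not the proposal's actual definitions.

```swift
// Assumed shape of the pitched protocol, declared locally so the sketch compiles.
protocol ActorSendable {
    func unsafeSendToActor() -> Self
}

// Hypothetical generic wrapper: hands the wrapped value over unchanged,
// with the "unsafe" in its name visible at every use.
struct UnsafeTransfer<Wrapped>: ActorSendable {
    var wrapped: Wrapped
    init(_ wrapped: Wrapped) { self.wrapped = wrapped }
    func unsafeSendToActor() -> UnsafeTransfer<Wrapped> { self }
}

final class MyGraphNode {
    var label = "root"
}

// Stands in for a cross-actor call that only accepts ActorSendable values.
func send<Value: ActorSendable>(_ value: Value) -> Value {
    value.unsafeSendToActor()
}

let node = MyGraphNode()
let received = send(UnsafeTransfer(node))
// send(node) would be rejected at compile time:
// MyGraphNode itself does not conform to ActorSendable.
```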

As we've been discussing, this allows the programmer to know that they just implemented an unsafe API. This is something we discourage (through Unsafe naming) but allow because we want people to be able to "get stuff done".

I think you're missing the most important part of the proposal: it just works for the most important and common cases that Swift encourages, namely compositions of value semantic types: structs, tuples, enums. We want friction with reference types, because they are inherently dangerous in a concurrent system. We do not want friction using value semantic types and compositions of them. This is what the proposal achieves.

The problem with this is that it does not allow advanced library authors to provide high quality and usable APIs. It turns out that some shared reference semantic data structures really are the right thing to use for various domains (e.g. immutable classes, the reference semantic concurrent hash table, etc). Properly constructed, there is absolutely nothing unsafe about transferring references to them across actors, so requiring an attribute on the callsite would be really bad for Swift expressivity.

One of the major important points about Swift design philosophy is that we allow library authors to deploy "high technology" and unsafe features in the internal construction of APIs that provide a safe and easy to use API. Array is one example of this, ConcurrentHashTable is just another one. We shouldn't penalize classes and other reference semantic types here, we should empower the type author to decide the right thing for their types.

If you have a concrete approach to consider, then please write it up so we can evaluate it. One request is that we need a way to allow that value types are implemented in terms of reference types, and I don't think it is acceptable to put complexity into the use-site for this. Array is one example that clearly has to be in-model here. I don't see how ConcurrentHashTable is any different.

-Chris

You don't use unsafe pointers everywhere, every day: Array is a carefully designed and tested component that hides the unsafety behind a single abstraction. Conformance to ActorSendable is going to be far more common, less tested, and more dangerous.

It's far more wide-reaching than most of these unsafe features. One explicit annotation and the entire type (and everything built from it) is now easy to use unsafely, everywhere.

I assume that last question is rhetorical, but in case it isn't: no, we should not block actors until we have those type system features, but neither should we implement a solution that makes it harder to introduce those features at a later point.

This is a very narrow distinction. The proposal introduces implicit calls to unsafeSendToActor for cross-actor references. Some implementations will do these deep copies (because that's the only way to get into the actor model)---and it's up to you to trace through the scattered unsafeSendToActor implementations to see where it happens.

Sure, the explicitness here is good. Here's one way to think of my suggestion: allow the above, but take away ActorSendable and its ad hoc unsafeSendToActor.

The proposal calls this "unwise", then proceeds to give it a standard library helper protocol (UnsafeActorSendableBySelfCopy).

Yes, we agree on a generic UnsafeTransfer wrapper struct as a reasonable solution.

Despite your repeated insistence to the contrary, I understand what you're trying to do with the proposal. We even agree that reference types should have friction. However, having reference types introduce ActorSendable conformances with ad hoc unsafeSendToActor implementations that we cannot reason about puts us in a place where we won't be able to improve the model later---we'll be locked into all of these implicit unsafeSendToActor calls everywhere.

Sure. The concurrency roadmap talks about these things as well, and we could apply notions that exist in the proposal I posted (e.g., @actorIndependent) as indicators that certain types are safe to transfer---without injecting implicit calls at actor boundaries.

Doug

@Douglas_Gregor: If I understand you correctly, then, your objection to the ActorSendable design isn't that any one particular type would have access to an implicit unsafeSendToActor, but that the implication here is that every type that participates in the actor model would (by implicit or explicit conformance) have this unsafeSendToActor that is hard to reason about and optimize away which is implicitly called at actor boundaries?

I think it would be possible to reconcile your two points (allowing advanced library authors to access this unsafe feature, but not having an unsafe feature be pervasive throughout the actor model) by adopting a CustomStringConvertible-like design:

Recall that in Swift every type is "string convertible" without conforming to any protocol and doesn't need to have a description property; however, custom string conversion can be had by conforming to a protocol and performing arbitrary work in the implementation of description.

Would it then answer your objection if the design proposed were that we have an UnsafeCustomSendable protocol instead? By this I mean, all value types could be implicitly sendable without having any implicit call to unsafeSendToActor at actor boundaries. We could provide UnsafeTransfer wrappers, NSDeepCopy wrappers, etc. for users to annotate at the call site, which would also do work at actor boundaries that the compiler can reason about. Then, besides those options and not as their underlying implementation, types (classes or otherwise) could opt into a custom unsafe feature which allows them to be sendable by performing arbitrary work at actor boundaries. This would mean that unsafeSendToActor would not be pervasive in the actor model but rather a limited facility to provide power only where absolutely needed.
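A rough sketch of that opt-in shape, modeled on how string conversion falls back to a default when a type does not adopt CustomStringConvertible. `prepareForSend` is a hypothetical stand-in for whatever happens at an actor boundary, and it uses a dynamic cast purely for illustration; a real design would presumably resolve this statically.

```swift
// Hypothetical opt-in protocol, named after the suggestion above.
protocol UnsafeCustomSendable {
    func unsafeSendToActor() -> Self
}

// Stands in for the work done at an actor boundary: types that did not opt in
// pass through untouched; only opted-in types run custom code.
func prepareForSend<Value>(_ value: Value) -> Value {
    if let custom = value as? UnsafeCustomSendable,
       let sent = custom.unsafeSendToActor() as? Value {
        return sent
    }
    return value
}

// A plain value type: no conformance, no custom work.
struct Point: Equatable {
    var x = 0
    var y = 0
}

// A class that opts in and copies itself when sent.
final class Cache: UnsafeCustomSendable {
    var hits: Int
    init(hits: Int) { self.hits = hits }
    func unsafeSendToActor() -> Cache { Cache(hits: hits) }
}

let point = prepareForSend(Point(x: 1, y: 2))
let cache = Cache(hits: 5)
let sentCache = prepareForSend(cache)
```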

I must be misunderstanding something about your post. You seem to suggest that all value types would be sendable by default. This is most certainly not what we want! As a degenerate example, UnsafeMutablePointer is a struct and should definitely not be sendable. The author should not need to do something to opt-out of sendability when defining such a type (for example, a struct that contains UnsafeMutablePointers).

I must be misunderstanding this conversation then. I had thought that this part was completely settled:

  • Properties of an actor will be protected by the actor.
  • Immutable memory (such as a let constant), local memory (such as a local variable that’s never captured), and value component memory (such as a property of a struct, or an enum case), are already protected from data races.
  • Unsafe memory (such as an arbitrary allocation referenced by an UnsafeMutablePointer) is associated with an unsafe abstraction. It’s actively undesirable to try to force these abstractions to be used safely, because these abstractions are meant to be usable to bypass safe language rules when necessary. Instead, we have to trust the programmer to use these correctly.
  • Global memory (such as a global or static variable) can in principle be accessed by any code anywhere, so is subject to data races.
  • Class component memory can also be accessed from any code that holds a reference to the class. This means that while the reference to the class may be protected by an actor, passing that reference between actors exposes its properties to data races. This also includes references to classes held within value types, when these are passed between actors.

The goal of full actor isolation is to ensure that these last two categorizations [emphasis added] are protected by default.

@Douglas_Gregor It seems like your argument against adopting protocol-based actor isolation hinges on this assertion. Could you elaborate with evidence?

@Chris_Lattner3 It will certainly be painful if/when the "things we regret but have to work with" have a broken ActorSendable conformance. Could you elaborate on how library authors (and users) would debug these sorts of issues? How would they see where the problem lives? Would they need to trace through the "scattered unsafeSendToActor implementations" as Doug suggests?

My fundamental complaint is about the existence of unsafeSendToActor. It's bad for code size to be emitting calls to unsafeSendToActor for every cross-actor call, it's bad for performance to have those implementations go recursively call unsafeSendToActor on everything in the value graph because there might be one non-trivial unsafeSendToActor somewhere in there, and it's bad for understandability because a human can't reason about that pile of recursive calls, either, or guess at where all of the implicit calls to unsafeSendToActor go.
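To make the recursion concrete, here is a small sketch that counts how many times unsafeSendToActor would run for a nested array. The `calls` parameter is instrumentation added for this example only and is not part of the pitched protocol. The conformances are written by hand here and say nothing about what an optimizer could eliminate.

```swift
// Assumed shape of the pitched protocol, plus a counter parameter that exists
// only to make the implicit calls visible in this sketch.
protocol ActorSendable {
    func unsafeSendToActor(calls: inout Int) -> Self
}

extension Int: ActorSendable {
    func unsafeSendToActor(calls: inout Int) -> Int {
        calls += 1
        return self
    }
}

// An array cannot know whether its elements are trivial, so it forwards to each one.
extension Array: ActorSendable where Element: ActorSendable {
    func unsafeSendToActor(calls: inout Int) -> [Element] {
        calls += 1
        var sent: [Element] = []
        for element in self {
            sent.append(element.unsafeSendToActor(calls: &calls))
        }
        return sent
    }
}

var callCount = 0
let grid = [[1, 2, 3], [4, 5, 6]]
let sentGrid = grid.unsafeSendToActor(calls: &callCount)
// callCount is now 9: one call for the outer array, two for the rows,
// and six for the integers.
```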

I'm with you here, up until...

Ah, but generic code needs to be written against the most general thing, which could have non-trivial unsafeSendToActor code. So we could try to limit the code size/performance problems for value types, but that doesn't carry over to a generic T that can be used with reference types.

I don't think we should be performing implicit, arbitrary work at actor boundaries for any type. Let's consider the "good" cases for actor sendable: value types that have value semantics, immutable classes, internally-synchronized concurrent data structures. All of these just need self. No custom unsafeSendToActor.

"Deep copying" a class type would need to do arbitrary work in unsafeSendToActor, but I don't think that's a good solution: it would be better to wrap it up in a copy-on-write value type so we're not doing extraneous copies. That's better overall, and doesn't need any special logic in unsafeSendToActor.
If you truly want to "deep copy" when calling an actor function, it can be handled some other way, e.g., the NSDeepCopy wrapper you mention, not with implicit magic.

What other uses do people have for unsafeSendToActor? The costs of supporting it are (very) high, so I think it should not exist.

That makes ActorSendable, effectively, a marker protocol that means "it's okay to pass this type across actors." That design might be okay, but I think we should contrast it against (say) adding an attribute, which would have no runtime impact and we might already need for closures.

I've been asked to come up with a counter proposal, and haven't had time to do so yet. As a start: eliminate unsafeSendToActor and the implicit calls to it.

Doug
