Pitch: Protocol-based Actor Isolation

Equality and hashing are a bit trickier for function types, because you have to decide what "equality" means for code. Is it pointer equality of the code address? What about thunks? What about duplicated code, or merged code that happens to be identical? This is one of the reasons that we didn't want closures to be comparable in general.

-Chris

2 Likes

I almost commented on that. The behavior I have always wanted is identity based on source code. Thunks should be transparent to the programming model. If a programmer happens to copy code and have two source-identical functions I would not expect those to compare equal despite having identical behavior.

1 Like

There was some previous discussion of this in the thread about closures as literals for anonymous structs. For things like SwiftUI, we'd want some closures to be equatable/hashable by code identity plus the equality of their contexts. This is consistent with how key path equality works too, which is based on declaration identity; every closure literal in source is effectively a different declaration.
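To make the key path comparison concrete, here is a small sketch. The key path part uses only standard library behavior; `IdentifiedAction`, `site`, and `context` are hypothetical names invented for illustration, standing in for "code identity plus the equality of the context", and are not an existing or proposed API.

```swift
struct Person {
    var name: String
    var age: Int
}

// Key path literals that name the same declaration compare equal.
let first: PartialKeyPath<Person> = \Person.name
let second: PartialKeyPath<Person> = \Person.name
let other: PartialKeyPath<Person> = \Person.age

print(first == second)   // true: same declaration
print(first == other)    // false: different declarations

// Hypothetical model of closure equality: `site` stands in for the identity
// of the closure literal in source, `context` for its captured values.
struct IdentifiedAction<Context: Hashable>: Hashable {
    let site: String
    let context: Context
}

let a = IdentifiedAction(site: "Button.swift:12", context: 1)
let b = IdentifiedAction(site: "Button.swift:12", context: 1)
let c = IdentifiedAction(site: "Button.swift:40", context: 1)
print(a == b)   // true: same literal, equal context
print(a == c)   // false: a source-identical copy elsewhere is a different literal
```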

1 Like

Cool. While there is some relation, I think that equatability and hashability of functions is a bit of a tangent from this thread, because the issues there don't impact the ActorSendability of closures. That is a property strictly defined on composition of the captured elements, so thunks and other things don't affect it.

To be clear, aside from the identity issue, I'd expect other derived conformances for closures to work elementwise on the captured elements as well.

Thanks for writing this up. This is certainly an important question:

Neither the actor proposal pitched nor your proposal actually "ensure" this level of safety. The actors pitch doesn't help at all with reference types causing shared mutable state. Your pitch adds a small barrier at the type level by requiring one to conform to a protocol and promise that the type behaves nicely for actors, but this promise is unchecked and easy to get wrong.

The tl;dr here is that I feel that conformance to ActorSendable is an anti-pattern for most reference types, and so we should not use a fundamentally unsafe protocol as the marker that enables cross-actor transfer of these types.

In the long term, I think we can do better than either pitch, making it easier to work with reference types in a way that doesn't violate actor isolation, and provide proper checking. That should be our goal (we've been calling that "phase 2"), even if it's not attainable with the first introduction of actors ("phase 1"). I think the right set of goals for the phase 1 should be, e.g.:

  1. Provide a reasonable subset of types that can be used with actors to maintain actor isolation,
  2. Provide checking to notify the user when they have stepped outside of this safe subset, along with an "unsafe explicit" way to disable the checking for specific cases where the developer knows best,
  3. Minimize the amount of change to existing Swift code that isn't using actors, and
  4. Minimize the amount of disruption when moving a code base from phase 1 to phase 2.

The actors pitch fails #2, because it doesn't provide any checking for them. And it then fails #4, because "phase 2" will start complaining about code that "phase 1" allowed you to write. For me, that's the main failing of the actor pitch as it stands today: it allows you to write new actor code that is silently unsafe and we will later start rejecting.

Your proposal with ActorSendable addresses #2 by requiring types to conform to ActorSendable to be used from another actor. This works well for types that already have useful semantics for actor isolation: types that provide value semantics (discussed in depth in the ValueSemantic protocol thread), are immutable, or are internally synchronized, for example. Of course, those types will behave correctly regardless of your isolation model; the benefit of ActorSendable is in telling you that you've stepped out of that safe subset.

For types that don't have those useful semantics, the ActorSendable design encourages us to do something with them, or be excluded from the model entirely. Some subset of those types can be "deep copied" to give the illusion of value semantics:

class MyValueSemanticClassType : ActorSendable {
  func unsafeSendToActor() -> Self { self.clone() }
}

This means that passing any MyValueSemanticClassType instance across actors will involve a clone() call, which cannot be reasoned about or optimized away. However, this approximation of value semantics only occurs at actor boundaries: you can't rely on it uniformly in non-actor code, and with self. sometimes being within an actor and sometimes being cross-actor, you can't even rely on it within actor code without a deep understanding of the model.

A better answer in Swift would be to wrap up such types in a struct or enum that does copy-on-write, which is better for performance (fewer deep copies), understandability (we expect most structs/enums to have value semantics), and consistent behavior in actor vs. non-actor code (the wrapper always has value semantics).
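A minimal sketch of that copy-on-write wrapping, for readers who have not written one. `Storage` and `Bag` are illustrative names, not anything from either pitch; the only library call is the standard `isKnownUniquelyReferenced`.

```swift
// Reference-typed storage that the value-typed wrapper hides.
final class Storage {
    var items: [Int]
    init(items: [Int]) { self.items = items }
}

// Struct wrapper with value semantics: the storage is copied only when a
// mutation happens while another copy of the wrapper still shares it.
struct Bag {
    private var storage: Storage

    init(_ items: [Int] = []) {
        storage = Storage(items: items)
    }

    var items: [Int] { storage.items }

    mutating func append(_ item: Int) {
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(items: storage.items)   // copy on write
        }
        storage.items.append(item)
    }
}

var a = Bag([1, 2])
var b = a          // cheap: both wrappers share one Storage
b.append(3)        // triggers the copy, so `a` is unaffected
print(a.items)     // [1, 2]
print(b.items)     // [1, 2, 3]
```

The wrapper behaves the same way in actor and non-actor code, which is the consistency argument made above.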

The various categories of reference types discussed above, all together, probably don't comprise the majority of reference types. I suspect that most reference types fall into a category not really discussed: reference types that won't by themselves ever escape whatever actor they're in, but can form arbitrarily-interesting object graphs to represent (say) the data model of your program. Generally speaking, you don't want to share these across actors, but once in a while you might need to do something unsafe. This proposal suggests that you do so by conforming to the ActorSendable type:

class MyGraphNode : ActorSendable {
  func unsafeSendToActor() -> Self { self }
}

... but this is a trap. You've now disabled all of the checking benefits of ActorSendable (goal #2 above) and left yourself with something that will be hard to detangle when we get to phase 2 (goal #4 above).

What this suggests, to me, is that outside of the "easy" cases (value types that can be copy-on-write, immutable classes, internally-synchronized classes), declaring conformance to ActorSendable is an anti-pattern. It's the easy but wrong way to silence checking of cross-actor calls, and we'd be better off with annotations at specific entry points: rather than calling the MyGraphNode type always safe to use with actors, require any function that can be cross-actor and wants to traffic in MyGraphNode to explicitly mark the corresponding parameter as "unsafe".

actor class MyActor {
  func f(@UnsafelyShared _: MyGraphNode) { ... }
}

This puts the burden on the definition to be careful, and it gives us something searchable when "phase 2" comes along and makes most of these unsafely-shared values unnecessary. Whether @UnsafelyShared is just @actorIndependent(unsafe) from the actor pitch or some kind of property wrapper on a parameter, I don't yet know. Result types would need similar treatment.

To turn this into something concrete: an approach to "phase 1" would allow only value types (structs and enums) to be used cross-actor, with some way of annotating parameters and results that are "unsafely shared" to disable the checking for those declarations that need to opt out.

If the value semantics discussion were to provide a better definition of "value types", we could use that. But we don't have to, because "structs and enums" is already the proxy concept that the language depends on for value semantics, so we wouldn't be making it worse.

Doug

10 Likes

Should it be a ValueSemantic protocol, or a value keyword, like class currently is when used as a protocol constraint:

protocol HasToHaveReferenceSemantics: class {...}
protocol HasToHaveValueSemantics: value {...}

vs

protocol HasToHaveReferenceSemantics: class {...}
protocol HasToHaveValueSemantics: ValueSemantic {...}

1 Like

class used in that way is sort of intended for future deprecation (maybe). See
https://github.com/apple/swift-evolution/blob/main/proposals/0156-subclass-existentials.md#class-and-anyobject
This being said, that should probably have happened for Swift 5.
Maybe it can still happen for Swift 6?

2 Likes

David: It’s an interesting idea, and more or less what this thread is grappling with (especially if : class becomes deprecated in favor of : AnyObject).

1 Like

I don't think the way you're positioning this is productive at all, because it is not something that can be achieved in practice. Swift is "safe" (loose air quotes) because it uses a set of conventions and defaults that lead to the reduction of bugs in practice. It does not define away all bugs, and certainly doesn't prevent all memory safety issues -- even within a thread.

As I mentioned in the ActorSendable writeup, it is very common for Swift to use unsafe constructs (e.g. UnsafePointer) to build safe abstractions (e.g. Array).
This is exactly the approach taken by the ActorSendable proposal, and I think it does achieve what we consider to be "safety" according to the usual interpretation Swift uses.

This is an important point so I'll re-underscore the problem with your argument. If we applied it equally to all of Swift we would not allow the use of unsafe pointers because they are "unchecked and easy to get wrong". In practice, we accept that risk because library authors can build safe abstractions out of unsafe components, and those safe components can be easy to use and lead to raising the abstraction level. ActorSendable is exactly the same thing. Similar fears of misuse came up in the discussion of SE-0195 and many other proposals as well.

Also, FWIW, I have seen no alternate proposals that are compelling. The ActorSendable pitch has an assessment of what will happen if actors ship without it. Your comments seem to indicate that you agree with the problem, am I mistaken about that? Do you disagree with anything in that section?

I find this characterization to be extremely concerning. The anti-pattern here is to use reference types in a concurrent system. Implementing ActorSendable for a reference type requires implementing an explicitly marked "unsafe" conformance. This is very consistent with Swift's general approach to these sorts of things: unsafe features are made available for power users, but we are careful to make it explicit that they are unsafe, and difficult to accidentally stumble upon (something we haven't discussed, but is easy to do with this feature). I don't think it is fair to say that any of the unsafe features in Swift (pointers, atomics, ....) are "easy to use" or recommended for novice programmers.

I agree that it is possible to extend Swift and introduce new type system features that capture more of the interesting design space in a memory safe way. Such a move in the future will compose cleanly on top of this, and can roll out over time. Are you suggesting that we block actors until those features are available?

Yes, I completely agree with this assessment. :100: I consider this to be a showstopper for the actors proposal that needs to be addressed.

I agree with what you're writing here, but I'd suggest a different interpretation. When you contrast the ActorSendable pitch to "actors without ActorSendable", the value isn't what it allows -- it is what it does not allow. The main point of ActorSendable is that users will get a /compile time error/ when they try to pass unsafe reference types across actor boundaries. This is a really important feature that should be included in Actors 1.0, because otherwise users will write tons of bad code and Actors will exacerbate race conditions. It exacerbates them by introducing a new programming model that encourages more concurrent code, but without providing any of the guard rails or safety that people deserve with Actors.

ActorSendable is just an implementation approach that provides those guardrails. Again, it is safe per the usual Swift interpretation of that word.

Yes, emphasis on the latter half of that sentence. Both approaches enable the safe things - the point of ActorSendable is to reject the unsafe ones.

Sure, the concern about the cost of implicit deep copies is an important one, and (just to clarify) there is nothing in the proposal that makes deep copies implicit. The proposal observes that this is a relevant part of the design space that may be appropriate for some types. It allows the author of the type to make the assessment of whether an implicit deep copy is an appropriate thing to do or not. This can vary a lot across types, and I don't think the language needs to make a by-fiat decision for all reference types.

Since you prefer explicitness, I'll point out that there is a good design pattern that works naturally with ActorSendable that achieves your goal of explicitness. For example, instead of making NSMutableDictionary conform to ActorSendable, it would be reasonable to use a generic struct wrapper to make it an explicit part of the API so clients know there is a cost to it:

  func doThing(param: NSDeepCopy<NSMutableDictionary>) async

As mentioned in the proposal, this composes well with an active extension pitch about property wrappers, which would allow a very nice API of:

  func doThing(@NSDeepCopy param: NSMutableDictionary) async

Both approaches work with the ActorSendable design, but these are not the topic of the ActorSendable proposal itself, these should be explored as a follow-on.
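As a rough illustration of the generic struct wrapper idea, here is a pure-Swift sketch. Every name in it (`DeepCopyable`, `DeepCopied`, `Settings`) is hypothetical and chosen for the example; it does not reproduce the `NSDeepCopy` type mentioned above, and it uses a plain synchronous function in place of a cross-actor async call.

```swift
// Hypothetical protocol: a reference type that can duplicate its own state.
protocol DeepCopyable: AnyObject {
    func deepCopy() -> Self
}

final class Settings: DeepCopyable {
    var values: [String: Int]
    init(values: [String: Int]) { self.values = values }
    func deepCopy() -> Settings { Settings(values: values) }
}

// Hypothetical wrapper: the copy happens once, visibly, when the wrapper is
// built, so the cost shows up in the API signature rather than being implicit.
struct DeepCopied<Wrapped: DeepCopyable> {
    let value: Wrapped
    init(_ original: Wrapped) { value = original.deepCopy() }
}

// Stand-in for a cross-actor entry point; it only ever sees the copy.
func doThing(param: DeepCopied<Settings>) {
    param.value.values["volume"] = 0
}

let shared = Settings(values: ["volume": 7])
doThing(param: DeepCopied(shared))
print(shared.values["volume"] == 7)   // true: the caller's object was not touched
```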

Sure, I'm a big fan of value types, but we also have to embrace the real world - even when it is unfortunate. As you know, your suggested way of handling this is also compatible with ActorSendable. The proposal explains that ActorSendable is good both for the "things we like" in Swift as well as the "things we regret but have to work with" in the ecosystem.

No, the proposal does not suggest that (if it does, tell me where and I'll fix it), and I agree with you that this is not what you want.

The right thing to do here is to standardize a generic UnsafeTransfer struct as mentioned in the proposal.

As we've been discussing this allows the programmer to know that they just implemented an unsafe API. This is something we discourage (through Unsafe naming) but allow because we want people to be able to "get stuff done".
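For reference, such a wrapper could be as small as the sketch below. The exact shape is an assumption: the proposal only names `UnsafeTransfer`, so the `wrapped` property, the `unsafelyTransferring:` label, and `GraphNode` are made up here to show where the "Unsafe" spelling would appear at a use site.

```swift
// Assumed shape of a generic UnsafeTransfer wrapper: it does no copying and
// no checking; it only makes the opt-out visible in signatures and at calls.
struct UnsafeTransfer<Wrapped> {
    let wrapped: Wrapped
    init(unsafelyTransferring value: Wrapped) { wrapped = value }
}

// Illustrative reference type with shared mutable state.
final class GraphNode {
    var label: String
    var children: [GraphNode] = []
    init(label: String) { self.label = label }
}

// Stand-in for a cross-actor entry point that accepts the unsafe wrapper.
func relabel(_ node: UnsafeTransfer<GraphNode>) {
    node.wrapped.label = "visited"
}

let root = GraphNode(label: "root")
relabel(UnsafeTransfer(unsafelyTransferring: root))
print(root.label)   // visited: the same object is shared, nothing was copied
```

Searching a code base for `UnsafeTransfer` would then list every place that opted out of checking.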

I think you're missing the most important part of the proposal: it just works for the most important and common cases that Swift encourages: compositions of value semantic types: structs, tuples, enums. We want friction with reference types, because they are inherently dangerous in a concurrent system. We do not want friction using value semantic types and compositions of them. This is what the proposal achieves.

The problem with this is that it does not allow advanced library authors to provide high quality and usable APIs. It turns out that some shared reference semantic data structures really are the right thing to use for various domains (e.g. immutable classes, the reference semantic concurrent hash table, etc). Properly constructed, there is absolutely nothing unsafe about transferring references to them across actors, so requiring an attribute on the callsite would be really bad for Swift expressivity.

One of the major important points about Swift design philosophy is that we allow library authors to deploy "high technology" and unsafe features in the internal construction of APIs that provide a safe and easy to use API. Array is one example of this, ConcurrentHashTable is just another one. We shouldn't penalize classes and other reference semantic types here, we should empower the type author to decide the right thing for their types.

If you have a concrete approach to consider, then please write it up so we can evaluate it. One request is that we need a way to allow that value types are implemented in terms of reference types, and I don't think it is acceptable to put complexity into the use-site for this. Array is one example that clearly has to be in-model here. I don't see how ConcurrentHashTable is any different.

-Chris

8 Likes

You don't use unsafe pointers everywhere, every day: Array is a carefully designed and tested component that hides the unsafety behind a single abstraction. Conformance to ActorSendable is going to be far more common, less tested, and more dangerous.

It's far more wide-reaching than most of these unsafe features. One explicit annotation and the entire type---and everything built from it---is now easy to use unsafely, everywhere.

I assume that last question is rhetorical, but in case it isn't: no, we should not block actors until we have those type system features, but neither should we implement a solution that makes it harder to introduce those features at a later point.

This is a very narrow distinction. The proposal introduces implicit calls to unsafeSendToActor for cross-actor references. Some implementations will do these deep copies (because that's the only way to get into the actor model)---and it's up to you to trace through the scattered unsafeSendToActor implementations to see where it happens.

Sure, the explicitness here is good. Here's one way to think of my suggestion: allow the above, but take away ActorSendable and its ad hoc unsafeSendToActor.

The proposal calls this "unwise", then proceeds to give it a standard library helper protocol (UnsafeActorSendableBySelfCopy).

Yes, we agree on a generic UnsafeTransfer wrapper struct as a reasonable solution.

Despite your repeated insistence to the contrary, I understand what you're trying to do with the proposal. We even agree that reference types should have friction. However, having reference types introduce ActorSendable conformances with ad hoc unsafeSendToActor implementations that we cannot reason about puts us in a place where we won't be able to improve the model later---we'll be locked into all of these implicit unsafeSendToActor calls everywhere.

Sure. The concurrency roadmap talks about these things as well, and we could apply notions that exist in the proposal I posted (e.g., @actorIndependent) as indicators that certain types are safe to transfer---without injecting implicit calls at actor boundaries.

Doug

1 Like

@Douglas_Gregor: If I understand you correctly, then, your objection to the ActorSendable design isn't that any one particular type would have access to an implicit unsafeSendToActor, but that the implication here is that every type that participates in the actor model would (by implicit or explicit conformance) have this unsafeSendToActor that is hard to reason about and optimize away which is implicitly called at actor boundaries?

I think it would be possible to reconcile your two points (allowing advanced library authors to access this unsafe feature, but not having an unsafe feature be pervasive throughout the actor model) by adopting a CustomStringConvertible-like design:

Recall that in Swift every type is "string convertible" without conforming to any protocol and doesn't need to have a description property; however, custom string conversion can be had by conforming to a protocol and performing arbitrary work in the implementation of description.
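A short sketch of the string-conversion behavior being used as the analogy. `Plain` and `Fancy` are throwaway names for the example; the rest is standard library behavior.

```swift
// No conformance: still convertible to a string via the default description.
struct Plain {
    var x = 1
}

// Opt-in customization: arbitrary work may happen inside `description`.
struct Fancy: CustomStringConvertible {
    var x = 1
    var description: String { "Fancy with x = \(x)" }
}

print(String(describing: Plain()))   // default, compiler-provided text
print(String(describing: Fancy()))   // Fancy with x = 1

// Plain works with String(describing:) without conforming to the protocol.
print((Plain() as Any) is CustomStringConvertible)   // false
print((Fancy() as Any) is CustomStringConvertible)   // true
```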

Would it then answer your objection if the design proposed were that we have an UnsafeCustomSendable protocol instead? By this I mean, all value types could be implicitly sendable without having any implicit call to unsafeSendToActor at actor boundaries. We could provide UnsafeTransfer wrappers, NSDeepCopy wrappers, etc. for users to annotate at the call site, which would also do work at actor boundaries that the compiler can reason about. Then, besides those options and not as their underlying implementation, types (classes or otherwise) could opt into a custom unsafe feature which allows them to be sendable by performing arbitrary work at actor boundaries. This would mean that unsafeSendToActor would not be pervasive in the actor model but rather a limited facility to provide power only where absolutely needed.

1 Like

I must be misunderstanding something about your post. You seem to suggest that all value types would be sendable by default. This is most certainly not what we want! As a degenerate example, UnsafeMutablePointer is a struct and should definitely not be sendable. The author should not need to do something to opt-out of sendability when defining such a type (for example, a struct that contains UnsafeMutablePointers).
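To see why "struct" alone is not a good proxy here, consider this small sketch. `PointerCounter` is a made-up type; it shows a struct whose copies all alias the same memory through an `UnsafeMutablePointer`, which is exactly the sharing that actor isolation is meant to prevent.

```swift
// A struct that is a value type syntactically but has reference semantics,
// because every copy carries the same pointer.
struct PointerCounter {
    let pointer: UnsafeMutablePointer<Int>

    init() {
        pointer = .allocate(capacity: 1)
        pointer.initialize(to: 0)
    }

    // Non-mutating: the struct itself never changes, only the pointee does.
    func increment() { pointer.pointee += 1 }

    func destroy() {
        pointer.deinitialize(count: 1)
        pointer.deallocate()
    }
}

let original = PointerCounter()
let copy = original          // copies the pointer, not the storage
copy.increment()
let seenThroughOriginal = original.pointer.pointee
print(seenThroughOriginal)   // 1: the mutation is visible through both copies
original.destroy()
```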

I must be misunderstanding this conversation then. I had thought that this part was completely settled:

  • Properties of an actor will be protected by the actor.
  • Immutable memory (such as a let constant), local memory (such as a local variable that’s never captured), and value component memory (such as a property of a struct, or an enum case), are already protected from data races.
  • Unsafe memory (such as an arbitrary allocation referenced by an UnsafeMutablePointer ) is associated with an unsafe abstraction. It’s actively undesirable to try to force these abstractions to be used safely, because these abstractions are meant to be usable to bypass safe language rules when necessary. Instead, we have to trust the programmer to use these correctly.
  • Global memory (such as a global or static variable) can in principle be accessed by any code anywhere, so is subject to data races.
  • Class component memory can also be accessed from any code that holds a reference to the class. This means that while the reference to the class may be protected by an actor, passing that reference between actors exposes its properties to data races. This also includes references to classes held within value types, when these are passed between actors.

The goal of full actor isolation is to ensure that these last two categorizations [emphasis added] are protected by default.

@Douglas_Gregor It seems like your argument against adopting protocol-based actor isolation hinges on this assertion. Could you elaborate with evidence?

@Chris_Lattner3 It will certainly be painful if/when the "things we regret but have to work with" have a broken ActorSendable conformance. Could you elaborate on how library authors (and users) would debug these sorts of issues? How would they see where the problem lives? Would they need to trace through the "scattered unsafeSendToActor implementations" as Doug suggests?

1 Like

My fundamental complaint is about the existence of unsafeSendToActor. It's bad for code size to be emitting calls to unsafeSendToActor for every cross-actor call, it's bad for performance to have those implementations go recursively call unsafeSendToActor on everything in the value graph because there might be one non-trivial unsafeSendToActor somewhere in there, and it's bad for understandability because a human can't reason about that pile of recursive calls, either, or guess at where all of the implicit calls to unsafeSendToActor go.

I'm with you here, up until...

Ah, but generic code needs to be written against the most general thing, which could have a non-trivial unsafeSendToActor code. So we could try to limit the code size/performance problems for value types, but that doesn't carry over to a generic T that can be used with reference types.

I don't think we should be performing implicit, arbitrary work at actor boundaries for any type. Let's consider the "good" cases for actor sendable: value types that have value semantics, immutable classes, internally-synchronized concurrent data structures. All of these just need self. No custom unsafeSendToActor.
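Two of those "good" cases, sketched under assumptions: `Configuration` and `SynchronizedCounter` are illustrative types, and the lock-based design is just one way to synchronize internally. In both cases handing out `self` unchanged is all that is needed, so no custom transfer hook would have anything to do.

```swift
import Foundation

// Immutable class: every stored property is a `let` of a value type, so
// sharing the reference cannot introduce a data race.
final class Configuration {
    let name: String
    let retries: Int
    init(name: String, retries: Int) {
        self.name = name
        self.retries = retries
    }
}

// Internally synchronized class: all access to mutable state takes the lock.
final class SynchronizedCounter {
    private let lock = NSLock()
    private var count = 0

    func increment() {
        lock.lock()
        defer { lock.unlock() }
        count += 1
    }

    var value: Int {
        lock.lock()
        defer { lock.unlock() }
        return count
    }
}

// Increment one shared instance from many threads at once.
func hammer(iterations: Int) -> Int {
    let counter = SynchronizedCounter()
    DispatchQueue.concurrentPerform(iterations: iterations) { _ in
        counter.increment()
    }
    return counter.value
}

print(hammer(iterations: 1000))   // 1000
```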

"Deep copying" a class type would need to do arbitrary work in unsafeSendToActor, but I don't think that's a good solution: it would be better to wrap it up in a copy-on-write value type so we're not doing extraneous copies. That's better overall, and doesn't need any special logic in unsafeSendToActor.
If you truly want to "deep copy" when calling an actor function, it can be handled some other way, e.g., the NSDeepCopy wrapper you mention, not with implicit magic.

What other uses do people have for unsafeSendToActor? The costs of supporting it are (very) high, so I think it should not exist.

That makes ActorSendable, effectively, a marker protocol that means "it's okay to pass this type across actors." That design might be okay, but I think we should contrast it against (say) adding an attribute, which would have no runtime impact and we might already need for closures.
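A sketch of what "effectively a marker protocol" would look like, with invented names (`CrossActorTransferable`, `transfer`, `Point`, `Node`). It has no requirements and therefore emits no calls at the boundary; the only effect is the compile-time check on the generic constraint. The attribute-based alternative mentioned above is not shown because no concrete spelling for it is given.

```swift
// Hypothetical marker protocol: no requirements, nothing to call at runtime.
protocol CrossActorTransferable {}

// Types that are fine to hand across a boundary opt in with an empty conformance.
extension Int: CrossActorTransferable {}
extension String: CrossActorTransferable {}

struct Point: CrossActorTransferable {
    var x: Int
    var y: Int
}

// A class with shared mutable state simply does not conform.
final class Node {
    var next: Node?
}

// Stand-in for a cross-actor entry point: the constraint is the whole check.
func transfer<T: CrossActorTransferable>(_ value: T) -> T {
    value   // passed through as-is; no per-type hook runs
}

let n = transfer(42)
let p = transfer(Point(x: 1, y: 2))
print(n, p.x, p.y)   // 42 1 2

// transfer(Node())   // would not compile: Node is not CrossActorTransferable
```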

I've been asked to come up with a counter proposal, and haven't had time to do so yet. As a start: eliminate unsafeSendToActor and the implicit calls to it.

Doug

5 Likes

I agree with Doug. Values have "value semantics" when there's no (or at least minimal) high-level semantic difference between a value-copy and a deep-copy. But if there's a high-level semantic difference between these kinds of copies, we shouldn't be doing a deep copy implicitly; readers will not anticipate the semantic impact of a deep copy when they see something that just looks like an asynchronous call.

Now, I can imagine types for which implicit deep copies would be both necessary and semantically acceptable. To be sharable at all with equivalent semantics, they would have to have value-like semantics and not be inherently semantically tied to a single thread/actor. So there would have to be something about their representation that's not safe to share across threads/actors, which basically means using non-atomic reference counts — I don't know what else this could be, really. To me, it's hard to argue that that kind of opt-in optimization is worth complicating the entire async model over, since people using it could still presumably use whatever facilities there are to request explicit deep copies. (It also creates new performance problems — we'd sacrifice a lot of code size to these deep-copy operations, and it would require real heroics to turn a deep copy back into a value copy when e.g. passing an Array over an async boundary.) That is, unless you're suggesting that we should start doing this sort of optimization in the core library types like Array, which I do not think would be the right performance trade-off even if it weren't problematic for the ABI.

If deep copies have to be explicit somehow, then we still need a separate language mechanism that lets you pass most value types without that explicit step. It really feels like that mechanism should be at the center of the async restriction rather than the deep-copy mechanism, as long as whatever way we come up with for doing a deep copy lets you pass the result safely.

10 Likes

In the formulation that I ask about above, there would be no such generic code, because there would be no ActorSendable (just as there is no StringConvertible), only a protocol to opt into customized “actor-sendability” (à la CustomStringConvertible).

Right, as noted above, I’m interested in how you’d feel about the option of not having a marker protocol for “actor-sendability,” but rather (as in the case of CustomStringConvertible) exposing only a protocol for custom implicit work at actor boundaries. To my understanding, supporting such customizability is what @Chris_Lattner3 wants to make sure is available from the get-go, so that classes that can’t be supported otherwise aren’t just left out of the actor isolation story as a potential correctness footgun.

So my question is whether the cost of exposing such a facility is high if formulated as described above, as an opt-in for custom work and not as a marker protocol.

Put another way, is your objection based on a view that no type should be able to perform implicit work at actor boundaries, or that you wouldn’t want a design where supporting that possibility would lead to pervasive questions as to when and where there could be arbitrary work for any type? I’m wondering because I think the latter issue is separable from the former while still supporting types that require such customizability, if designed in the way described above—would you agree?

1 Like

I have been thinking about this proposal, and while I very much appreciate the intent to make reference types safer to use in concurrent code, I'm afraid that the well-lit path created by this proposal will lead users to do the wrong thing.

If someone has a reference type, and they want to make it usable across actors, they are encouraged to implement unsafeSendToActor() that performs a deep copy. With that, the reference type pretends to be a value type, but only when passed across actor boundaries. I don't think that is the right thing to do in the common case. I think that if the user has defined a reference type, it is because they wanted shared mutable state. Un-sharing that state when passing an instance across an actor boundary wouldn't be correct.

Furthermore, types that implement unsafeSendToActor() as deep copy have different semantics when passed to sync and async functions. When passed to sync functions, modifications done by the callee are visible to the caller. When passed to async functions, modifications are not visible. I think that's quite subtle and surprising, given the amount of effort the async/await proposals spend on trying to make sync and async function calls look and feel similar to each other.
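The difference can be simulated without any actor machinery. In this sketch `Document`, `clone()`, and `appendFooter` are illustrative names, and the explicit `clone()` at the second call stands in for the copy that a deep-copying unsafeSendToActor() would make implicitly at an actor boundary.

```swift
final class Document {
    var lines: [String]
    init(lines: [String]) { self.lines = lines }
    func clone() -> Document { Document(lines: lines) }
}

// The callee mutates whatever object it is handed.
func appendFooter(to document: Document) {
    document.lines.append("-- end --")
}

let doc = Document(lines: ["hello"])

// Ordinary synchronous call: caller and callee share the object.
appendFooter(to: doc)
let countAfterDirectCall = doc.lines.count
print(countAfterDirectCall)   // 2: the callee's change is visible

// Simulated cross-actor call: the callee receives a deep copy.
appendFooter(to: doc.clone())
let countAfterCopiedCall = doc.lines.count
print(countAfterCopiedCall)   // 2: the callee's change was made to the copy
```

Both call sites read the same way, which is why the change in behavior is easy to miss.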

15 Likes

Doug, the crux of your concern seems to be that people will incorrectly conform types to ActorSendable and then go out of their way to implement unsafeSendToActor incorrectly. Why are you concerned about this?

This is equivalent in my mind to the early concerns that people would just use T! (ImplicitlyUnwrappedOptional) everywhere to not have to think about nullability. In practice, every blog post and article everywhere did a good job explaining the issue and while some code did bad things (given a community the size of Swift, this isn't unexpected) the community quickly figured things out, learned new things, and still used IUO where appropriate (e.g. interacting with legacy C code).

There was a similar concern when we were discussing DynamicMemberLookup - that people would conform types to it inappropriately and the world would come unraveled. This also didn't happen.

Coming back to this proposal, there is some chance of incorrect implementation and bugs (see below) but I haven't seen a counter proposal of how to solve these problems in a better way. You made some claims in your previous post that you had a model in mind for how to fix this in a better way - can you elaborate on that with a baked model? I consider the actor proposal without a solution for this problem to be a complete non-starter (rationale in the proposal).

Good question: an incorrect conformance turns into an incorrectly shared reference between actors, which turns into race conditions and other memory safety problems. This would have to be debugged using the same techniques one uses today to diagnose this.

Keep in mind that there is no proposal on the table that addresses this concern. The base proposal without this pitch has this BY DEFAULT for every reference type with zero protection at all. In contrast, the ActorSendable proposal requires you to go out of your way to incorrectly implement an unsafe protocol requirement to achieve the bug. This pitch makes the actor model far far safer and less prone to encountering problems in the wild than the actor proposal without it.

There is no alternative proposal on the table to address this in a better way. If there were, then we could have a nice tradeoff discussion between them. The current counterproposal is gross unsafety in all actor code everywhere.

I think this is a pretty significant misunderstanding of the model here ((edit: it turns out that it was my misunderstanding, see the post 66 below!)). In the vastly most common case, the implementation of the hook is trivial and inlined, there should be zero overhead for normal value types. If you are worried about the uncommon cases, then we could have a nice concrete discussion about adding new witnesses for this or something. This is all solvable if you think that performance is an issue.

I agree that deepcopy is not the right thing for arbitrary graphs and an explicit @NSDeepCopy thing is a good way to go in general. My personal belief is that the language should allow the type author to do the right thing for their type since they have the best knowledge about what is right for their type. Also, while I'm a huge believer in COW, it is important to support legacy code and other design patterns. Again that doesn't mean that the copy has to be implicit! :slight_smile:

Let me underscore that again: 99% of my concern is covered by being able to transparently share internally synchronized class references across actor boundaries, combined with the ability to define struct wrappers like NSDeepCopy and UnsafeTransfer that allow explicit APIs for weird cases. My concern about deep copying is only a 1% concern, and I would happily throw that under the bus if there is a better model that covers the important cases!!

-Chris

1 Like