SE-0336: Distributed actor isolation

Joe_Groff · December 8, 2021, 9:59pm

Hi everyone. The review of SE-0336, "Distributed Actor Isolation", begins now and runs through December 22, 2021.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to the review manager. When messaging the review manager directly, please keep the proposal link at the top of the message.

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer in your review:

What is your evaluation of the proposal?
Is the problem being addressed significant enough to warrant a change to Swift?
Does this proposal fit well with the feel and direction of Swift?
If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available at

https://github.com/apple/swift-evolution/blob/master/process.md

Thanks!

Joe Groff
Review Manager

ktoso · December 8, 2021, 10:11pm

And for those who might enjoy a small high-level walkthrough via a video, we just published a conference talk about the feature as well: [Video] Distributed Actors announced at Scale by the Bay

This proposal focuses only on the isolation and type-checking aspects though, not the runtime (that'll be another proposal).

Thanks in advance for any review comments

bobergj · December 9, 2021, 2:02am

Remote and Local Distributed Actors
the call-sites of distributed methods have implicitly gained the async and throwing effects, which means that we must invoke them with try await dist.<invocation>. This is an important aspect of the design, as it allows us to surface any potential network issues that might occur during these calls, such as timeouts, network failures or other issues that may have caused these calls to fail.

Does the actual error type thrown for timeouts, network and serialization errors depend on the concrete DistributedActorSystem?
With the stated goal of achieving location transparency, and the stated benefit of "Distributed actors can be used with multiple transports without changing the actor's implementation", it seems to me that there should be a finite set of such error types (wrapping underlying errors), so that the caller can catch them. Those types should then be part of this proposal, as they are intrinsic to the distributed actor interface.

Distributed Method Serialization Requirements
Most frequently, the serialization requirement is going to be Codable, so for the rest of this proposal we'll focus mostly on this use-case.

Codable is mentioned many times in this proposal, while alternatives such as protocol buffers are mentioned once or twice in passing. This gives me the impression that, initially, the primary use-case for this is XPC between local processes on Apple OSs. For example communication between system processes running at different privilege levels, between an app and its helper processes launched for isolation purposes, or between an app and an "app extension".

Which is fine, and it would be great to have this for those use cases!

At the same time, the set of failure modes for communication between local processes is very different from that of communication in distributed systems over the internet. I feel like this proposal is painting too rosy a picture of being able to take an actor implementation that works in a local-only environment, with one set of failure modes, and have it work in another environment without any changes whatsoever.

This is related to the point above about errors. For good and bad, Swift throws are untyped, so such errors are not part of the type signature or documentation of a distributed method. The caller of an actor may not be aware of, and is not going to be forced to handle, errors that can only occur in a network-distributed system.

slashmo · December 9, 2021, 9:50am

I think this issue has already been tackled. From skimming the proposal, especially the "Distributed Actors implicitly conform to Codable" section, I'm under the impression that other serialization mechanisms are possible:

The Player distributed actor automatically gained a Codable conformance, because it is using the SomeCodableDistributedActorSystem that assigns it a SomeCodableID. Other serialization mechanisms are also able to implement this "encode the ID" and "decode the ID, and resolve it" pattern, so this pattern is equally achievable using Codable, or other serialization mechanisms.

This is also reflected by a SerializationRequirement associated type in the actor system, which for the example uses Codable.
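To make that concrete, here is a minimal sketch of a system that picks its own serialization requirement. The names `ActorSystemSketch`, `JSONLikeSystem`, `PlayerID` and `encodeArgument` are hypothetical and only illustrate the shape; they are not the proposal's actual protocol.

```swift
import Foundation

// Hypothetical stand-in for an actor system protocol (not the real one).
protocol ActorSystemSketch {
    // The ID type this system assigns to its actors.
    associatedtype ActorID: Hashable
    // The requirement every parameter and return value must satisfy.
    associatedtype SerializationRequirement
}

struct PlayerID: Hashable, Codable {
    let raw: String
}

// A system that chooses Codable; another system could choose a different
// requirement, for example a protocol-buffer message protocol.
struct JSONLikeSystem: ActorSystemSketch {
    typealias ActorID = PlayerID
    typealias SerializationRequirement = Codable

    // Only values meeting the system's requirement are accepted here.
    func encodeArgument<Value: SerializationRequirement>(_ value: Value) throws -> Data {
        try JSONEncoder().encode(value)
    }
}
```

Passing a non-Codable value to `encodeArgument` is rejected at compile time, which is the kind of check the proposal applies to distributed method signatures.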

ktoso · December 10, 2021, 1:30am

I should probably address the core question right away:

This proposal is not going to introduce "typed throws" to Swift.

Swift may or may not gain typed throws/errors in the future, and the runtime of distributed actors is prepared for such a situation (if it ever happens); however, typed throws do not currently exist and are not really planned.

Errors thrown are implementation dependent.

This proposal does not really dive into the runtime at all (that will be a follow-up proposal), but as a sneak peek I can say that we'd introduce an error protocol like the following:

/// Error protocol to which errors thrown by any `DistributedActorSystem` should conform.
@available(SwiftStdlib 5.6, *)
public protocol DistributedActorSystemError: Error {
}

This lets callers differentiate errors thrown "by" the system (e.g. encoding errors, timeouts, network failures) from ones transmitted from the remote side.
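A rough sketch of how a caller could use such a marker protocol. Everything here is hypothetical (`SystemErrorSketch`, `CallTimedOut`, `OutOfStock`, `attempt`); it only shows the catch pattern, not any real actor system's error types.

```swift
// Hypothetical marker protocol with the same shape as the one sketched above.
protocol SystemErrorSketch: Error {}

// An error raised by the transport itself (hypothetical).
struct CallTimedOut: SystemErrorSketch {}

// An error raised by the remote actor's own logic (hypothetical).
struct OutOfStock: Error {}

enum FailureOrigin {
    case system
    case remote
}

// Runs a call and reports where a failure came from, or nil on success.
func attempt(_ call: () throws -> Void) -> FailureOrigin? {
    do {
        try call()
        return nil
    } catch is SystemErrorSketch {
        // Thrown by the actor system: timeouts, encoding problems, and so on.
        return .system
    } catch {
        // Anything else is treated as coming from the remote side.
        return .remote
    }
}
```

Because throws are untyped, the compiler does not force the caller to write the first `catch`; the marker protocol only makes the distinction possible for callers who want it.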

Transmitting errors itself is a long topic that we'll discuss more in the runtime proposal, but even then, it is very much up to a system implementation to decide how to deal with them. Transferring actual errors may be an optional opt-in feature, or a system may just return the error type, or no detailed information at all about a remote call throwing -- it is all very context dependent and has impact on security as well, so that's a very separate topic from actor isolation itself, so let's leave it for later discussions.

Codable is mentioned not because of any relation with Apple runtimes, but because it is Swift's serialization story.

The design is specifically optimized to not tie distributed actors to Codable. We focus on it because it is a mechanism familiar to Swift developers and we don't want to spend the entire proposal showing all the various types of serializers and how they can work. We will spend much more time on how the serialization runtime actually is invoked in proposals covering the runtime aspect of distributed actors -- this proposal only explains the type-checking and configurability aspects of serialization requirements.

Inter-process communication is definitely among the use-cases for distributed actors, but it is, by far, not the only one.

You may be surprised to learn that this work originated from networking use-cases, and our experience building distributed systems using actors in other languages. We also recently open sourced a complete distributed actor system cluster (swift.org blog), and have experience with other such systems, like Akka, and there are plenty of other examples in the industry, such as Orleans. So yes, the model works well for networking, as much as it does for IPC.

You may also want to check out the talk I linked above [Video] Distributed Actors announced at Scale by the Bay since the latter section of it discusses the failure handling in a real implementation in more detail.

Chris_Lattner3 · December 13, 2021, 1:30am

I have read the proposal, but haven't read any of the pitch threads. Here are a few questions/comments:

I am a big fan of distributed actors, but it seems wise to get the core semantic models of actors figured out before accepting this. If I recall, the initialization mechanics are still being figured out. I'm surprised this proposal has been run without that being completed.
Instead of distributed actor Player { should it instead be distributable actor Player { or actor Player : DistributableActor { with a protocol? As you mention, "distributed actors" are often local. The actual property opted into is the /ability/ to distribute the actor. The protocol approach is probably useful to avoid having to implement the DistributedActor abstraction mechanism to constrain things in the generic system. See below on the distributed declmodifier.
The DistributedActorSystem seems complicated, and I don't grok it (but I didn't try hard to). It seems like something that should be its own proposal with its own rationale, instead of being bundled into this proposal. I don't see any rationale for it.
Instead of special casing distributed actor conformance to codable, why not make DistributedActor conform to it? When wouldn't a distributed actor's ID conform to Codable? Should/can this just be required universally to simplify the system?
Do we absolutely need a distributed declmodifier? The actor proposal started with something like that for actor methods, and we refined the design to eliminate it by making all actor crossings implicitly async themselves (and check Sendable). Can we do the same consistent thing here for distributed actors? distributed is very verbose, doesn't seem needed, and should really be distributable or something else similar.

I would expect that distributed actors would want to play the same sort of performance tricks as actor methods do, skipping the async thunk when known to be staying local. In the case of distributed actors, the thunk will be more expensive because of the codable stuff.

I see the alternatives considered section discussion of this: I agree that inverting the default is problematic, but disagree with the conclusion. I don't find the auditability comment compelling at all, since we have exactly the same issue across swift when it comes to API design, that isn't specific to distributed actors. I'd recommend investigating an approach like we do for the base actor proposal, your option #3. I agree with you that the design of protocols with actors is problematic and makes this difficult, I raised these issues when we were discussing actor protocols and it was not thought to be problematic by others at the time. That said, there must be a solution coherent with the existing design.
In actor properties, " Distributed actors may declare any kind of stored property, and the declarations themselves are not restricted in any way . This is important and allows distributed actors to store any kind of state, even if it were not serializable." How does this work if the init runs locally? How does the initialized state get transported to the remote machine?
In breaking location transparency the proposal suggests adding a whenLocal method that takes a closure. IIRC, the base actor proposal rejected a related "runThisClosureWithinTheActor" proposal. We should look at both proposals together to make sure they are consistent IMO. I'd recommend splitting this out to a follow-on, particularly given the note that says the design isn't fully nailed down.
If you're serious about pursuing the local keyword extension some day, then we should make the same generalization for non-distributed actors, generalizing the nonisolated support beyond function arguments.

Overall, I'm happy to see this making progress, it is great work. I'm concerned about premature standardization of this though.

-Chris

ktoso · December 14, 2021, 1:12pm

Thank you very much for reading through and the questions, Chris!

Let's go one by one:

We're trying to slice the problem into a few pieces and it feels like we have enough to discuss here already without mixing the initializers into this proposal as is.

You're right that this proposal does not touch initializers; this is on purpose as SE-0327: On Actors and Initialization has not been set in stone yet, so we're going step by step here, and the method and other isolation pieces of this proposal are separate enough so we can discuss them already.

That said, we have thought through and designed initializers as well, and this is going to be explained in depth in the second pitch/proposal associated with distributed actors. You can get a sneak peek at it already in this draft PR on swift-evolution: [WIP] Distributed Actors Runtime Proposal #1498. I was just polishing this up and will post a pitch thread so people can look it over during the break if they want to. It won't be up for review until we're done with initializers and this isolation proposal, so that would be in 2022.

Needless to say, I collaborated with @kavon on the initializers proposal (SE-0327: On Actors and Initialization), so we're in sync about the semantics (and the WIP 2nd proposal discusses these in depth). On that note, if we want to discuss initializers we can do a separate thread for them right now.

In a way there always is some local distributed actor "somewhere", and other remote distributed actor references pointing towards it.

I don't think "distributable" really helps the understanding here. The mental model truly should be "okay, this is distributed, I have no idea if it is local or not", and that's the mindset one has to be in when developing with those types. The moment we know if one is local or remote is the "breaking through location transparency" part, and is a very rare thing.

I'll preface this discussion of the distributed keyword by saying that every time someone new reviews this work this comes up, and after a few weeks (or months) of entertaining the idea we come back to square one: the keyword is both beneficial and necessary.

As this is the official review, I'll definitely re-explain the necessity and tradeoffs of the additional keyword again, although this is what the Alternatives Considered: Implicitly distributed methods / "opt-out of distribution" section really did a deep dive into, so I had hoped we wouldn't have to re-re-visit this again...

The protocol approach is probably useful to avoid having to implement the DistributedActor abstraction mechanism to constrain things in the generic system. See below on the distributed declmodifier.

actor Player : DistributableActor {

It is going to have its own proposal indeed (the aforementioned [WIP] Distributed Actors Runtime Proposal #1498 that I just finished preparing and will pitch maybe even tomorrow).

There is no need to deep-dive into how the actor system works because all of that is "runtime" concerns, while this proposal is only about the isolation and type-checking aspects of this work. This proposal does mention the type because of its implications to initializers (the necessity to accept it as argument in initializers) and because it is where the SerializationRequirement comes from.

For this review, let's keep it at that. All of the detailed semantics of why and how the system is involved in making remote calls, assigning IDs and more is going to be discussed in the runtime proposal. So the next proposal is now available for you to read, but I don't think it has any impact on this proposal's review other than "there is a DistributedActorSystem and it has a SerializationRequirement associated type".

Hah yeah, so this is indeed where we started off from quite some time ago: DistributedActor was just Codable, and had an implementation provided via extension and that was it. An ID was always required to be Sendable, Hashable, and Codable so that'd be that.

As we discussed this feature with potential adopters one use-case that came up, and would improve upon existing solutions people work with today, was the ability to help developers in a type-safe way with understanding which "services" can be freely shared between processes and which not.

For example, in XPC there are "endpoints" which may be freely shared, and "connections" which might not. But due to the existence of xpc_connection_create_from_endpoint this distinction isn't as clear and it can become messy.

ANONYMOUS CONNECTIONS
The recipient of that message will then be able to create a connection from that endpoint using xpc_connection_create_from_endpoint(). // from man xpc_connection_create

This is very much like an actor's resolve; however, actors operate on a higher level, and we want to hide this from users while at the same time helping them to only "send around" actors which were actually intended for this pattern. In other words, we would want to give actors either an EndpointID or an ID that isn't Codable, and then developers would naturally be restricted from "accidentally" passing around one that was never intended to be passed around.

So this implicit Codable conformance is to support this pattern we were asked for, hope this explains a bit of the background here.

It is a little unfortunate that we cannot just™ write it as

// (not possible)
extension DistributedActor: Codable where ID: Codable { ... }

but that'd be a huge type-system feature I'm told... so we ended up with synthesis for this case, as Codable being specialized isn't unheard of.
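The effect being asked for can be approximated in plain Swift today, as long as only a method (rather than a protocol conformance) is made conditional. The following is a sketch with hypothetical names (`ReferenceSketch`, `EndpointRef`, `ConnectionRef`, `encodedID`); it is an analogy for the synthesized conformance, not how the compiler implements it.

```swift
import Foundation

// Hypothetical protocol standing in for "something that has an actor ID".
protocol ReferenceSketch {
    associatedtype ID
    var id: ID { get }
}

// A constrained extension may add a method (unlike a conformance):
// only references whose ID is Encodable can be turned into bytes.
extension ReferenceSketch where ID: Encodable {
    func encodedID() throws -> Data {
        try JSONEncoder().encode(id)
    }
}

struct EndpointID: Codable, Hashable { let name: String }  // meant to be shared
struct ConnectionID: Hashable { let descriptor: Int32 }    // deliberately not Codable

struct EndpointRef: ReferenceSketch { let id: EndpointID }
struct ConnectionRef: ReferenceSketch { let id: ConnectionID }

func shareEndpoint() throws -> Data {
    // Compiles: EndpointID is Encodable.
    try EndpointRef(id: EndpointID(name: "printer")).encodedID()
    // ConnectionRef(id: ConnectionID(descriptor: 3)).encodedID()
    // would not compile, because ConnectionID is not Encodable.
}
```

The reference with the non-Codable ID simply has no way to be serialized, so it cannot be passed to another process by accident.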

Note also that under present Swift no end user is actually able to implement init(from:) for a distributed actor at all (!), because of the immutable self in such initializers. I believe I called that out in the proposal, and it's something we'd like to lift in the future so that people could actually implement this themselves if they wanted to...

I really hoped the discussion of this in Alternatives Considered: Implicitly distributed methods / "opt-out of distribution" explained this well enough. I (really really) know that at first it seems similar enough and why-dont-we-just-infer-it but it truly breaks down terribly the deeper one looks at not marking distributed funcs.

Before we dive in here... I have revisited this "can we do it without distributed func...?" question multiple times over the last few years, the last time just before writing this proposal, having spent almost two weeks thinking of all kinds of ways it maybe could work, and I truly don't think it works out. It always seems fun at first, but ends up breaking down soon enough.

Let's discuss it more though, since it is the official review thread after all:

The problem with the 3rd option from the alternatives considered section–in which the checks are made "as similar as possible as Sendable checking" in that they're emitted lazily and depend on call-sites–is that it means that all functions must emit metadata to be invoked, so every time we compile we need to emit all thunks associated with remote calls, for all functions (and get-only computed properties) of any distributed actor.

For some of these it will not be possible to derive the thunk implementations, because e.g. the parameters don't conform to the SerializationRequirement... so we'd fail the synthesis for some of them. So far okay... this is just* emitting a lot of thunks and metadata for functions which perhaps were never intended to be distributed...

* "just too much metadata": this is actually already a deal breaker for some use-cases we're interested in IMHO, but let's keep digging.

There are two primary issues to focus on:

We need to pessimistically emit all metadata and thunks for all functions (public, internal, and private (!)) of a distributed actor, because they MIGHT be called remotely and we have no idea whether they will be or not:
- There is no way to optimize out "not used" distributed thunks, because the entire purpose of those is to be cross-process, so we lose our ability to optimize away anything "not used".
Worse user experience for call errors:
- Callers who conform some other type to Codable would suddenly think "hey, that remote call should work, but it didn't resolve the function, why is that?" -- well, the remote peer perhaps did not have the param X as Codable, so it never emitted the metadata for it, so it would never even attempt decoding, leading to a bad user experience in such rollout scenarios.

And last but not least, the prime concern I have with this "inversion of annotation necessity": distribution just isn't the same as sendability checks. Sendability checks are performed within the same program. But distribution, i.e. emitting the "distributed method accessor thunks", means that any such function is remotely callable, and COULD be subject to exploitation. We truly do not want to make a system where accidentally making things publicly and remotely invocable is the norm; this would be a terrible design from a security and API boundaries standpoint. Access control does not help here at all: if an "internal" func is distributed, it really is "as if public", because an attacker can just pretend to be a remote call from the same module.

Consider, under the "implicit distribution" semantics, an actor that has a computed property that computes some secure key for a transaction, we'd write:

distributed actor A { }
// ... 
extension A {
  // not intended to be called remotely:
  func getAuthKey() -> AuthKey { ... } // AuthKey is Codable
  // under "implicit distributed" rules, we have to emit distributed thunks
  // for this func, and it becomes effectively public and remotely callable.
  // 
  // It is very hard to notice we just added this as such distributed func
  // since we're even in an extension, and maybe even in another file...!
}

It would be a terrifying world in which I just exposed this API that I thought was internal and local to the entire world to try to poke at and exploit. Of course, there are many other layers of protecting a call, like mutual TLS, or other Kernel level capability mechanisms to prevent access to the connection/endpoint, but still -- we truly should not make a design where the norm is making mistakes and accidentally opening up holes in our applications. We are working with teams focused on closing down such holes, and they have been rather welcoming to the distributed actor efforts so far, and I'd hate to come back to them saying we're adding more areas for accidentally slipping up and making things remotely callable that should not have been.
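For contrast, here is a sketch of the same actor under the explicit opt-in rules this proposal argues for. It assumes the `Distributed` module, the `DefaultDistributedActorSystem` typealias and `LocalTestingDistributedActorSystem` as they later shipped with Swift 5.7; none of those are defined by this proposal, and the `AuthKey` contents and method names are made up for illustration. On Apple platforms the declarations would additionally need availability annotations.

```swift
import Distributed

// Assumption: a module-wide default actor system, using the in-process
// testing system that ships with the Distributed module.
typealias DefaultDistributedActorSystem = LocalTestingDistributedActorSystem

struct AuthKey: Codable, Sendable {
    let value: String
}

distributed actor A {
    // Explicitly opted in: part of the remotely callable surface.
    distributed func describeKey() -> String {
        "key length \(getAuthKey().value.count)"
    }
}

extension A {
    // Not marked `distributed`: callable only from inside the actor,
    // even though AuthKey happens to be Codable.
    func getAuthKey() -> AuthKey {
        AuthKey(value: "s3cret")
    }
}

func demo() async throws -> String {
    let a = A(actorSystem: LocalTestingDistributedActorSystem())
    // try await a.getAuthKey()  // rejected at compile time: not a distributed method
    return try await a.describeKey()
}
```

Adding a new plain `func` in an extension, possibly in another file, does not widen what remote peers can call; only an explicit `distributed` does.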

Sidenote: there is a discussion of why even private methods may need to be callable remotely. Short version: because a nonisolated private func ...() async throws could potentially call them. So we'd have to emit the thunks even for private functions -- and I'm sure people would not expect that, and it'd become an attack vector.

There is the issue of auditability too, but that is the least troublesome of them all, though to me personally a compelling one as well.

Oh, absolutely. We do this already; local calls incur no transport/serialization overhead.

Again, as this is a runtime concern it is not discussed in this proposal but instead will be covered by the runtime proposal/pitch that I linked above. If you want to read ahead this is covered in the runtime proposal's (NOT THIS PROPOSAL) Invoking Distributed Methods on Remote Instances section. When it is unknown at compile time if the instance is remote or not, we invoke a thunk which checks, and if the instance is local invokes the local func directly. There are no additional suspension points or any other serialization overheads in this case. If it was remote after all then we invoke the remote call infrastructure.
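As a simplified, synchronous model of that check (the real mechanism is a compiler-generated async thunk, and all names below are hypothetical), the idea is that a reference is either local or remote, and only the remote branch touches serialization and transport:

```swift
// Hypothetical local implementation of some actor-like object.
final class CounterImpl {
    private(set) var value = 0

    func increment(by amount: Int) -> Int {
        value += amount
        return value
    }
}

// A reference that is either backed by a local instance or only knows an ID.
enum CounterRef {
    case local(CounterImpl)
    case remote(id: String)
}

// Hypothetical stand-in for the serialization + transport path.
struct TransportSketch {
    var remoteCalls = 0

    mutating func remoteIncrement(id: String, by amount: Int) throws -> Int {
        remoteCalls += 1
        return amount // pretend this is the reply decoded from the peer
    }
}

// The "thunk": checks where the target lives before doing any work.
func incrementThunk(_ ref: CounterRef, by amount: Int,
                    via transport: inout TransportSketch) throws -> Int {
    switch ref {
    case .local(let impl):
        // Local: call the function directly, no encoding, no transport.
        return impl.increment(by: amount)
    case .remote(let id):
        // Remote: hand off to the remote call infrastructure.
        return try transport.remoteIncrement(id: id, by: amount)
    }
}
```

Calling the thunk on a `.local` reference never increments `remoteCalls`, which mirrors the claim that local calls pay no serialization overhead.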

Initializers are always local. There is no such thing as "initialize this actor over there". As such, there is no "transfer this state", there is always just messaging (distributed methods).

Sidenote: we had attempted this at some point in Akka, where it was called "remote deployment", and it was a terrible mess and a bad idea. There, though, that was mostly because of the semantics associated with waiting for initialization, as well as the versioning associated with this.

This is the same as how other runtimes deal with this. It is important for a distributed actor to be able to hold state that cannot be serialized and sent around. They are like exposed endpoints that manage such state contained to a node, after all (e.g. connections, file handles etc).

If you wanted to initialize a worker on a remote node, you do it through another actor that serves as a factory:

distributed actor GreeterMaker {
  // must be `distributed` so that it can be called on a remote reference
  distributed func makeMeAGreeter(something: Something) -> Greeter {
    Greeter(something: something) // init is "local"
  }
}

let remote = try GreeterMaker.resolve(...)
let greeterOnRemote: Greeter = try await remote.makeMeAGreeter(something: something)

try await greeterOnRemote.hello()

"Remote" initializers would open all kinds of cans of worms that we don't want to deal with as well, most notably: what would be the lifetime of such a "remotely created" actor? Since Swift relies on reference counting, and the init is returning "the only" (at first) reference to an object... what would that even mean for a remotely "initialized" actor? We'd either have to build in a way to manage lifecycles associated with them into the initialization somehow, or make other promises about the lifetime -- neither of which are things I want the language to get into.

With plain old distributed (factory) functions it is simple: the init is always local, and it is up to the function to either store or otherwise manage the lifetime of the distributed actor it returned and is about to return to a remote peer.

I remain convinced that the "remote deployment" path is not something we should pursue and, most importantly, it is not necessary for any specific use-case, since everything it achieves we can do without it, and more cleanly. Introducing it would add a lot of complexity to already complex initializers, and I'd be very worried about supporting them in various actor systems, based on past experience and on Swift's unique problems with how actor lifecycles work (tied to refcounting, and no, we should not implement distributed ref-counting).

Let's take the last two together:

Yeah, this could be fair to tackle separately perhaps. I thought it was important to outline this capability as it is fairly important for some use-cases, but as we now have real runtime support for invoking distributed methods, I think we don't need it just for implementing the cluster and other libraries.

Sidenote: We actually had to provide this during our initial port of the distributed actors cluster library because without it the source generation based implementation would have been blocked somewhat. But now that we're working on the distributed method invocation support in the language, I think we could survive without this for the time being.

I'd love to write the "right" whenLocal, but to do that we'd need the local in the type-system. We discussed it a little bit with @Douglas_Gregor but should probably revisit how hard and how far out doing the "right" thing here would be. It is true that it is quite similar to what happened with nonisolated and isolated parameters - that we generalized them some more.

Uff! I hope I covered all questions, even though a few of them really are asking about runtime concerns which are outside of this proposal and defined in the next.

Maybe if we want to keep digging into the runtime details, we can use the thread I just made for that side of things (and associated pitch): [Pitch] Distributed Actor Runtime? We'll see how the discussions flow I guess.

Thanks again for all the feedback and let's keep it coming; I'm sure we'll arrive at a satisfactory design in the end.

Chris_Lattner3 · December 14, 2021, 7:58pm

Ok, that makes sense. I was just pointing out that there is a logical dependence graph of tech here.

I think it helps to be specific about terminology here: a "local distributed actor" doesn't make sense to me. I think you're referring to the "actor proxy" or something when you say that? I was referring to the issue that not all distributable actor instances are remote. If they are local and in the same address space, then you avoid all the coding overhead etc, and they are not "distributed" at all.

Just to clarify, I'm not suggesting anything about breaking through the abstraction. You're right that that means that code needs to be written to handle the distributed case, but the adjective distributed isn't conveying that.

In my view, the capability this adds to the type system is the ability to distribute instances of that actor type, not a requirement that all instances are distributed.

Independent of the spelling, I don't think you addressed the question re: "declmodifier vs protocol conformance". Why is the declmodifier on the actor type required at all? It seems cleaner to opt into distributability by /conforming the actor to a protocol/, which is how we add constraints/capabilities to types. We use "well known protocols" in other places for capabilities like this. Conforming an actor to the DistributableActor protocol seems a simple and clear way to enable this functionality.

I agree with you that internals of the runtime aren't important for an evolution proposal, but you're adding all of this so we can have additional flexibility and capabilities down the line. These aren't runtime concerns, they are aspects of the programming model you're proposing.

Just to clarify, I'm also not suggesting we add the complicated generics feature, and I'm not suggesting that you get rid of the DistributableActor protocol. I'm suggesting you unconditionally conform DistributableActor to Codable.

Rationale: such a design would be simple and cover the majority case. You've expanded the design surface area to cover more cases, but at the cost of additional complexity. Is this one potential use-case really a show-stopper that justifies the complexity?

If you think you've got the right design point, it would be great to dive into your rationale, perhaps in the "alternatives considered" section.

No, as I mentioned in my post above, I'm suggesting a completely different design than you're discussing in that section, thank you for diving into it:

I can sympathize :-), but I think it is also important to take this consistent feedback as a signal that it would be high value to fix this. I'm not sure what others have suggested about this (besides complaining about verbosity), but I'm suggesting a design/implementation approach which is different than what you covered in the alternatives section. As such, it wasn't clear to me that you explored this approach.

Also, as I mention below, I personally don't agree with your viewpoint here. Furthermore, I'm not just complaining about the decl modifier. The bigger concern I have is that you are unnecessarily complicating the design and making distributed actors behave unlike existing actors. The declmodifier is the "symptom" of the problem, not the core problem itself. I'm concerned about more than just ergonomics.

I'm not an expert on how this works for normal actors, but I wouldn't be surprised if this is how the equivalent async thunk generation works. I'd put that burden lazily on the caller side, not eagerly on the implementation side.

Regardless, your objection applies equally well to the existing accepted and shipping design for actors. I don't see that "code size" is a strong motivator to bifurcate the language design. There are lots of ways to combat the code size problems a trivial implementation would have - I'd suggest a fancy "more dynamic" implementation of this mechanism rather than just inlining all the codable stuff.

As I mention above, this is in fact "not" required.

Agreed: my proposed approach is consistent with how sendable checking works, which has exactly the same issue for exactly the same reasons. We discussed all this back in the original actor design process and agreed that the QoI is really easy to handle: diagnose that the method isn't callable because the type isn't codable.

I don't see your concern here as being different from what we've already accepted for core actors, and certainly not enough to bifurcate the language model.

This sounds quite dependent on the implementation model, but isn't the natural thing to tie into just standard access control? After all, this is how actors (and all other types) work, and keeping this consistent in the language is better than adding a different behavior tied to a declmodifier.

Swift has exactly the same sorts of concerns with dynamically loadable libraries, with packages built by 3rd parties that are incorporated into other apps etc. I don't see anything specific to distributed actors in your concern, nor do I see a rote declmodifier as an effective security apparatus! The compiler will just be telling people to add this, and their brains will tune it out.

I do think that security is extremely important for this feature, but I think that it deserves proper consideration - I don't think the declmodifier "solves" the problem in any way. We need something like the API audit tools, runtime features, etc to help with this. I think it would be great to have a specific proposal to talk about this.

My point is that it is beneficial to align the language model and the runtime model, this is what we discussed in the base actors proposal.

Ok, I was confused about what "local" meant. I thought you meant it was local to the machine that invokes the initializer, but I think now you mean it happens on the machine that holds the instance. Independent of the language feature, I think that clarifying the terminology here is really important for people to understand this.

It sounds like codability checks and transports are happening on the initializer arguments. That model makes sense to me, but I didn't get that from the proposal (probably my fault).

It is challenging, but it would be helpful to understand the longer term view when evaluating immediate stepping stone proposals like this. It will help us keep the language consistent over time, instead of ending up on a random walk.

Thank you for all your work on this @ktoso!

-Chris

Joannis_Orlandos · December 15, 2021, 2:40pm

I love the idea of Distributed Actors. They can provide a tonne of convenience and reduce a lot of complexity in designing distributed systems. But I really want to join @Chris_Lattner3 's statement in that the longer term view(s) really would help to understand a proposal such as this.

As an example, you mentioned that there's no support for transferring state elsewhere. While that's not something I've admittedly thought about, it's really good to underline and rationalise these design decisions. Because I could've very easily expected this to be part of a distributed system.

A cluster, which you do briefly mention in a bit of pseudocode, without the ability to transfer the lead would be problematic. But a game server that I set up with two friends usually doesn't require me to transfer the lead when I go offline. These two systems are not comparable, and I think this, alongside many other details, casts a bit of fog over the whole picture. So I feel that it makes reviewing this quite a bit harder.

Codable
While I strongly advocate for distributed actors to support Codable, I can see the perspective of not supporting Codable as well. I have no hard feeling either way, so long as Codable is supported at a minimum.

Keywords
I really don't like the idea of omitting the distributed keyword. The code reviewability is a really strong argument for the keyword. Realistically, if a function is implicitly distributed, most developers won't think twice about this, and will rarely or never add a local keyword.

API Versioning
As you mentioned, adding new parameters to APIs can be quite a painful transition. If not all members are updated simultaneously, especially in client-side apps, I can foresee some trouble maintaining these implementations.

I think it's wise to put some specific attention towards that. Since it's going to become very common once/where distributed actors are adopted. I honestly don't think we can find an 'ideal' solution for this, but I think it should at least be researched and documented as part of distributed actors. Maybe we can version our clients and individual distributed funcs, with a minimum and maximum API level.
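To make the "minimum and maximum API level" idea a bit more tangible, here is a minimal sketch. Everything in it (the `VersionedEndpoint` type, the endpoint names and the level numbers) is hypothetical and not part of the proposal; it only shows what a per-func version range check could look like.

```swift
// Hypothetical: each remotely callable function advertises the API levels it supports.
struct VersionedEndpoint {
    let name: String
    let supportedLevels: ClosedRange<Int> // minimum...maximum API level

    // A client is accepted only if its API level falls inside the range.
    func accepts(clientLevel: Int) -> Bool {
        supportedLevels.contains(clientLevel)
    }
}

let move = VersionedEndpoint(name: "move", supportedLevels: 1...3)
let trade = VersionedEndpoint(name: "trade", supportedLevels: 2...3)

print(move.accepts(clientLevel: 1))  // true
print(trade.accepts(clientLevel: 1)) // false: a level-1 client predates `trade`
```

Where such a check would live (transport, actor system, or generated code) is exactly the kind of thing that would need to be researched.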

Access Control
Not all clients are equal, especially in a game environment. Now I realise this is primarily targeted at server-side development, but there are a lot of use cases for iOS and macOS. I think a large portion of this can be implemented on the transport level, but I don't think that's sufficient when running business logic. I wouldn't want a random player executing "admin commands" on my server's save game, by imitating my username.

State Transfer
Depending on your use case, transferring state in some form is a basic requirement. In most use cases that I can think of, I'd likely want this state to be recorded regardless. Most states aren't solely ephemeral. So I think storing/resuming state, and transferring state, is something to be at least considered. With a local actor, I can put all of my state in a single Codable property that I write after mutations. For a distributed actor, this won't be as simple as with a 'local' actor.

Overall
I don't see many problems with the discussed APIs, statements and solutions. I do think a few main topics to research/clarify are: Access Control, State Transfer (or the lack thereof) and migrating APIs.

ktoso · December 17, 2021, 12:53pm

Sorry for the delay in replying here! Back to it with full attention now though.

I use those terms to mean very specific things, but you're right the proposal didn't define them, that's my bad and fair to add the definitions (I'm happy to add explicit definitions, proposed clarifications here).

I did define it more in the talk ([Video] Distributed Actors announced at Scale by the Bay), but it should not be required to watch this and the proposal should define everything.

Let's define it here and I can add clarifications to the proposal text if necessary:

a distributed actor is what we interact with in the source language all the time
a "local" distributed actor, means that while in source we treat it as distributed, we actually have a local instance in hand;
- we don't know if we have a local instance in hand (e.g. in a variable) unless we use the whenLocal facilities
- this is what we talk about when we say "known to be local distributed actor" but the short form of that is just "local distributed actor"
a "remote" distributed actor, is the "proxy", it has no storage allocated other than for the id and actorSystem
- we don't know if we have a remote reference in hand (e.g. in a variable) unless we use the whenLocal facilities

I don't view the "distributed local" phrasing as weird; it is just that distributed means either of two instance types at runtime: local, or remote; While at compile time we concern ourselves mostly with the fact that an actor is distributed, so we don't know and assume the remote case. It would be wrong to call such actors remote though - they may or may not be such.

I was referring to the issue that not all distributable actor instances are remote. If they are local and in the same address space, then you avoid all the coding overhead etc, and they are not "distributed" at all.

Sure, but statically, you don't know (except the "breaking through location transparency" mechanisms) and are not supposed to know if you're programming with a local or remote one.

This is a core principle of programming in such systems: I write my algorithm against "some worker" and when I run it in my tests, all the workers are actually "local" but when I actually run it in the target environment, they are all (or just some) remote actors. We program "with distributed actors", and "remote/local" is a runtime property, that for what it's worth can even change depending on configuration of the system etc.

The fact that from the perspective of one process a specific actor is never actually remote, does not make it any less distributed: distribution means that we could have shared it with other nodes/processes, and therefore they may attempt to invoke things on it, so the distribution aspect always matters really. It's truly better to always think "distributed" when working with these types.

It's true, the distributed marker means a call can be remotely accessed. But that's how we specifically use those words: distributed is the "i don't know if local or remote", and the remote word is reserved for "i know it is remote". This is not unlike other uses of this term in many other distributed actor implementations out there.

I don't want to dwell on the naming too much right now, and let's move on to the more specific design questions, but the last thing to note is that if we were to follow that "-able" logic, we should have named async asyncable, because an async func may not ever suspend or do anything asynchronous at all.

This is the same as async means "may be asynchronous, so treat it as such (but you may get lucky and it was quick to return synchronously after all)", and distributed means "may be remote, so treat it as such (but you may get lucky and maybe it was local after all)".
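As a small illustration of the async half of that analogy, here is a minimal sketch. The `lookup` helper and its cache are hypothetical and exist only for this example; the point is that the call site has to be written against what the signature says, not against what happens at runtime.

```swift
// Hypothetical helper: declared `async`, yet its body never awaits anything.
func lookup(_ key: String, in cache: [String: Int]) async -> Int? {
    return cache[key]
}

func demo() async {
    // The call site must still say `await`, because the signature only promises
    // "may suspend", not "will suspend".
    let value = await lookup("answer", in: ["answer": 42])
    print(value ?? -1)
}
```

A distributed func call reads the same way: `try await` is required whether or not the target turns out to be local.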

Let's continue to other questions though, there's a lot here other than naming to look through still:

This again is a very meaningful distinction and comes back to not allowing to break the isolation model.

If we said: actor Worker: DistributedActor were the way to mark these and implement this with normal Swift rules, that it's an Actor & DistributedActor now, we can end up in the following isolation violation:

extension Actor {
  func f() -> SomethingSendable { ... }
}
// and then...
func g<A: Actor>(a: A) async {
  print(await a.f())
}
// and then...
actor MA: DistributedActor { // : Actor implicitly, because `actor semantics`
}
func h(ma: MA) async {
  await g(a: ma) // allowed because a MA is an Actor, but can't actually work at runtime
}

// there's other examples that end up showing the same pattern though

So... it really isn't an Actor. It is a DistributedActor. They share implementation details (specifically, the "distributed local actor" has the same internal runtime representation as an Actor), but for isolation-checking purposes they are not related.

This is also why we'd have protocol Actor: AnyActor {} and protocol DistributedActor: AnyActor {} if we want to keep the same top-level parent protocol type for both.

We can of course argue about keyword vs. "special magic protocol that removes the Actor conformance" etc, but to me that seems rather hacky and unprecedented; whereas acknowledging that they're not the same, by declaring them differently, feels closer to reality. (We did look into these relationships for a long time; DA refining A, or the other way around, does not make sense because of the conflicting demands on isolation checking.)

I'm not sure how to address this other than "it would be one mega proposal" which we found is not stomach-able by reviewers, thus the split of "isolation" (this) and "runtime" proposals.

Totally agreed that it would be a simpler model, and in clustering indeed that's what worked well for us. But as I said: this serves a specific request we received from xpc and security teams involved in reviewing this work. We found a few things that are not so great today and we'd like to utilize this effort to improve them. Specifically, this lifts the prevention of API misuse into the type system: the actor types are simply not Codable, and thus cannot be sent to other distributed actors "by accident".

Agreed that adding this as alternative considered makes sense, I'll add that to my "additions" PR in the morning and notify here. I think my writeup from the previous post could be used for that section.

Sure, that's how async thunks work, but... we have no knowledge of caller-side at all at compilation time of a distributed actor (!).

The purpose of distributed actors is to cross process boundaries. A "service" will be compiled entirely separately from what "clients" might end up calling it, so in this case we're crossing module boundaries, and anything public has to become implicitly distributed (in a distributed-keyword-less design). That by itself is troublesome IMHO, because we won't notice we forgot to make an API actually callable because we forgot to make some parameter Codable, for example. We perhaps would not even notice in our testing, if we kept using "known to be local" actors in our tests (though today this is not possible, but the introduction of local would cause that).

This was the trivial scenario though. The more interesting and very important one is peer-to-peer systems where we are not crossing module boundaries. I might declare a distributed actor with an internal func futureFeature() and I'm not using it in my app in version 1. The feature though exists and is perhaps even implemented. If the func was not used... you'd propose to optimize away emitting the thunks for it - so we cannot invoke it. Now, we roll out version 2 of our app and we're announcing the "futureFeature" and actually even v1 processes can support it, it was just some "from December 24th, download an update and it unlocks the feature!" The new version of the same app knows that it has the func futureFeature() and we know it is implemented and ready; distributed isolation under this implicit model allows us to call this func... and yet... the v1 will never invoke the target, because we're missing the thunks to do so because the func was "never used" before! This is a terrible pitfall: we allowed compiling things which look like they would work, and they should work, and yet they won't.
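A toy model of that pitfall, as a hedged sketch: the `Receiver` type, the string-keyed table and the closures below are all hypothetical stand-ins, not how the actual runtime stores or finds thunks. It only shows why a receiver cannot serve a call for which no thunk was emitted when it was compiled.

```swift
// Hypothetical model of a receiving process: incoming invocations are looked
// up by name in a table of thunks that was fixed at compile time.
struct Receiver {
    let thunks: [String: () -> String]

    func handle(_ target: String) -> String {
        guard let thunk = thunks[target] else {
            return "error: no thunk for \(target)"
        }
        return thunk()
    }
}

// v1 was compiled with thunks only for functions that had callers at the time.
let v1 = Receiver(thunks: ["greet": { "hello" }])
// v2 was compiled after `futureFeature` gained a caller.
let v2 = Receiver(thunks: ["greet": { "hello" }, "futureFeature": { "unlocked" }])

print(v2.handle("futureFeature")) // a v2 peer can receive the call
print(v1.handle("futureFeature")) // a v1 peer cannot, although the func exists in its source
```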

So... to avoid such nightmarish pitfalls (as if there weren't enough pain points with evolving distributed protocols) we're forced into emitting thunks for every single func of a distributed actor. This isn't something we can work around in the implicit model. The only solution is "developers must mark everything as local that definitely is not distributed", and approaching the annotation problem from this side is just putting the cart before the horse really. No-one would go around auditing their code and adding local to all functions "just to be safe", while there is a lot of incentive to mark methods distributed because if I forget to do so, what I'm working on right now won't work at all, even in my tests, so I am very much led to do the right thing here. A nice bonus is getting the codability checks on declaration sites too, but that's just the cherry on top of the right semantic model.

I hope this example explains why we'd be forced into emitting far too much metadata and thunks.

The difference is quite large because we're talking about separate processes talking to one another, without any prior knowledge of each other, other than the type declarations. The example above goes through this step by step, but to summarize:

This isn't about the type-checking, this is about the ability of the callee to actually receive and invoke an incoming invocation. In the implicit distributed func model, we are forced to be pessimistic about it because everything might be invoked. It is not a desirable design from either the developer-intent or the metadata-footprint perspective.

This isn't as much about "the declmodifier solves it", as it is about "an opt-in model is by default safer, smaller in footprint, and expresses developer intent much cleaner".

Absolutely agreed on tools; and we'd build those on top of the emitted metadata; this metadata exists regardless of which (implicit / "opt-out" or explicit / "opt-in") model is adopted. I will say though, it is much easier to audit a set of explicitly made distributed methods rather than "well, basically every func of any distributed actor".

I hope the definitions in the beginning of this reply help a bit, but we can dive deeper into this because I sense there still is a misunderstanding here:

There are no SerializationRequirement checks applied to initializer arguments. Only the usual Sendability checks that the Actor Initializers: Sendability proposes in the under-review proposal right now.

To illustrate this with an example:

struct DatabaseConnection: Sendable {...} // NOT Codable
distributed actor Worker { 
  let db: DatabaseConnection
  init(db: DatabaseConnection, system: ActorSystem) { self.db = db }
}

We absolutely want to be able to have distributed workers which accept and store not codable arguments in their initializers. An initializer is always "local" in the sense that the process that executes Worker(...) is where that instance resides. There is no "initialize a worker on another node" performed by initializers.

We could, though, if we wanted to, have an actor that accepts such "give me a Worker" requests (from remote peers), and returns a worker, like this:

distributed actor Service { 
  let db: DatabaseConnection
  // ...
  distributed func getWorker(id: ...) -> Worker { 
    self.cachedWorkers[id] ?? Worker(db: db, system: self.actorSystem)
    // simplified, we'd store the new one etc...
  }
}

// === meanwhile on a different process -----
actor Logic { // anything really
  let service: Service
  func calculate(...) async throws {
    // oh no, heavy calculation, let's get a remote worker
    let worker = try await service.getWorker(id: ...)
    try await worker.work(...) // worker is remote in our example
  }
}

So, we're able to "create" a worker on the remote side on demand, but with collaboration of a Service that creates them however it sees fit. And most importantly, the workers have non-serializable internal state -- we never ship the state to the Logic on the client; the client only got a remote reference to the worker after all.

Thank you very much for diving into all these topics! It is excellent to bounce those ideas and I'm sure we'll get to a great design we're all comfortable with

ktoso · December 17, 2021, 1:16pm

Thanks for the review @Joannis_Orlandos!

Yup, that is covered by the current design. Codable is supported, and equally we can support any other serialization mechanism. Codable may not be the best thing for every scenario, but we're very aware many Swift developers are very familiar with it, and its ease of use in Swift is very valuable

I suspect most implementations will lean on Codable, unless they have specific performance or serialization format requirements it is not meeting today. To that end, we're also interested in improving Codable and the future of Serialization in Swift (though we've been a bit slow to pick up the topic, see thread from a while back: Serialization in Swift).

Right yeah, thanks for chiming in to confirm this aspect of the "implicit" counter-proposal.

Yes, API evolution is tremendously important but I didn't want to dive into it during this proposal which focused only on isolation aspects. We will keep working on this aspect of distributed actors and more features are to be expected to solve this piece more and more in the future.

You may enjoy reading about one aspect of API (and ABI) evolution features we're interested in adding here that I posted in the "proposal #2" that covers the runtime aspects of the distributed actor system: Distributed Actor Runtime: Stable names and more API evolution features. Feel free to use this pitch thread [Pitch] Distributed Actor Runtime to discuss this aspect of the design more

There also exists some prior art with specific versions attached to distributed actors in Orleans (Grain Interface Versioning | Microsoft Orleans Documentation) though this mechanism may end up being hard to implement in the "general" sense in the language (it heavily relies on EXACTLY knowing what the exact cluster implementation does). But I'd be interested in exploring this direction as well.

I understand the request, but again, this isn't something for the current review about actor isolation.

I can let on a sneak peek that we're interested in annotating things with some form of "this func needs some permission" but this isn't really designed... I would want this to not be a distributed actor specific feature but only a way to annotate functions and for their invocation to check (authorize) based on a value in task local storage. We haven't designed this yet, and not sure when we will, but it definitely is high on the list of wanted features.
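Since nothing here is designed yet, the following is only a rough sketch of what "authorize based on a value in task local storage" can look like with today's `@TaskLocal`. The `Permission`, `Auth`, `requireAdmin` and `resetSaveGame` names are hypothetical and made up for this example; there is no annotation involved, just a manual check at the top of the function.

```swift
enum Permission { case player, admin }

enum Auth {
    // Hypothetical task-local carrying the caller's permission; defaults to the
    // least privileged value when nothing was bound.
    @TaskLocal static var current: Permission = .player
}

struct NotAuthorized: Error {}

// Throws unless the current task was bound to the admin permission.
func requireAdmin() throws {
    guard Auth.current == .admin else { throw NotAuthorized() }
}

func resetSaveGame() throws -> String {
    try requireAdmin()
    return "save game reset"
}

// A transport would bind the permission after authenticating the caller,
// for example: try Auth.$current.withValue(.admin) { try resetSaveGame() }
```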

To clarify, because "state transfer" is a bit of a too general term.

The language feature distributed actors does not "magically move this alive actor to another instance, along with its state" without some external mechanisms. A lot of this is achievable with protocols and libraries that operate on distributed actors. You're right that such "sharding" / "balancing" is a killer feature of such runtimes, but it is solved at the library level as it can vary wildly how this is achieved. If you're curious to read up how other runtimes do this, the keywords are "Akka cluster sharding and persistence" and "Orleans Stateful Grains".

The language design here should enable stepping up and implementing such things in libraries in Swift

It's exactly the same really

Since you're implementing that actor's logic... feel free to invoke some db.persist(state) after mutating it. You can also restore such state during an asynchronous initializer even (init(...) { self.state = db.load(State.self, for: self.id) }).

There are some exciting framework opportunities (just in open source packages really) to work on these things, but we won't be proposing such into the language I think, as I don't think we really need anything special here.
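A minimal sketch of that pattern, written with a plain actor so that it runs without any actor system. The `StateStore`, `CounterState` and `Counter` types are hypothetical, and the in-memory store merely stands in for a real database client.

```swift
import Foundation

struct CounterState: Codable, Equatable {
    var count = 0
}

// Hypothetical stand-in for a database: keeps the encoded state per id.
actor StateStore {
    private var blobs: [String: Data] = [:]

    func persist(_ state: CounterState, for id: String) throws {
        blobs[id] = try JSONEncoder().encode(state)
    }

    func load(for id: String) throws -> CounterState? {
        guard let data = blobs[id] else { return nil }
        return try JSONDecoder().decode(CounterState.self, from: data)
    }
}

actor Counter {
    let id: String
    let store: StateStore
    private(set) var state: CounterState

    // Restore previously persisted state, if any, in an async initializer.
    init(id: String, store: StateStore) async throws {
        self.id = id
        self.store = store
        self.state = try await store.load(for: id) ?? CounterState()
    }

    func increment() async throws -> Int {
        state.count += 1
        try await store.persist(state, for: id) // write after every mutation
        return state.count
    }
}
```

Creating a second `Counter` with the same id and store picks up where the first one left off, which is the "store/resume" half of the question; moving an actor between nodes would be built on top of something like this in a library.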

Joannis_Orlandos · December 17, 2021, 1:27pm

Hey! Thanks for the reply. I think the main thing for me to clarify is that I don't expect any of my use cases to be language features. But I would like to highlight them, because I would hate for some API design decisions to spawn a 'closed' door that would prevent such features from being implemented later down the line. That being said, I don't see any closed doors spawned here. But I also don't think I would spot one without a clear picture of the 'grand plan'.

Over all, I think we're on the same page here. And I'm strongly in favour of adding this.

ktoso · December 17, 2021, 1:32pm

Right, and thanks for speaking up about the envisioned use cases! So far at least I definitely am thinking about each of the things you have mentioned. Thanks again!

bobergj · December 20, 2021, 4:36pm

Thank you for clarifying that.

It's just that I would expect the proposal to be mostly self-contained. Also, while I understand that many of the details will be filled in by the next proposal, I feel like this should have been proposal number two, and there should have been a first proposal with the proposed programming model - without going into syntactic details.

With regards to the programming model, so far with this proposal, we've seen that the distributed method is the primitive that two distributed actors use to communicate. But there's also an example, under Future Directions, of receiving a message with onReceiveMessage.
There's a mention of Erlang and Elixir in Acknowledgments. In Erlang the primitive operations are sending a message, and receiving a message. In BEAM there are also monitors for monitoring a remote or local process. Those primitives allow expressing various communication patterns, not just request-response, but also, for example, single-request-stream-of-responses and cast (don't care about response).
Am I missing something here, and such communication patterns can also be realized with a distributed method?
In the non-distributed actor "toolbox" we have AsyncSequence and AsyncStream, but at first sight I don't think those would be applicable to distributed actors.

ktoso · December 20, 2021, 10:07pm

We worked with the core team on staging these proposals and arrived on this order. The first pitch contained "everything" and was far too much to take in for reviewers. I don't think we can propose just "the programming model" without discussing actual syntax to be honest.

I do feel however this proposal is self contained, and we're actually not discussing anything it focuses on, which is a bit disappointing... There are a lot of details about protocol conformances and how distributed isolation affects those we could dig into here; none of those really relate to runtime, and the split of runtime and compile time checks does make sense to me to be honest. This proposal enables any kind of programming model and runtime to be proposed formally next after all.

Continuing with comments regarding runtime aspects though.

This need not be part of the language per se and is to be implemented in libraries, which is why it is not going to be covered here (and definitely not in the isolation proposal for what it's worth).

I explain how this is implemented in the cluster announcement: [Video] Distributed Actors announced at Scale by the Bay and you're free to review the implementation in the cluster reference actor system: https://github.com/apple/swift-distributed-actors/blob/main/Sources/DistributedActors/LifecycleMonitoring/LifecycleWatch.swift and how it uses SWIM for the monitoring nodes.

This isn't a "receive block", it's just a local method called like that.

There is no "receive block/loop" being proposed in our model. (We had this for a long time, while we didn't have any language support for making the calls, and decided to simplify the model and offer distributed methods).

This isn't unheard of, exposing a simple "method based" programming model is what made Orleans successful.

Again, this is jumping far ahead and not related to actor isolation... We can take this to the Distributed Actors - Swift Forums forums category if you want to discuss streams and distributed actors.

The short answer is: yes, this is doable and I've built just that in my "previous life", having worked on reactive streams which combine/async-sequence is based on conceptually, as well as akka-streams. See here for the documentation of the feature I implemented over there: StreamRefs - Reactive Streams over the network • Akka Documentation - a stream ref is in Akka terms "send a (reactive stream) endpoint to other distributed actor". In Swift it'd probably look a bit simpler, since we have distributed methods and "just" AsyncSequence, but the concept remains the same.

Again, this is something I'm keenly aware of, but we can survive just fine without it right now.

The concept of "send" also known as "don't wait" sending messages to actors is coming up very frequently in various proposals, even on local actors. It is another future direction to express "oneway" or "send" things for distributed actors, but again, this does not matter for the isolation aspect of the design.

hassila · December 21, 2021, 8:44am

What is your evaluation of the proposal?

Overall +1 - but as with Concurrency, it is somewhat challenging to get the full picture of these bigger pieces.

Is the problem being addressed significant enough to warrant a change to Swift?

Yes. Making implementation of more complex distributed systems a lot easier is a big win for server-side Swift.

Does this proposal fit well with the feel and direction of Swift?

Feature wise yes.

Not having a strong opinion on whether to use decl modifiers or protocol conformances for the actor, but would point out that we would need decl modifiers for the functions (and properties) that should be accessible remotely, so feels reasonable with the decl modifier.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I've used NeXT PDO extensively which shares many similarities.

I've also later been part of implementing custom messaging solutions for two separate systems with similar functionality since then.

In general, I think this proposal (in conjunction with the others) has many nice features and generalises a lot of the problem domain in a nice way.

I do miss 'oneway' capability of message passing and more work on API evolution / compatibility checks, but there is a fair amount of discussion there in Future Directions and obviously the team is aware of those problem areas.

I am slightly concerned about performance in general, but haven't had the time to prototype and verify/test how it looks.

I really like the care that has been taken to support multiple actor systems and the related functionality with what looks to be nice hooks for custom implementations as well as the convenient module-wide defaults and synthesised initialisers that remove a lot of boilerplate.

To have built-in support for Codable, but being able to scale up to more specialised and efficient serialisation strategies is also very welcome - allows for quick prototyping and to spend time there only when required (and will surely be good enough for a large majority of cases).

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I've first skimmed through the proposal when originally posted, and now spent two hours reading through it fairly carefully.

Some additional comments:

Complete isolation of state

I am not convinced that the complete isolation of state is really required with the argument that it may be unexpected network round trips and would like to lift that restriction.

The try await marking required to access such properties is a fairly big warning that this is an expensive operation. I understand the desire to try to help users to not shoot themselves in the foot here, but I think also it is a matter of convenience (not having to write boilerplate distributed accessor functions) and makes the transition between pure local actors and distributed ones easier - and consistent with distributed computed properties.

I'd argue that it might be better to focus on appropriate analytics (like introducing span tracing) so users understand the behaviour instead - as well as documenting best practices in a performance guide, with e.g. batched accessors - it'll surely be a useful thing for more scenarios. Also as mentioned in the pitch, IDEs can provide guidance if desired. At the end of the day, regardless of how transparent and packaged distributed actors are, you must take the distributed nature of the beast into consideration if you want to achieve good performance.

Minor typos that can be found with search/replace

"nonisoalted async throws isHuman" -> nonisolated ....
"typealias on the actort" -> ... actor
"form of best effrot" -> ... effort
"nonisoalted func may be used" -> non isolated ...
"protocols need not be so struct" -> ... strict
"does to existential types not conforming" -> due ...
"ditributed method calls" -> distributed ...

Chris_Lattner3 · December 22, 2021, 9:50pm

No worries, I also have other things going on. Happy holidays btw!

Sure, but as this is a formal review thread, naming is quite important.

I'm not sure how you see that. async isn't a type modifier, it is a decl modifier that acts as an adjective when applied to func and to function types.

Here we are talking about something that is fundamentally a capability of a type, and which is protocol'y. Swift pervasively uses the 'able' suffix for those things. As I mentioned, there is no need for an attribute or decl modifier here, using the protocol itself is sufficient.

Wait, you're making the claim that your distributed actors are not Actors, that they do not conform to the Actor protocol? How could that possibly work? This wasn't clear to me at all in the proposal, and this seems like it is going to cause a huge fork in the swift type system.

Regardless of whether this is a keyword or protocol, this seems like a quite significant issue.

I can sympathize, but from a procedural perspective, you are driving a very complicated proposal through swift evolution without motivation for the complexity. Splitting into a simpler base proposal, and then justifying the complexity in a follow on proposal makes sense to me.

As you know, language design work is all about balancing tradeoffs, and it is very difficult to understand the complexity impact when considered with a shifting base proposal. Splitting it out to its own standalone design would be best. If that is impractical, you could expand this proposal with a big "alternatives considered" section that explains and motivates the critical use cases, why the complexity is necessary, etc., but that ends up being akin to a proposal within a proposal.

Right, I understand this. My point is that this is fundamentally the same as shared libraries that have separate compilation. There isn't a new point in the design space, so we should keep the design consistent with what we already have.

I don't find your arguments in the post convincing, but I don't have anything to add other than what I already stated upthread. My macro concern is that you're introducing a huge amount of language complexity for a niche feature. I'm a huge fan of distributed actors, but the more bespoke complexity you add to support them, the less appealing the proposal is.

I personally don't support this version of the proposal given the amount of complexity you're adding to the language, type system, and standard library.

-Chris

hassila · December 23, 2021, 5:36am

If using a protocol marker for the actor to make it distributable as suggested, would distributed funcs still be marked according to proposal, or something else?

hassila · December 28, 2021, 9:08am

Just a short comment on this regardless as I'm currently reading through the actor runtime proposal.

Even though it's understandable why the nomenclature has ended up where it is now, I agree it is fairly confusing reading things like "remote distributed actor" or "local distributed actor" (even if I understand what is meant, these seemingly conflicting words make it harder to understand and cause some cognitive dissonance).

It is bike shedding, but as mentioned, naming is quite important. It might be a bit late, but perhaps it'd be better to avoid distributed & distributable altogether, and I would suggest considering location transparency as a way to describe this instead.

transparent actor myActor {
or
actor myActor : LocationTransparent {
(if using protocol annotation instead as suggested)

transparent func myFunc()
or possibly
@locationTransparent func myFunc()

I'm not a language lawyer really, so I'm just trying to convey the general idea: moving from distributed & distributable to location transparency as the describing language would help things read better and be easier to understand.

Then the example above would spell "remote [location] transparent actor" or "local [location] transparent actor" which at least to me causes less dissonance.

xwu · December 28, 2021, 11:19am

I'm late to the review as it's quite a hefty document to go through.

I do think the issue of distributed actors is a significant enough topic to warrant language-level support and overall support the proposal as it stands. I read the original pitch and made notes that I never had a chance to polish up for posting, but looking back at those notes on studying this revision I see that essentially all my prior concerns have been addressed and quite thoughtfully.

With respect to the review comments that precede me, I do feel that explicit marking of distributed functions is important enough to warrant the design as proposed: it is on the same level as nonisolated (and indeed mutually exclusive with it) in terms of affecting what can be written inside a function. Moreover, the requirement for explicit marking is of-a-kind with Swift's existing design that defaults to internal access while requiring public to be explicit. I am totally on board with the explanation that distributed is a form of "exposure" just like public but on an axis that's distinct from access control. Therefore, I agree with the treatment proposed.

Furthermore, I agree with the proposal in just sticking with "distributed" as the adjective for the overall feature. Given the primacy of location transparency in the design, I can readily understand viewing local distributed actors as "trivial cases" of (self-)distribution, with the terminology emphasizing that all are treated statically as distributed regardless of whether they're local or remote.

I do agree with @Chris_Lattner3 in his feedback, however, that a distributed actor being not-an-actor, with DistributedActor refining AnyActor and not Actor, is quite confusing. Some more revision here may well be warranted.
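
To make the hierarchy concern concrete, here is a minimal sketch using plain protocols as stand-ins. The names AnyActorLike, ActorLike, DistributedActorLike, LocalCounter and RemoteGreeter are hypothetical and only mirror the shape being discussed (two sibling protocols refining a common root); they are not the real compiler-known protocols.

```swift
// Hypothetical stand-ins; the real AnyActor, Actor and DistributedActor
// protocols are compiler-known and are not redefined here.
protocol AnyActorLike: AnyObject {}
protocol ActorLike: AnyActorLike {}
protocol DistributedActorLike: AnyActorLike {}

final class LocalCounter: ActorLike {}
final class RemoteGreeter: DistributedActorLike {}

// Generic code written against the common root accepts both kinds.
func acceptsAnyActor<A: AnyActorLike>(_ value: A) -> Bool { true }

// Generic code written against ActorLike accepts only the first kind.
func acceptsActor<A: ActorLike>(_ value: A) -> Bool { true }

print(acceptsAnyActor(LocalCounter()))
print(acceptsAnyActor(RemoteGreeter()))
print(acceptsActor(LocalCounter()))
// acceptsActor(RemoteGreeter())  // does not compile: RemoteGreeter is not ActorLike
```

Under this shape, any existing API constrained to the actor protocol would reject distributed actors, which is roughly the "fork" being worried about.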

A few specific points on this proposal rather than the overall feature:

associatedtype ActorSystem is constrained to types that conform to DistributedActorSystem, yet the document claims that the requirement can be satisfied with an "existential actor system"—but existential types cannot conform to any protocols, so how?
Why is nonisolated var actorSystem of existential type DistributedActorSystem rather than the concrete associated type ActorSystem?
The proposal now allows distributed get-only properties, yet prohibits distributed get-only subscripts in part with the rationale that it has the "same problems as properties"—so, it would seem logical to allow these also?
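
The first point can be illustrated with a small sketch, assuming Swift 5.6 or later for the `any` spelling. ActorSystemLike, HasSystem, LocalSystem and AnySystem are hypothetical names used only for illustration and are not the proposal's API. An associated type constrained to a protocol cannot be witnessed by that protocol's existential, so some concrete type, such as a type-erasing wrapper, has to stand in.

```swift
// Hypothetical stand-in for a protocol such as DistributedActorSystem.
protocol ActorSystemLike {
    func name() -> String
}

struct LocalSystem: ActorSystemLike {
    func name() -> String { "local" }
}

// An associated type constrained to the protocol, as in the first point.
protocol HasSystem {
    associatedtype System: ActorSystemLike
    var system: System { get }
}

// A concrete witness satisfies the constraint.
struct Worker: HasSystem {
    let system: LocalSystem
}

// The existential does not conform to its own protocol, so this is rejected:
// struct Broken: HasSystem {
//     let system: any ActorSystemLike
// }

// One workaround (an assumption, not the proposal's answer): a concrete
// type-erasing wrapper that itself conforms.
struct AnySystem: ActorSystemLike {
    private let base: any ActorSystemLike
    init(_ base: any ActorSystemLike) { self.base = base }
    func name() -> String { base.name() }
}

struct ErasedWorker: HasSystem {
    let system: AnySystem
}

print(ErasedWorker(system: AnySystem(LocalSystem())).system.name())  // prints "local"
```

Whether the proposal intends a wrapper like this, or special compiler treatment, is exactly what the question asks.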