[Pitch] Distributed Actors

ktoso · September 2, 2021, 9:53pm

I don't feel like it's necessary to use up a name here. All that matters is that an ActorTransport is provided; we don't really care how you name it, and it could be useful for you to use different names:

let devCluster = Cluster(...) // : ActorTransport
Worker(cluster: devCluster)

etc.

Can you provide an example of what you mean? It has to be passed at initialization time, there's no real way around the fact unless you're hardcoding a specific transport.

There is a potential future direction in which we may want to allow declaring the property as

distributed actor DA {
  nonisolated let transport = HardcodedDontDoThisPleaseThough() 
}

That is a bad-idea™ though, as it makes it hard to swap the transport for a different one in testing. It is also difficult to share a nicely configured instance this way without resorting to globals. So this isn't really a good idea, even if we were to allow it in the future.
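To make the testing argument concrete, here is a minimal sketch in plain Swift. It deliberately avoids the `distributed` keyword so it compiles on a regular toolchain, and `Transport`, `RecordingTransport`, and `Worker` are hypothetical stand-ins, not the pitched ActorTransport API.

```swift
// Hypothetical stand-in for a transport; NOT the pitched ActorTransport API.
protocol Transport: AnyObject {
    func send(_ message: String)
}

// A test double that records what was sent instead of touching the network.
final class RecordingTransport: Transport {
    private(set) var sent: [String] = []
    func send(_ message: String) { sent.append(message) }
}

// The transport is injected at initialization time, so tests can swap it.
final class Worker {
    private let transport: Transport
    init(transport: Transport) { self.transport = transport }

    func report(_ status: String) {
        transport.send("status: \(status)")
    }
}

let testTransport = RecordingTransport()
let worker = Worker(transport: testTransport)
worker.report("idle")
```

Because `Worker` only sees the protocol, a test can pass in `RecordingTransport` and inspect `sent`, which would not be possible if the transport were hardcoded inside the type.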

ktoso · September 2, 2021, 10:24pm

That's another reason we rely on the accompanying source-gen today -- you can go wild there, literally zero copy if you really wanted to -- but then your actors would need to accept whatever serialization buffers or types you'd need to achieve that. You'd be sidestepping Codable entirely as well.

If/when we pursue the Envelope<> synthesis without source generation, a zero-copy approach might become more difficult. But you can trust that we're very interested in allowing this to be a high-performance solution.

I think it's worth keeping in mind that while Codable has performance limitations, we can and will keep improving it. It is a "good default" because it is convenient and good enough for most cases. People with more specific requirements should be able to do more specialized things if they need to (which is why we are perhaps opening up that SerializationRequirement typealias), but Codable is totally fine for most applications.

Saklad5 · September 2, 2021, 10:43pm

Could distributed actors be used to build a Swift-first successor to Core Data (or any other object graph)? It seems like you could implement a damn good NSManagedObject replacement this way.

After all, Core Data was always meant to be backend-agnostic, and almost every problem it has seems to have a superior solution in Swift at this point. Migration aside, anyway.

Saklad5 · September 2, 2021, 10:58pm

Is it really necessary to prohibit distributed computed properties? I understand the reasoning, since it might be very expensive to access, but I’d still rather avoid an explosion of method getters and setters. Computed properties combining the two make APIs feel far less cluttered.

Swift’s current approach to deceptively expensive computed properties is simply cautioning against them and encouraging scrupulous documentation. I think that should be sufficient for distributed actors as well. Frankly, it’s bad practice to use properties without checking time complexity anyway.

Saklad5 · September 2, 2021, 11:09pm

Why doesn’t DistributedActor refine Identifiable? It obviously meets the requirements.

ktoso · September 2, 2021, 11:10pm

It does.

public protocol DistributedActor: Sendable, Codable, Identifiable, Hashable {

See: https://github.com/ktoso/swift-evolution/blob/distributed-revised/proposals/nnnn-distributed-actors.md#initialization

Saklad5 · September 2, 2021, 11:11pm

Ah, never mind then. I missed that somehow, sorry.

Saklad5 · September 2, 2021, 11:19pm

Putting aside attempts at constructive criticism for a moment, I must say this is one of the most impressive proposed features I’ve ever seen. It’s the product of an immense amount of work since Swift’s inception, and it seems to rule out a staggering array of logic and coupling errors without sacrificing versatility or clarity.

nikitamounier · September 2, 2021, 11:39pm

I'm incredibly impressed by this pitch – I saw the development of distributed actors throughout the past couple of months by peeking at some of your git branches, but this is truly impressive, so congratulations.

I'm curious about this:

Once we are confident in the semantics and language features required to get rid of the source generation step, we will be able to synthesize the appropriate Envelope<Message> and represent every distributed func as a well-defined codable Message type, which transports then would simply use their configured coders on.

What language features are required? IIUC, code synthesis is nothing new to Swift, as Codable relies on it extensively. Why do we need to wait for more metaprogramming-related language features to land? Can't this synthesis behaviour be baked into the compiler, like so much of the distributed actors feature already is?

ktoso · September 3, 2021, 1:35am

Thanks for the kind words

We're actively working on seeing what we can do to get rid of the source generation step -- you can assume this bit might change quite a bit.

We wanted to get the pitch out there so we can begin gathering feedback. I'm pretty hopeful we can get rid of the source-gen step eventually with enough sneaky tricks and thunks.

In general the tricky parts are:

- We got some bytes, we managed to look up the actor, we even managed to decode the "message" (assuming we have some known representation for it), but how do we apply this message?
  - so we need some "function handle that can be looked up from some serializable id and invoked"
  - and how do we do this without forcing Codable and some specific representation?
- The generation of messages: we would not want to "lock in" forever a specific message representation, say "all funcs have a case in a huge enum", since it may hit limitations or problems for specific transports or just in future evolution of the feature.
  - so we need message representations to be opaque if we were to synthesize them
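To illustrate the first problem, here is a rough sketch of a "function handle looked up from a serializable id". Everything in it is hypothetical (`OpaqueMessage`, `Handler`, `deliver`, the `"Counter.add"` id, and the one-byte payload encoding); it is not the design being worked on, only one way such a lookup could look in plain Swift.

```swift
// Hypothetical sketch; none of these names come from the proposal.
struct OpaqueMessage {
    let targetID: String   // serializable identifier of the function to invoke
    let payload: [UInt8]   // argument bytes; the encoding is left to the transport
}

final class Counter {
    private(set) var value = 0
    func add(_ n: Int) { value += n }
}

// A handler decodes its own payload and applies the call to the target.
typealias Handler = (Counter, [UInt8]) throws -> Void

struct UnknownTarget: Error { let id: String }

// Registry from serializable id to invocable handle.
// The payload format here (first byte is the argument) is an assumption.
let handlers: [String: Handler] = [
    "Counter.add": { counter, bytes in
        counter.add(Int(bytes.first ?? 0))
    }
]

func deliver(_ message: OpaqueMessage,
             to counter: Counter,
             using handlers: [String: Handler]) throws {
    guard let handler = handlers[message.targetID] else {
        throw UnknownTarget(id: message.targetID)
    }
    try handler(counter, message.payload)
}

let counter = Counter()
try deliver(OpaqueMessage(targetID: "Counter.add", payload: [5]),
            to: counter, using: handlers)
```

Note that the caller of `deliver` never inspects the payload, which is the sense in which the message representation stays opaque: only the handler registered for that id knows how to decode it.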

The good news is that we have some ideas for both of those problems. They would affect the design a bit of course and we'll post a bigger update once we have something to share.

In the meantime it is useful to keep providing feedback about the general feature, thanks in advance!

woolsweater · September 3, 2021, 4:30am

I really don't find sufficient motivation in this feature for burdening Swift with new keywords, new magic dispatch rules, new blessed special cases for synthesis, and new declarations and restrictions that are inexpressible in normal user code. Is this truly a new concept in the language itself, a peer with structs and protocols?

I would urge that this feature be reconsidered as a library-based solution, and the initial effort directed towards enabling that library with general facilities in Swift that are accessible to all language users.

Douglas_Gregor · September 3, 2021, 7:05am

This is a very cool proposal. Swift's actors were "miniaturized" from distributed actors to fit within a process, and this proposal layers distributed actors back on top in a natural way. I have a couple of design-related comments and questions.

Properties

The section on properties describes two restrictions on properties in distributed actors:

- They cannot be distributed, so they cannot be accessed from outside the actor.
- They cannot be nonisolated.

These two restrictions are not equal, though.

The first is a matter of policy: the proposal doesn't allow properties to be accessed from outside the actor because reading properties from a remote actor is slow, and you should probably collect your property accesses into a distributed method instead. However, the same could be said of normal actors: you probably should not read two properties from a normal actor in a row, because some other code could run on the actor in between your two reads and you would get inconsistent results. The await is there in the code to indicate that there's a potential delay here. So, I'd prefer to be consistent with non-distributed actors here and allow property reads from outside the actor, rather than arbitrarily slice out this feature.
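As a small sketch of that point with a normal (non-distributed) actor: `Account`, `racyTotal`, and `consistentTotal` are made-up names for illustration. Two separate property reads are two separate hops onto the actor, while a single method call observes both values together.

```swift
actor Account {
    var balance = 100
    var pending = 0

    // Moves money from balance to pending; balance + pending stays constant.
    func reserve(_ amount: Int) {
        balance -= amount
        pending += amount
    }

    // One hop onto the actor: both values are read without interleaving.
    func snapshot() -> (balance: Int, pending: Int) {
        (balance, pending)
    }
}

// Two awaits: another task may run reserve(_:) between the two reads,
// so the sum can be inconsistent even though each read is safe.
func racyTotal(of account: Account) async -> Int {
    let balance = await account.balance
    let pending = await account.pending
    return balance + pending
}

// One await: the sum always reflects a single consistent state.
func consistentTotal(of account: Account) async -> Int {
    let state = await account.snapshot()
    return state.balance + state.pending
}
```

Both functions compile today; the `await` on each read is the only signal that something else may happen in between.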

The second is more fundamental: one cannot have a nonisolated let property in a distributed actor because doing so would require the let value to be replicated wherever there is a reference to the distributed actor. Aside from the redundant storage, this means that when you resolve an actor address for a remote actor, you would have to communicate with that remote actor to get the replicated data, therefore requiring remote actor resolution to be async. I would prefer for the proposal to expand on the reasons why nonisolated let is a fundamental difference for distributed actors, and make this the only property-based restriction.

Distributed methods

Methods on a distributed actor need to be marked as distributed to be used from outside the actor. To me, this feels like an implementation limitation (that code generators need to know about all of the distributed methods) that has crept into the language design.

I would prefer that we not require distributed on functions to call them from outside the actor. That way, their semantics line up as closely with non-distributed actors as is possible. With non-distributed actors, you can write a method on an extension of an actor and call it from the outside as async:

actor MyActor { }
extension MyActor {
  func f() { }
}

func g(ma: MyActor) async {
  await ma.f() // we can call f() as async when we're outside the actor
}

Distributed actors necessarily need to have calls from outside the actor be throwing, because transports can fail, which is explained well in the proposal. That would imply that we should be able to do this:

distributed actor MyDistributedActor { }
extension MyDistributedActor {
  func f() { }
}

func g(ma: MyDistributedActor) async throws {
  try await ma.f() // we can call f() as async throws when we're outside the actor
}

The proposal also requires that we mark f as being distributed, but this is unfortunate, because the function g could be defined in a different module:

// module A
public distributed actor MyDistributedActor { }
extension MyDistributedActor {
  public func f() { }
}

// module B
import A
func g(ma: MyDistributedActor) async throws {
  try await ma.f() // can't do this because f() wasn't marked 'distributed'
}

Unfortunately, our author of module B is stuck: MyDistributedActor.f() wasn't originally marked as distributed, and that cannot be fixed without updating module A.

The requirement that DistributedActor-inheriting protocols only have distributed and nonisolated members is another consequence of requiring distributed on functions. If we didn't need distributed on functions, DistributedActor-inheriting protocols could follow the same rules as Actor-inheriting protocols, with the one necessary change that calls outside the actor are both async and throws.

I think we can lift the implementation limitation that requires distributed to be provided ahead of time by using a different approach, that I think would also work across module boundaries. If so, I would prefer to remove distributed func (and the distributed var I implied with my other comments above) from the language entirely, aligning the mental model of distributed actors much more closely with that of non-distributed actors.

`Codable` requirement

Related to the comments above about distributed func, I suspect it's possible to drop the Codable requirement from the proposal by using some kind of ad hoc protocol to interact with the transport. It's probably worth spinning off a separate discussion about the implementability of this, though, and keeping this thread more focused on semantics.

`withLocalDistributedActor`

I love how well this API worked out, even though you clearly don't want folks to actually use it ;). I think you need to make T: Sendable for this to work, however, because when you're running locally there will be no Codable round-trip.

`Equatable` and `Hashable` conformances

I don't quite know if you need these, because I think it depends a bit on the notion of identity that matters. Actors (whether distributed or not) are reference types that conform to AnyObject, so one can compare their identity (with ===) and use ObjectIdentifier to get a Hashable and Equatable type. Can a given process create more than one remote actor instance for the same actor? If so, then identity as defined by === will differ from equality as defined by ==, which feels (to me) like a semantic trap. On the other hand, ensuring uniqueness for remote actor instances means you probably have a big old lock in the actor runtime around actor resolution.

What are the semantics we want here? And if object identity and equality would always be the same, should we leave off the Equatable and Hashable conformances entirely?
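A tiny sketch of the trap, using a plain class as a stand-in for a remote reference. `Address` and `RemoteRef` are hypothetical names, and the sketch assumes the case where a process can create two proxy objects for the same remote actor.

```swift
// Hypothetical address type identifying a remote actor.
struct Address: Hashable {
    let node: String
    let path: String
}

// Plain class standing in for a remote reference (proxy) to an actor.
final class RemoteRef: Hashable {
    let address: Address
    init(address: Address) { self.address = address }

    // Equality and hashing are based on the address, not the object.
    static func == (lhs: RemoteRef, rhs: RemoteRef) -> Bool {
        lhs.address == rhs.address
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(address)
    }
}

let address = Address(node: "node-1", path: "/user/greeter")
let first = RemoteRef(address: address)
let second = RemoteRef(address: address)

let sameValue = (first == second)    // true: same address
let sameObject = (first === second)  // false: two distinct proxy objects
let countByIdentity = Set([ObjectIdentifier(first), ObjectIdentifier(second)]).count // 2
let countByEquality = Set([first, second]).count                                     // 1
```

If resolution always returned the same object for a given address, `==` and `===` would agree and the two sets above would have the same size.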

If we do keep the Equatable and Hashable conformances, I think this paragraph needs an update to reflect SE-0309.

The property uses a type-erased struct to store the identity, because otherwise Hashable's Self type requirements would prevent using DistributedActor bound protocols as existentials, which is a crucial use-case that we are going to implement in the near future.

Separating client and server implementations

The future work section on resolving DistributedActor bound protocols talks about using a distributed actor protocol where the client and server have different implementations, and gives a protocol as an example:

protocol Greeter: DistributedActor {
  distributed func greet(name: String) throws -> String
}

I don't think Greeter benefits from being considered a distributed actor: you're not likely to get much use out of code synthesis for the message send implied by a call, when (e.g.) the server is implemented in some other language. Instead, I think distributed actors should focus in on the case where you are sharing the code across all of the cluster nodes/processes/etc. If someone would like to separate the client from the server, that can be done with a normal protocol that has async throws operations on it:

protocol Greeter {
  func greet(name: String) async throws -> String
}

Now, this absolutely can be implemented by a distributed actor:

distributed actor MyGreeter: Greeter {
  func greet(name: String) -> String {
    "Hello \(name)!"
  }
}

but it can also be implemented separately on client and server, perhaps with bespoke implementations to match some existing protocol (gRPC or whatever). I have no doubt that some class of those implementations could be autogenerated from the protocol definition, and I would hope that some of the things we learn from working on distributed thunk synthesis can help there, but I think it's important not to view these protocols as distributed actors.
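A sketch of that separation in Swift that compiles today, using a regular actor in place of a distributed one. `LocalGreeter`, `StubRemoteGreeter`, and `welcome` are hypothetical names, and the `Sendable` refinement on the protocol is an addition made here so the values can be passed across concurrency domains.

```swift
// The plain protocol from above, with Sendable added as an assumption.
protocol Greeter: Sendable {
    func greet(name: String) async throws -> String
}

// One implementation: a regular actor. Its synchronous, non-throwing
// method satisfies the async throws requirement.
actor LocalGreeter: Greeter {
    func greet(name: String) -> String {
        "Hello \(name)!"
    }
}

// Another implementation: a hand-written client stub. A real one would
// speak some wire protocol; this one only simulates reachability.
struct StubRemoteGreeter: Greeter {
    struct Unreachable: Error {}
    let reachable: Bool

    func greet(name: String) async throws -> String {
        guard reachable else { throw Unreachable() }
        return "Hello \(name)!"
    }
}

// Callers depend only on the protocol and always handle failure.
func welcome<G: Greeter>(_ greeter: G, name: String) async -> String {
    do {
        return try await greeter.greet(name: name)
    } catch {
        return "greeter unavailable"
    }
}
```

The caller's code is the same for both implementations, which is the property the protocol-based split is meant to preserve.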

That came out a lot longer than expected. All that said, I think we're on a trajectory to build something great for distributed computing.

Doug

hassila · September 3, 2021, 11:01am

Another reflection:

What are the thoughts with regard to protocol compatibility between a “client” and remote service? Would distributed actors be limited to homogeneously compiled “compatible” systems or is it envisioned to have some sort of protocol compatibility checks and support evolution of data transferred?

Many protocols allow backwards compatible evolution to decouple service/client - will this be possible with distributed actors?

Saklad5 · September 3, 2021, 12:47pm

I agree that this sort of complexity is an extremely slippery slope, and should be approached extraordinarily carefully.

Like with SwiftUI, the language features necessary to make this work should be as broad and versatile as possible. I am personally somewhat skeptical about distributed as a keyword: it seems too specific.

I think the best path forward is making the new language concepts as orthogonal as possible, to enable functionality beyond the immediate proposal. Off the top of my head: I think some of these concepts, particularly the throwing behavior, could be broadened to fit delegate patterns in general.

Saklad5 · September 3, 2021, 1:21pm

I know people are skeptical about typed errors, but I think that they open up a lot of extremely useful safety features. Like with Result, nothing would stop you from throwing Error, and I think that could be an implicit thrown type in the same way that Void is implicitly returned. In terms of type signature, meanwhile, you could say non-throwing types throw Never.

The ability to opt into a stricter API contract would be very helpful though, especially for non-public code. I am often tempted to avoid throwing entirely in favor of using Result, purely for that additional safety.

As for public typed throws, many options come to mind:

- A number of people have proposed adding an enumeration attribute that requires @unknown handling even when frozen. Adding new cases to such an enumeration would be additive rather than breaking, which may be fitting for thrown errors.
- An extremely common pattern consists of wrapping thrown errors in an associated case. This allows fine-grained control over exposed type information, hiding implementation details that should not be part of the API contract.
- Errors do not need to be enumerations, common as that is: developers may choose to throw structs for particularly dynamic errors.
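For readers unfamiliar with the wrapping pattern, here is a minimal sketch using Result to get a typed failure with today's language. `FetchError`, `SocketClosed`, `lowLevelFetch`, and the magic keys are all invented for illustration.

```swift
// Hypothetical low-level failure; a struct rather than an enum case.
struct SocketClosed: Error {}

// The public error type: a known case plus a wrapper for everything else.
enum FetchError: Error {
    case notFound
    case transport(underlying: Error)
}

// Untyped throwing function standing in for an implementation detail.
func lowLevelFetch(_ key: String) throws -> String {
    if key == "offline" { throw SocketClosed() }
    return key == "missing" ? "" : "value-for-\(key)"
}

// Typed contract via Result: callers see only FetchError.
func fetch(_ key: String) -> Result<String, FetchError> {
    do {
        let value = try lowLevelFetch(key)
        return value.isEmpty ? .failure(.notFound) : .success(value)
    } catch {
        // Implementation details are hidden behind the wrapping case.
        return .failure(.transport(underlying: error))
    }
}

// The compiler checks this switch for exhaustiveness over FetchError.
func describe(_ result: Result<String, FetchError>) -> String {
    switch result {
    case .success(let value): return value
    case .failure(.notFound): return "not found"
    case .failure(.transport): return "transport problem"
    }
}
```

The `transport(underlying:)` case keeps the original error available for logging without making its concrete type part of the API contract.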

I recognize that typed throws are beyond the scope of the pitch, but I do think it is preferable to introducing bespoke requirements. Remember, it would always be possible to cast to Error: at worst, you end up with the current state of affairs.

Finally, I firmly believe that everything in documentation should be considered part of the API contract. If you document any thrown errors (and you should), changing which errors you throw (and even which reasons you throw them for) is likely already an additive or breaking change. Typed errors would merely make the compiler acknowledge that.

Saklad5 · September 3, 2021, 1:59pm

A lot of Swift’s language complexity already comes from not having typed errors, and I do not wish to add more.

public func map<T>(
    _ transform: (Element) throws -> T
  ) rethrows -> [T]

could become syntactic sugar for

public func map<E, T>(
    _ transform: (Element) throws E -> T
  ) throws E -> [T]

ktoso · September 3, 2021, 2:02pm

I appreciate the input, but this pitch/proposal isn’t where typed throws are going to happen. Let’s please focus on the topics at hand; it already is a very large and interesting surface area by itself. Nothing in the design prevents us from adopting typed throws IF they were to happen in a future Swift release.

Hope this makes sense, thanks!

Saklad5 · September 3, 2021, 2:06pm

Sorry, you’re right. I do think this proposal should avoid burying too much in the language, though. It may even be best to make the implementation dependent on such a feature and delay it accordingly. Or just drop all Error-related requirements for now.

ktoso · September 3, 2021, 2:07pm

There are no Error-related requirements in this proposal.

The “how to transfer errors” section merely offers suggestions to transport authors given the status quo of the language.

Saklad5 · September 3, 2021, 2:08pm

Errors thrown by the underlying transport due to connection or messaging problems must conform to the ActorTransportError protocol.