This is a very cool proposal. Swift's actors were "miniaturized" from distributed actors to fit within a process, and this proposal layers distributed actors back on top in a natural way. I have a couple of design-related comments and questions.
Properties
The section on properties describes two restrictions on properties in distributed actors:
- They cannot be `distributed`, so they cannot be accessed from outside the actor.
- They cannot be `nonisolated`.
These two restrictions are not equal, though.
The first is a matter of policy: the proposal doesn't allow properties to be accessed from outside the actor because reading properties from a remote actor is slow, and you should probably collect your property accesses into a distributed method instead. However, the same could be said of normal actors: you probably should not read two properties from a normal actor in a row, because some other code could run on the actor in between your two reads and you would get inconsistent results. The `await` is there in the code to indicate that there's a potential delay here. So, I'd prefer to be consistent with non-distributed actors here and allow property reads from outside the actor, rather than arbitrarily slice out this feature.
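To make the comparison with non-distributed actors concrete, here is a minimal sketch using a plain actor. The `Account` type and its members are hypothetical and not taken from the proposal. Each cross-actor property read is its own `await`, so other work can run on the actor between the two reads, whereas a single method call reads both values in one turn.

```swift
// Hypothetical example type; not part of the proposal.
actor Account {
    var balance = 100
    var pending = 0

    // Moves money between the two properties within one actor turn.
    func reserve(_ amount: Int) {
        balance -= amount
        pending += amount
    }

    // Reads both properties in a single turn, so the pair is consistent.
    func snapshot() -> (balance: Int, pending: Int) {
        (balance, pending)
    }
}

func report(_ account: Account) async -> Int {
    // Two separate suspension points: reserve(_:) may run in between,
    // so the two values are not guaranteed to come from the same state.
    let b = await account.balance
    let p = await account.pending
    return b + p
}

func consistentReport(_ account: Account) async -> Int {
    // One suspension point: both values come from the same actor turn.
    let s = await account.snapshot()
    return s.balance + s.pending
}
```

The same trade-off would apply to a distributed actor, only with a network round trip behind each `await` instead of an executor hop.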
The second is more fundamental: one cannot have a `nonisolated let` property in a distributed actor because doing so would require the `let` value to be replicated wherever there is a reference to the distributed actor. Aside from the redundant storage, this means that when you resolve an actor address for a remote actor, you would have to communicate with that remote actor to get the replicated data, therefore requiring remote actor resolution to be `async`. I would prefer for the proposal to expand on the reasons why `nonisolated let` is a fundamental difference for distributed actors, and make this the only property-based restriction.
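For contrast, this is roughly what `nonisolated let` provides on a non-distributed actor: a synchronous read with no `await`, which works only because the stored value lives in the same process as every reference to the actor. The `Worker` type and `label(for:)` function are hypothetical illustrations.

```swift
// Hypothetical example type; not part of the proposal.
actor Worker {
    // Immutable and Sendable, so it can be read without hopping to the actor.
    nonisolated let name: String
    var jobsDone = 0

    init(name: String) {
        self.name = name
    }
}

// A synchronous function: no await is needed to read `name`.
func label(for worker: Worker) -> String {
    "worker:" + worker.name
}
```

A remote reference has no local storage from which to read `name`, so the value would have to be fetched or replicated when the reference is resolved.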
Distributed methods
Methods on a distributed actor need to be marked as `distributed` to be used from outside the actor. To me, this feels like an implementation limitation (that code generators need to know about all of the distributed methods) that has crept into the language design.
I would prefer that we not require `distributed` on functions to call them from outside the actor. That way, their semantics line up as closely with non-distributed actors as is possible. With non-distributed actors, you can write a method on an extension of an actor and call it from the outside as `async`:
    actor MyActor { }

    extension MyActor {
        func f() { }
    }

    func g(ma: MyActor) async {
        await ma.f() // we can call f() as async when we're outside the actor
    }
Distributed actors necessarily need to have calls from outside the actor be throwing, because transports can fail, which is explained well in the proposal. That would imply that we should be able to do this:
    distributed actor MyDistributedActor { }

    extension MyDistributedActor {
        func f() { }
    }

    func g(ma: MyDistributedActor) async throws {
        try await ma.f() // we can call f() as async and throwing when we're outside the actor
    }
The proposal also requires that we mark `f` as being `distributed`, but this is unfortunate, because the function `g` could be defined in a different module:
    // module A
    public distributed actor MyDistributedActor { }

    extension MyDistributedActor {
        public func f() { }
    }

    // module B
    import A

    func g(ma: MyDistributedActor) async throws {
        try await ma.f() // can't do this because f() wasn't marked 'distributed'
    }
Unfortunately, our author of module B is stuck: `MyDistributedActor.f()` wasn't originally marked as `distributed`, and that cannot be fixed without updating module A.
The requirement that `DistributedActor`-inheriting protocols only have `distributed` and `nonisolated` members is another consequence of requiring `distributed` on functions. If we didn't need `distributed` on functions, `DistributedActor`-inheriting protocols could follow the same rules as `Actor`-inheriting protocols, with the one necessary change that calls outside the actor are both `async` and `throws`.
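As a reminder of the rules for `Actor`-inheriting protocols that this comparison relies on, here is a small sketch with ordinary actors. The `Named` protocol, the `User` actor, and `renameAndRead` are hypothetical names chosen for illustration. Requirements are written without any extra marker, and uses from outside the actor simply become `async`; the distributed counterpart suggested above would add `throws` on top.

```swift
// Hypothetical protocol; requirements are actor-isolated because of the Actor constraint.
protocol Named: Actor {
    var displayName: String { get }
    func rename(to newName: String)
}

// Hypothetical conforming actor.
actor User: Named {
    var displayName: String

    init(displayName: String) {
        self.displayName = displayName
    }

    func rename(to newName: String) {
        displayName = newName
    }
}

// Outside the actor, both the method call and the property read require await.
func renameAndRead<T: Named>(_ value: T, to newName: String) async -> String {
    await value.rename(to: newName)
    return await value.displayName
}
```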
I think we can lift the implementation limitation that requires `distributed` to be provided ahead of time by using a different approach, one that I think would also work across module boundaries. If so, I would prefer to remove `distributed func` (and the `distributed var` I implied with my other comments above) from the language entirely, aligning the mental model of distributed actors much more closely with that of non-distributed actors.
`Codable` requirement
Related to the comments above about `distributed func`, I suspect it's possible to drop the `Codable` requirement from the proposal by using some kind of ad hoc protocol to interact with the transport. It's probably worth spinning off a separate discussion about the implementability of this, though, and keeping this thread more focused on semantics.
withLocalDistributedActor
I love how well this API worked out, even though you clearly don't want folks to actually use it ;). I think you need to make `T: Sendable` for this to work, however, because when you're running locally there will be no `Codable` round-trip.
`Equatable` and `Hashable` conformances
I don't quite know if you need these, because I think it depends a bit on the notion of identity that matters. Actors (whether distributed or not) are reference types that conform to `AnyObject`, so one can compare their identity (with `===`) and use `ObjectIdentifier` to get a `Hashable` and `Equatable` type. Can a given process create more than one remote actor instance for the same actor? If so, then identity as defined by `===` will differ from equality as defined by `==`, which feels (to me) like a semantic trap. On the other hand, ensuring uniqueness for remote actor instances means you probably have a big old lock in the actor runtime around actor resolution.
What are the semantics we want here? And if object identity and equality would always be the same, should we leave off the `Equatable` and `Hashable` conformances entirely?
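For reference, the identity-based alternative mentioned above already works for any actor today. This is a minimal sketch with an ordinary actor; `Counter` and `identityDemo` are hypothetical names.

```swift
// Hypothetical example type; not part of the proposal.
actor Counter { }

func identityDemo() -> (same: Bool, different: Bool, uniqueCount: Int) {
    let a = Counter()
    let b = a          // second reference to the same instance
    let c = Counter()  // a distinct instance

    // Reference identity via ===.
    let same = (a === b)
    let different = (a !== c)

    // ObjectIdentifier is Hashable and Equatable, so it can be a Set element
    // or a Dictionary key without the actor type conforming to anything.
    let ids = Set([ObjectIdentifier(a), ObjectIdentifier(b), ObjectIdentifier(c)])

    return (same, different, ids.count)
}
```

If a process could hold two remote references to the same underlying actor, they would behave like `a` and `c` here under `===`, even though an address-based `==` would call them equal.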
If we do keep the `Equatable` and `Hashable` conformances, I think the following paragraph from the proposal needs an update to reflect SE-0309, which lifts the restriction on using protocols with `Self` or associated type requirements as existential types:
> The property uses a type-erased struct to store the identity, because otherwise `Hashable`'s `Self` type requirements would prevent using `DistributedActor` bound protocols as existentials, which is a crucial use-case that we are going to implement in the near future.
Separating client and server implementations
The future work section on resolving `DistributedActor` bound protocols talks about using a distributed actor protocol where the client and server have different implementations, and gives a protocol as an example:
    protocol Greeter: DistributedActor {
        distributed func greet(name: String) throws -> String
    }
I don't think `Greeter` benefits from being considered a distributed actor: you're not likely to get much use out of code synthesis for the message send implied by a call, when (e.g.) the server is implemented in some other language. Instead, I think distributed actors should focus in on the case where you are sharing the code across all of the cluster nodes/processes/etc. If someone would like to separate the client from the server, that can be done with a normal protocol that has `async throws` operations on it:
    protocol Greeter {
        func greet(name: String) async throws -> String
    }
Now, this absolutely can be implemented by a distributed actor:
    distributed actor MyGreeter: Greeter {
        func greet(name: String) -> String {
            "Hello \(name)!"
        }
    }
but it can also be implemented separately on client and server, perhaps with bespoke implementations to match some existing protocol (gRPC or whatever). I have no doubt that some class of those implementations could be autogenerated from the protocol definition, and I would hope that some of the things we learn from working on distributed thunk synthesis can help there, but I think it's important not to view these protocols as distributed actors.
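To illustrate the separate-implementations case with code that compiles without distributed actor support, here is a sketch using the plain `Greeter` protocol. `LocalGreeter`, `RemoteGreeterClient`, its `send` closure, and `welcome` are all hypothetical; the closure merely stands in for whatever transport a real client would use.

```swift
// The plain protocol from the text above, repeated so this sketch is self-contained.
protocol Greeter {
    func greet(name: String) async throws -> String
}

// Hypothetical in-process implementation backed by an ordinary actor.
// A synchronous actor method can satisfy the async throws requirement.
actor LocalGreeter: Greeter {
    func greet(name: String) -> String {
        "Hello \(name)!"
    }
}

// Hypothetical client-side implementation, hand-written or generated.
struct RemoteGreeterClient: Greeter {
    // Stand-in for a real transport (gRPC or otherwise); here it is just a closure.
    let send: @Sendable (String) async throws -> String

    func greet(name: String) async throws -> String {
        try await send(name)
    }
}

// Callers depend only on the protocol, so either implementation can be passed in.
func welcome<G: Greeter>(_ greeter: G, name: String) async throws -> String {
    try await greeter.greet(name: name)
}
```

Nothing in `welcome` knows or cares whether the greeter is an actor, a distributed actor, or a bespoke network client.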
That came out a lot longer than expected. All that said, I think we're on a trajectory to build something great for distributed computing.
Doug