Using Distributed Actors with non-Swift clients

mpilman · April 20, 2023, 10:05pm

I am currently building a system on top of gRPC. However, I would prefer to use distributed actors.

It doesn't look too difficult to implement a DistributedActorSystem, so I could start there. However, one of my requirements is that it can be used from non-Swift clients. This is also why I started with gRPC.

So my question is: does anyone know of an existing network protocol that I could implement and that would fit the distributed actor model well? I could obviously come up with my own protocol, but if possible I would rather use something existing. (I wouldn't need a Swift implementation; I would be OK implementing it myself.)

lukasa · April 21, 2023, 8:30am

I think almost any RPC protocol should work: gRPC, JSON-RPC, etc. gRPC is probably a good fit due to its broad support, binary transport, and ability to multiplex traffic to minimise the number of concurrent connections you need in a fully-connected graph.

mpilman · April 21, 2023, 2:19pm

RPC doesn't seem like a good fit, as its model is fundamentally different from distributed actors. Distributed actors are more like CORBA, in that they are object-oriented.

It would be possible to use gRPC, but it feels very unnatural. gRPC (and RPC in general) doesn't have a mechanism to modify objects that live on the server the way distributed actors do. This means that with gRPC I'd have to expose implementation details in the API (for example, I think the actor ID would need to be part of each message).

lukasa · April 21, 2023, 2:27pm

I don't think those implementation details can practically be avoided. At a basic level it is necessary for some notion of object identity to be carried in the network traffic. There are a number of places it could go: you could use IP addresses, or TCP/UDP ports, or something at a higher level of the networking stack, but somewhere there will be a need for identity.

I don't think modelling that identity at an RPC layer is any more incorrect than modelling it at the network layer. Indeed, modelling it at the network layer is arguably more of a layering violation, as it forces your network to understand your actor system.

If you really wanted to go down that road I think the way I'd consider doing it is by borrowing the underlying model from Project Calico (disclosure, I worked on Calico long ago, before Tigera came into existence). You could assign each actor instance in your actor system a unique IPv6 address and then have their host node publish a route for that IPv6 address. To minimise the strain on the network you could aim to assign them out of a per-node /96, but you could still achieve location transparency by allowing nodes to publish more specific routes outside of their /96.
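As a rough illustration of the addressing arithmetic only: a per-node /96 leaves 32 bits to number the actor instances on that node. The sketch below assumes the node prefix is given as six 16-bit groups; the function name `actorAddress` and its parameters are hypothetical, and publishing routes is not shown at all.

```swift
// Hypothetical helper: builds the IPv6 address of one actor instance
// from its host node's /96 prefix and a 32-bit per-node actor index.
func actorAddress(nodePrefix: [UInt16], actorIndex: UInt32) -> String {
    // A /96 prefix is exactly six 16-bit groups (6 * 16 = 96 bits).
    precondition(nodePrefix.count == 6, "a /96 prefix is six 16-bit groups")

    // The remaining 32 bits (two groups) carry the actor index.
    let high = UInt16(actorIndex >> 16)
    let low = UInt16(actorIndex & 0xFFFF)

    // Uncompressed textual form: eight hexadecimal groups joined by ":".
    let groups = nodePrefix + [high, low]
    return groups.map { String($0, radix: 16) }.joined(separator: ":")
}
```

With a prefix of `[0xfd00, 0, 0, 0, 0, 1]` and an actor index of `0x0001_0002`, this yields `fd00:0:0:0:0:1:1:2`.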

But fundamentally I think this is vastly more painful than just having your RPC system pass the actor ID into each message. This is much like writing object-oriented C: each procedure takes a pointer (actor ID) as its first argument.
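A minimal sketch of what "the actor ID as the first argument" could look like on the receiving side. Every type here (`ActorID`, `IncrementRequest`, `Counter`, `HostNode`) is hypothetical and invented for illustration; none of it is gRPC or Swift distributed actor API, and JSON merely stands in for whatever serialization the RPC layer uses.

```swift
import Foundation

// Hypothetical identity: which node hosts the actor, and which actor on that node.
struct ActorID: Codable, Hashable {
    let node: String
    let local: UInt64
}

// Every request message carries the target identity, much like a C
// function taking the object pointer as its first parameter.
struct IncrementRequest: Codable {
    let target: ActorID
    let amount: Int
}

// The server-side object whose state the call modifies.
final class Counter {
    private(set) var value = 0
    func increment(by amount: Int) -> Int {
        value += amount
        return value
    }
}

enum DispatchError: Error {
    case unknownActor(ActorID)
}

// Looks up the target object by ID and forwards the call to it.
final class HostNode {
    private var counters: [ActorID: Counter] = [:]

    func register(_ counter: Counter, as id: ActorID) {
        counters[id] = counter
    }

    func handle(_ request: IncrementRequest) throws -> Int {
        guard let counter = counters[request.target] else {
            throw DispatchError.unknownActor(request.target)
        }
        return counter.increment(by: request.amount)
    }
}
```

A non-Swift client would only need to produce the same message shape, including the `target` field, to address a particular object on the server.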

mpilman · April 21, 2023, 3:48pm

Of course, but the question is whether it needs to be exposed to the user or not. I also agree that it isn't a layering violation. I even agree that it's like adding a C API to a C++ library (it's an excellent comparison).

However, I mostly think that your answer isn't really an answer to my question (or at least the question I meant to ask, I'm not a native speaker so I might need to rephrase a bit).

At a high level, most network protocols consist of an envelope and a message. For gRPC, the envelope contains the name of the service, and the message is a single message or a stream of messages, serialized with protobuf by default. Now, if you want to implement distributed actors, you need to put this whole thing into another envelope that becomes part of the message.
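To make the nesting concrete, here is a small sketch of an envelope inside an envelope. The names `RPCFrame`, `ActorCall`, `wrap`, and `unwrap` are all hypothetical, and JSON is used only to keep the example self-contained; with gRPC the outer layer would be the framework's own framing and the inner layer would more likely be a protobuf message.

```swift
import Foundation

// Outer envelope: roughly what the RPC layer already provides.
// The payload is opaque bytes as far as this layer is concerned.
struct RPCFrame: Codable {
    let service: String
    let payload: Data
}

// Inner envelope: what an actor system would have to add on top.
struct ActorCall: Codable, Equatable {
    let actorID: String      // assumed to be a string for simplicity
    let method: String
    let arguments: [String]  // assumed: arguments already serialized to text
}

// Serializes the actor call, then places it inside the RPC frame.
func wrap(_ call: ActorCall, service: String) throws -> Data {
    let inner = try JSONEncoder().encode(call)
    return try JSONEncoder().encode(RPCFrame(service: service, payload: inner))
}

// Reverses wrap: decodes the outer frame first, then the inner call.
func unwrap(_ bytes: Data) throws -> (service: String, call: ActorCall) {
    let frame = try JSONDecoder().decode(RPCFrame.self, from: bytes)
    let call = try JSONDecoder().decode(ActorCall.self, from: frame.payload)
    return (frame.service, call)
}
```

The RPC layer only ever sees `service` and a blob of bytes; the actor ID, the method name, and the arguments all live in the inner envelope, which is the part that would need to be designed and reimplemented for every client language.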

This is certainly possible and there are many ways of doing it. But you end up designing a new protocol on top of gRPC. So to me it seems the statement "gRPC can be used to implement a distributed actor system", while true, is similar to saying "TCP can be used to implement a distributed actor system".

And I am not even saying that building on top of gRPC wouldn't be a reasonable way of doing this. But before I do something like that, I was wondering whether there's something I could reuse that models distributed actors better than gRPC does.

The closest I know of is FDB's wire protocol. That protocol isn't really standardized, though it would probably be quite easy to reimplement in other languages.

godofbiscuits · June 1, 2023, 3:21pm

I have to preface this by saying I'm extremely out of my depth here, but I've been coming up to speed on Elixir/OTP, and when Apple dropped Distributed Actors on us, it seemed like there at least should be some way to bring the two together. OTP has been written to extend seamlessly across nodes as you go, something that there are whiffs of with Distributed Actors, at least in the WWDC 2022 Tic Tac Talk on. :)