Bandwidth Usage for Distributed Actors

Hi, and thanks for welcoming me to the community. I have been privately tinkering on an app/game for a few years and have finally made some good progress thanks to the evolution of Swift.
I have some initial questions about where I'm at, because of bandwidth issues. My game is still in alpha prototyping; I'm testing various technologies and seeing how things work out.

I want to be brief and to the point, hoping some of you will easily understand what I am trying to relay (pun intended?) HAHA

Anyway, I have a game currently (May 10, 2024) running solid on the async/await distributed actor feature, using — for the simplest case — the TicTacFish Bonjour networking sample package to network players moving SpriteKit avatars around on screen, with this topology:

An Apple TV 4K (2022) for viewing purposes, plus an iPhone 15 Pro Max, an iPad Pro M1, and a MacBook Pro M1 Pro. Here are some bandwidth numbers:

When connecting just two, say the MBP to the iPad, the idle 60 fps "hum" from machine to machine sits at 8 KB/s; when one avatar moves around "networked," the usage goes up to 40 KB/s. I can increase or double this base value by increasing the frequency of updates, i.e. of distributed actor func calls.

If both are moving, i.e. updates from both sides, the usage goes to 80–160 KB/s.

When I add the iPhone to the MBP + iPad pair, I get a "bar" of 40 KB/s per stream, so 120–160 KB/s, but now at three times the load the total is about 360–480 KB/s.

When I add the 4th device, the Apple TV, the usage goes up to 768 KB/s to 1 MB/s.

When I had the avatars moving faster, with twice the amount of updating, the Bonjour networking was up to 2 MB/s with all 4 moving in concert at that PEAK. But like I said, I cut their firing rate in half and it's now about ~750 KB/s–1 MB/s.

My question is actually a few, but let me preface: back in the day, using, say, sockets and OpenGL, my game would run over old 802.11n Bonjour networking, and with a similar setup of 3 players my network usage would be around 120 KB/s TOTAL for 3 players running around. The same was true with CFNetwork, and even with later technologies (what a mess), i.e. when I handled the data sending myself via streaming Bonjour or server stream setups. I had absolute POWER haha — control over how much data was being sent.

The thing is, I have been waiting for a framework like distributed actors for well over 10 years, for things like characters running around picking items up and having to fight over who "actually" picked them up. With distributed actors there is NO DEBATE, and believe me, that is a pain to debug by yourself!

I am still in the process of cutting things down, and have numerous ideas to help when I move to the distributed actor cluster network package. But the questions I have are:

  1. Is this kind of "spiraling network bandwidth growth" indicative of distributed actor networks, e.g. when adding more actors?
    Meaning "that's just the way it is," or am I doing something wrong?

  2. I was hoping for some kind of load sharing when it came to "distributing" the data. I can also see how the actor system would need to distribute and verify the data more than once, i.e. not just to one peer: each peer would have to share, and then some kind of graphing system would let a different peer verify where the other actor(s) are. But I just don't see this implementation — the simple Bonjour actor system in TicTacFish — taking that on (of course), so back to the original question.

  3. I am going to move to the WebSocket version to check out the bandwidth, and then eventually the cluster, but I am hoping Apple will demo a more polished version of a distributed actor system — really love it! Do any of you see this happening? Or is there something else going on here that I am not too keenly observing?

Thanks!

NOTE: To be clearer, in summary I am asking: if, say, I had 3 actors each sending 40 KB/s UP to a server, and the server sending a transmission back to each of the three at 80 KB/s DOWN — i.e. 240 KB/s DOWN and 120 KB/s UP as the server's bandwidth — that would be pretty much all of it. But in a peer-to-peer setup, do we usually get this out-of-control spiraling network bandwidth growth, even with our beloved distributed actors?

Hi Thomas,
sorry, I'm having a bit of difficulty parsing your exact usage and questions. I'll answer as best I can, but will also ask for clarification along the way.

"Distributed actors" are not a framework per se, but a language feature that enables "bring your own" actor system implementations, which are then responsible for all the runtime concerns — specifically, how networking is done. So all your bandwidth questions are going to be specific to which exact actor system implementation you are using.
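To make the split concrete, here's a minimal sketch of the language feature in isolation, using the standard library's in-process `LocalTestingDistributedActorSystem` (Swift 5.7+) instead of any networked system. The `Player`/`Position` names are illustrative, not from the TicTacFish sample:

```swift
import Distributed

// Use the stdlib's in-process testing system as the default actor system;
// a real game would plug in a networked implementation (Bonjour, WebSocket, ...).
typealias DefaultDistributedActorSystem = LocalTestingDistributedActorSystem

struct Position: Codable, Equatable {
    var x: Double
    var y: Double
}

distributed actor Player {
    var position = Position(x: 0, y: 0)

    // A distributed func is "just RPC": arguments and results must satisfy the
    // system's SerializationRequirement (Codable here).
    distributed func move(to newPosition: Position) -> Position {
        position = newPosition
        return position
    }
}

let system = LocalTestingDistributedActorSystem()
let player = Player(actorSystem: system)
let moved = try await player.move(to: Position(x: 3, y: 4))
print(moved.x, moved.y) // 3.0 4.0
```

Swapping the `typealias` to a different actor system changes how (and how efficiently) that `move(to:)` call crosses the wire — the language feature itself stays the same.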

So you say you're using the TicTacFish system implementation — it should be noted that that implementation is VERY much a toy example. We never spent any time optimizing it; it's just a "hey, you could build a system like this!" example, so I would not recommend using it directly, but instead using it as inspiration to build your own, more optimized system. The general networking shape will likely be the same, but I'm sure there is a lot of room for optimization.

It's hard to know what exactly this means — sending updates per frame is likely going to have huge overhead, since the requests include the mangled names of the call targets. So even if you're only sending around two integers with positions, the system would also keep serializing a large string with the invocation target — you should measure this yourself by exploring the actor system. This is possible to optimize, e.g. by sending the target string only once, but again, the TicTacFish example never explored those optimization angles.
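As a rough illustration of that overhead — the envelope shape and mangled name below are made up, but similar in spirit to what a naive Codable-based system serializes per call:

```swift
import Foundation

// Hypothetical wire envelope: a naive system re-sends the (long) mangled
// target name with every single invocation, alongside the tiny arguments.
struct InvocationEnvelope: Codable {
    var targetIdentifier: String  // mangled method name, repeated per call
    var arguments: [Double]       // e.g. an x/y position update
}

let envelope = InvocationEnvelope(
    targetIdentifier: "$s6Sample6PlayerC4move2toAA8PositionVAH_tYaKFTE", // illustrative, not a real symbol
    arguments: [3.0, 4.0]
)
let wireBytes = try JSONEncoder().encode(envelope)

// The identifier string dominates: 16 bytes of position data ride along with a
// payload several times that size. Interning the target string (send it once,
// then reference it by a small integer) removes most of this per-call cost.
print(wireBytes.count)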

I'm also unsure about the frequency of updates you're actually doing and what you actually have to keep synchronized. It seems suspicious to me that you mention frame rate — do you really send updates per frame? That seems wasteful.
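One common mitigation, sketched here as a generic helper (my own naming, not from any sample): record state every frame, but only flush the latest value at a fixed network tick rate, so wire traffic is bounded by the tick rate rather than the frame rate.

```swift
// Coalesce per-frame state into a fixed-rate network tick: only the newest
// pending state is sent, and ticks with no changes send nothing at all.
final class UpdateCoalescer<State> {
    private var pending: State?
    private(set) var sentCount = 0
    private let send: (State) -> Void

    init(send: @escaping (State) -> Void) {
        self.send = send
    }

    /// Call once per rendered frame (e.g. 120x per second).
    func record(_ state: State) {
        pending = state
    }

    /// Call once per network tick (e.g. 10x per second).
    func tick() {
        guard let state = pending else { return }
        send(state)
        sentCount += 1
        pending = nil
    }
}

var lastSent: (Double, Double)?
let coalescer = UpdateCoalescer<(Double, Double)> { lastSent = $0 }

// 120 frames of movement, but only a single network send per tick:
for frame in 1...120 {
    coalescer.record((Double(frame), 0))
}
coalescer.tick()
print(coalescer.sentCount) // 1
```

The trade-off is latency (remote peers see positions up to one tick stale), which games usually hide with client-side interpolation.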

I guess this is a long-winded way to say that I don't know what you're measuring, or whether your scheme for sending updates is actually efficient to begin with. I recommend diving deeper into the system and measuring payload sizes and the time spent rendering JSON etc. — and optimizing accordingly. I'd be happy to provide hints if you hit specific issues, but from this writeup it's hard to even pinpoint what we need to be optimizing: your sync algorithms, serialization, amount of traffic, [...]?

Same as above, I can't really know what this means without you explaining more about your sync architecture. Are you literally fanning out updates per frame to every client connected to the game? That would indeed keep growing traffic quadratically with the number of peers, sure — regardless of the specific transport employed.

You have exactly as much power with distributed actors, because you can implement all of the networking yourself — distributed actors are really just a language feature that lets you express RPC calls as nice-looking Swift methods. If you want to optimize for speed and game situations, you really should be looking at implementing your own system, rather than using the un-optimized TicTacFish example system. This is the point of the distributed language feature: anyone can implement whatever specialized actor system they want, in the absence of one that suits their specific needs.

It would be fantastic to collaborate on a local-networking, "game"-focused open source system implementation :slight_smile: The best documentation of how things fit together is the Swift Evolution proposals and the API docs for DistributedActorSystem; if you're not sure about anything, please let me know — and we'd like to include more/better docs in the Swift book as well.

The cluster package is not really designed for local networks — it is designed for datacenters, and is very "chatty" in its health monitoring of peer nodes (it uses a SWIM failure detector). While it may work, I would not recommend using it on devices — your best bet is customizing the Bonjour example system to suit your needs, IMHO.

This doesn't exist today. More efficient synchronization systems using e.g. CRDTs are something we're interested in, but they would NOT replace distributed func calls — they'd simply be another API that happens to be implemented using distributed actors.

Distributed funcs and actors are specifically "just RPC" — they make building other distributed systems much simpler, but a "distributed actor" does not automatically "distribute state".

In your case I think Bonjour will be the right thing, honestly — just spend some time on measuring and optimizing. The sample system from the talk simply isn't meant for production, and that's expected.

The best way forward with distributed actors is to start working on open source packages which implement the specific situations you're looking for — this way we'll get a nice ecosystem of actor system implementations. I can't comment on any official plans about shipping or not shipping some system implementations, but either way, open source packages are the way to go here!

For example, Ordo One have designed an implementation suited to their needs over here — it's also for server clusters, though, so it won't fit your use case. But that's the general idea: specialized implementations as open source packages, so communities can benefit from them :slight_smile:

Again, those questions are not really actor-specific — it depends on what and where you're sending. If you're doing everyone-to-everyone sending of all updates, then sure, it will be inefficient like this. The up vs. down split also doesn't have much to do with actors themselves, but with the size of the payloads you're putting on the wire — sure, the sample implementation isn't efficient in its framing, but the biggest factor is still what you're sending around as payloads, and how often, etc.

I hope this helps adjust the mindset. Maybe we can measure the message sizes and frequencies you're sending in detail, and figure out if your overall design is okay or needs tweaking. If you find that you have to optimize the RPC sizes specifically, I can help with that, but if it's about architecting state sync, you'll have to think about it a bit yourself, I guess? I'm happy to chat more, but we'll need more details :slight_smile:


OH GREAT!! Thanks for this detailed reply. I will answer each of your topics, like you answered mine, by Monday or Tuesday, as I have a Cat Mom (Mother's Day) on Saturday and then actual Mother's Day on Sunday.
But for a quick reply: yes, yes, and yes. I am sending actions, coordinates, rotations, etc. on an FPS-throttled setup, where I can go from 1 frame per second of data up to 120 frames per second of data and make it flood and crash constantly LOL... hehe
But yeah I am working on optimizing these such things.
My only real concern is that I was really hoping to use an actual "drop-in" distributed actor system, to save time, resources, and the headache of testing... but yes, I understand that TicTacFish IS a very simple example. I mainly wanted to see it in use, specifically for my use case, which is when two actors are headed for the same ITEM on a "playing field" and both get there at pretty much the same time:

1st) The object/item is assigned to one of the two and NEVER to both, i.e. the one who is TRULY first to grab it gets it, and
2nd) There are never any crashes in the TONS of async code — from each character's update method, among other functions...

There's tons more, but yeah — just to see things in action, as distributed actors are heating up!
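For the "who actually picked it up" case specifically, the actor model already gives you the arbitration for free, since calls on one actor are serialized. A minimal sketch of that idea, using the stdlib's in-process LocalTestingDistributedActorSystem (the `Item`/`claim` names are illustrative, not from any sample):

```swift
import Distributed

typealias DefaultDistributedActorSystem = LocalTestingDistributedActorSystem

// The contested item is itself a distributed actor: pickup requests are
// serialized on its executor, so exactly one claimer can ever win.
distributed actor Item {
    private var owner: String?

    distributed func claim(by playerID: String) -> Bool {
        guard owner == nil else { return false } // already taken -- no debate
        owner = playerID
        return true
    }
}

let system = LocalTestingDistributedActorSystem()
let item = Item(actorSystem: system)
let first = try await item.claim(by: "playerA")
let second = try await item.claim(by: "playerB")
print(first, second) // true false
```

In a networked actor system the picture is the same: whichever player's `claim` invocation reaches the item's actor first wins, and everyone else is told `false` — no tie-breaking logic to debug by hand.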

Thanks again!

In actuality, I didn't realize until I typed up the whole spiel that — yeah, ya big dummy — you're doing peer-to-peer where everyone sends to everyone; of course it's going to be a waste of bandwidth compared to a dedicated server setup when the number of users gets higher! duh.

ugh, true

ugh, true, that's what I am doing.

I hope this was clarified in my quick reply...

Same as above — I hope this was clarified in my quick reply...

That would be nice; we'll see after this WWDC '24, and where my platforms — err, devices — lie...
I am really hoping for an Apple TV at 120 fps, but it seems like that will require tvOS 18, which is going to knock out some of my devices, along with other concerns. But we'll see where things lie? or lay?

My networking architecture is much more complicated than just the cluster; I have other layers, called (for instance):
clusters
bubbles
pools
ponds
pods

This TicTacFish run was just a quick test... I just got flustered seeing the peer-to-peer traffic bulking up to 2 MB/s, haha, I was like whoa whoa... and I figured I would go ahead and finally post something on the Swift Forums.

ugh, true

The only reason I wanted to try out the WebSocket version is that I have a WebSocket implementation I designed for my Vapor servers, and I wanted to compare the LAG when moving stuff around between that WebSocket implementation and the distributed actors implementation — which I understand TicTacFish is just a demo of, got it.

Thanks for the link! :+1:t2:

The only reason I considered these questions actor-specific is that I am unsure of what actually gets sent when making distributed actor func calls. They are not simple "8-byte" additions; the calls have their own size, which may not be that small or that large. I don't know the inner workings of how the Apple/Swift team designed the calls — because as you know, if you look at any old complicated object in the debugger, you can sometimes find endless amounts of hidden "home cooking" properties.

P.S. Sorry I didn't get back sooner — I got caught up and blindsided by the Google I/O keynote... gotta keep tabs... oh, and we're good on the multi-part quoting LOLOL