Web Workers in Swift Wasm via DA

Geordie_J · September 27, 2022, 9:27pm

I have been interested in DA since it was announced, and today I thought of a potential use-case that I'd like to run by the community to see if it makes sense or if there's a better (easier / cleaner) way of doing it.

Today we launched a project that uses Swift Wasm to build parts of our mobile app for use in our web app in order to share code and improve performance there. We were able to use 100% of our mobile app's business logic for the most essential part of our app: the song "Player", where you learn how to play songs on piano. The project was a lot of fun for our team, and a huge success for us goals-wise, so we'd like to continue to improve what we've made there.

What I'd like to do next is move our (Swift) audio processing code off the main thread and into a Web Worker / Audio Worklet. There are many ways we might do that "by hand", but it seems to me like it could be a great use-case for DA.

From what I understand we could instantiate another instance of our Wasm binary in a web worker and serialize calls (i.e. messages) to and from the worker via a DistributedActorSystem. I think we'd have to do this because a worker cannot share memory with the JS main thread. Since distributed actors – kind of by definition – don't share memory space with each other, and pass messages that are serialized and deserialized at a defined boundary, it seems like the abstraction would fit quite well.

For our use case it'd probably be enough to call methods on the distributed actor (worker) and have it "return" a single value for each call. i.e. for our use case there can be a 1:1 relationship between calls to the actor and calls back from it. So that should be easy enough to implement.
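
A minimal sketch of the 1:1 call/reply bookkeeping described above, assuming each outgoing call carries an integer call ID and each reply echoes it back. `PendingCalls` and its methods are hypothetical names for illustration, not part of the Distributed module or JavaScriptKit.

```swift
/// Hypothetical bookkeeping for "one reply per call" messaging.
struct PendingCalls {
    private var nextID = 0
    private var handlers: [Int: (String) -> Void] = [:]

    /// Stores a reply handler and returns the call ID to send with the request.
    mutating func register(_ onReply: @escaping (String) -> Void) -> Int {
        nextID += 1
        handlers[nextID] = onReply
        return nextID
    }

    /// Delivers a reply to the matching call.
    /// Returns false for unknown IDs and for IDs that were already answered.
    mutating func complete(callID: Int, payload: String) -> Bool {
        guard let handler = handlers.removeValue(forKey: callID) else { return false }
        handler(payload)
        return true
    }
}
```

Removing the handler on completion is what enforces the 1:1 relationship: a second reply with the same call ID is simply rejected.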

That said, it'd be interesting to know if there are standard ways for "child" workers to call back to the "parent" process with DA (for later / other, more general, use cases). Is this just a case of creating another DA on the "parent" and calling it directly from the child? Or are there other, better, ways?

Is there something I'm missing? Basically I'd write the Distributed Actor System as follows: it would run some JavaScript code (via the fantastic JavaScriptKit), which creates a Worker that instantiates another copy of the Wasm binary (effectively forking the Wasm process in another thread). The JavaScript code it runs in the worker would set up the message exchange and the Swift DA-System code written would deal with (de)serialization.

Am I crazy? Or is this a good use case for DA?

edit: my biggest concern after looking at the DA repo is that the feature itself appears quite heavyweight. We are looking to save binary size where we can and it appears DA has dependencies on Foundation, NIO, Atomics, and more – this may end up being a blocker for us

ktoso · September 28, 2022, 12:11am

Hi there,
yeah that's definitely worth a try -- DA honestly are just "a framework (language feature) to build RPC frameworks". So if you have cases where there are calls between things that can share source but won't be sharing memory -- that's exactly what distributed actors can be used for: implementing the RPC calls.

To clarify right away:

I assume you're looking at apple/swift-distributed-actors (the peer-to-peer cluster implementation for Swift Distributed Actors). In that case, you're looking at a server-side cluster implementation.

This is completely unrelated to what you'd be building and using in wasm, so none of the dependencies or complexity applies to your wasm use case.

--

That said, I don't know if the wasm runtime can support the internals of distributed actors; we never tried. I suspect it is doable, but I just don't know if the runtime can handle it. It boils down to a lookup table of distributed target pointers (pointers to distributed funcs which we look up and invoke). I'm not sure if that'd "just work" in wasm or if it has to be reimplemented there -- I suspect it would have to be, since it is the runtime Distributed and Concurrency libraries that do this (and those are C++ implementations).

Library implementation wise, the only types you'd need to implement are contained in here, and documented in how they interact with each other: https://github.com/apple/swift/blob/main/stdlib/public/Distributed/DistributedActorSystem.swift. The Swift (Distributed) runtime parts are what implement the executeDistributedTarget function, and that's what I'm unsure about in wasm (at least as of now).

Long story short: definitely a great use-case, but I'm not sure if the wasm runtime can support distributed actors today. If someone who knows more about the runtime could chime in, that'd be helpful. Maybe @Max_Desiatov? It would be great if some folks who know the runtime could get together and push for support of distributed actors in there.

ktoso · September 28, 2022, 12:12am

You just create a bunch of actors and model parent/child in terms of who holds references to whom etc. There's no special parent/child relationships of actors in Swift at this point.

Swift actors don't have supervision trees, though the distributed actors cluster library does have LifecycleWatch which is similar to supervisors from other distributed actor runtimes.

I now realize I perhaps misunderstood what you meant by child actor: not so much in the traditional sense, but as "the actor in the other memory-isolated domain". For that you'd be using whatever means are used to spawn workers in wasm. I don't know those APIs, but we can dig into it together if you want.

Geordie_J · September 28, 2022, 7:31am

The JS main thread and Web Workers kind of necessarily have a parent/child relationship. A Web Worker is restricted in many ways in that it doesn't have as much autonomy or control over its environment (cannot access the DOM or other main thread things, generally has more limited API surface available). So the main thread always spawns a worker (which in turn can spawn other workers).

In the "native" (non-Swift) JS API, the main thread communicates with workers via workerInstance.postMessage([...]) and the child communicates back to its parent via top-level postMessage([...]), or to its own "child" workers via subworker.postMessage([...]).

One potential issue I see with implementing this via DA is confusion about why any given distributed actor – as viewed from Swift – cannot communicate with any given other, e.g. "sibling", actor. Instead, the hierarchy described above would have to be respected. Alternatively the DASystem would have to set up a "routing" chain, which could quickly get pretty complex.

Is it acceptable / normal that random DA "A" cannot communicate with (get a reference to?) random DA "B"?

ktoso · September 28, 2022, 12:31pm

Since distributed actors are "just" a "framework to build RPC frameworks", what rules you impose onto the final design is all up to you -- you could build whatever rules make sense for your transport.

A "reply" you could totally handle as usual: the incoming message handler calls executeDistributedTarget, the user function's return value is offered to the result handler you pass to it, and from there you'd invoke the "reply" (postMessage()), so that's all doable. You can also enforce the hierarchy and just throw if an actor tries to message some other distributed actor it can't reach: that's why distributed funcs are always throwing, so you can surface such transport-level errors this way.
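
One way to picture the "just throw" option is a route check that the transport runs before sending. This is a sketch under assumptions: `Endpoint`, `UnreachableActorError` and `WorkerTopology` are hypothetical names, and only direct parent/child edges are treated as reachable, mirroring the Web Worker hierarchy described earlier in the thread.

```swift
/// Hypothetical addressing scheme: the JS main thread plus numbered workers.
enum Endpoint: Hashable {
    case main
    case worker(Int)
}

/// Hypothetical transport-level error surfaced to the caller of a distributed func.
struct UnreachableActorError: Error, Equatable {
    let from: Endpoint
    let to: Endpoint
}

struct WorkerTopology {
    /// Maps each worker to the endpoint that spawned it.
    let parentOf: [Endpoint: Endpoint]

    /// Throws unless `from` and `to` are directly connected as parent and child.
    func checkRoute(from: Endpoint, to: Endpoint) throws {
        let isDirectEdge = parentOf[from] == to || parentOf[to] == from
        guard isDirectEdge else {
            throw UnreachableActorError(from: from, to: to)
        }
    }
}
```

A transport could call `checkRoute` at the start of its remote-call path, so that a "sibling" call fails with a descriptive error instead of silently going nowhere.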

@Max_Desiatov was saying that we probably would need to double check some things in WASM so that Distributed works there, but we have not tried yet. Technically there isn't all that much special sauce there, so maybe it's just something small missing -- but I'm not the right person to dig into WASM sadly. Semantics wise I don't see there being anything preventing such design.

Geordie_J · September 29, 2022, 1:37pm

@Max_Desiatov I implemented a strongly typed Web Worker transport yesterday which is working well with Swift Wasm. So work-wise this is no longer necessary for us, but I would be happy to push this forward as a personal project, because I find it interesting and I think others could also benefit from it.

How do we sync about what would be required to get Distributed building/working for Swift Wasm?

Geordie_J · October 3, 2022, 2:15pm

@kateinoigakukun my understanding is that you’re currently busy working on a 5.7 release of Swift Wasm. Awesome stuff!

If you have some time after that, would you be interested in talking about how we might support DA in future?

ktoso · October 3, 2022, 2:18pm

If you'd need any advice I'm happy to help, I'm also in Tokyo @kateinoigakukun by the way.
I have no idea about the wasm runtime, but happy to assist anyone who would be willing to look into this, thanks in advance! I think it'd be really awesome to be able to use distributed actors with wasm applications.

Geordie_J · October 3, 2022, 4:03pm

Actually, I can see that the 5.7.1 toolchain was released last week. So maybe we can sync sooner than I expected (that said, I am on a boat this week, so the start of next week would be the earliest).

@ktoso the 5.6 toolchain required an experimental distributed flag to be enabled to import Distributed and use DA. Does 5.7 signify a more general / less experimental release?

kateinoigakukun · October 3, 2022, 6:47pm

First of all, thank you @Geordie_J for your use of SwiftWasm (also for your sponsorship to us)

I think Swift on Wasm can support Distributed Actor in theory.

For the distributed actor in SwiftWasm, we already checked libswiftDistributed.a can be built for wasm target and shipped in our toolchain. However, I haven't spent much time testing it yet, sorry.

I took a glance at the Distributed library now, and I found that we forgot to scan swift5_accessible_functions in SwiftRT-WASM.cpp.
Therefore, the Distributed library shipped in our current snapshot toolchain doesn't work at all for now without fixing the runtime, I think.

So I made the first step for the support by fixing the runtime here

I'll try to run a simple application with DA on wasm after merging it.

kateinoigakukun · October 3, 2022, 6:53pm

Yeah, I also think DA with Web Workers is an interesting application for both SwiftWasm and DA. After a few experiments on my side, I will probably have some questions. Thanks in advance!

ktoso · October 3, 2022, 9:29pm

This is a stable feature in Swift 5.7

Geordie_J · October 3, 2022, 9:30pm

Really cool! I’m assuming the updated runtime will require a new toolchain release?

I guess the simplest thing we could do would be to get a distributed actor working with a local “dummy” DA system? I’m hoping once that is working we can implement the rest in the usual, “user land”, way – that’s where I feel I can be the most help.

Thanks to you and the rest of the team for Swift Wasm. It’s really great stuff!

kateinoigakukun · October 4, 2022, 4:10am

Yes, it requires a new snapshot release. It will be released automatically after merging the PR.

I think so

ktoso · October 4, 2022, 4:16am

I don't have much experience with wasm projects, but the simplest DA PoC could perhaps aim for:

- share the same distributed actor type in a "Shared" module between wasm and a server lib
  - we don't have the ability to just share a common protocol just yet, so share a concrete distributed actor for now
- see if we can implement a websocket transport for remote calls
- actor identity perhaps would simply be the ws address, and for the simplest PoC you'd have the wasm app open a connection to the server and advertise the ID of the actor as ws://.../#1234
- the server would be listening for such "hello, here I am" messages

Note that the actor system should be the same type on server/wasm, but it can run in a "wasm client mode" when it is in the client and be the server that does a "bind" when it is the server -- you can check the WWDC sample app for a similar pattern.

I'm more than happy to provide help and guidance once we get to implementing the actor system here, would be awesome to see it come together

Geordie_J · October 11, 2022, 3:09pm

I got pretty far with an implementation for Web Workers via DA with the latest branch of Swift Wasm (thanks @kateinoigakukun!)

What I'm unsure about is how to do the actor IDs properly. What I'd really like is to just say WebWorker.resolve(id: .anyAvailable, system: ...) which is possible with an enum, but it's really hacking the ActorID in a way that surely was not intended.

In general, I don't really understand the idea of MyDistributedActor.resolve(id: myID) if the IDs are assigned by the system. How are we ever supposed to know what the ID is if we're also not supposed to set the ID in MyDistributedActor, but rather, it should be assigned by MyDistributedActor.ActorSystem?

What I will do for now is create an enum with a single case enum WebWorkerActorID { case singleton } and always resolve it. Later I'd like to allow multiple web workers per WebWorker type, which wouldn't be possible like this, but I can solve that later I guess.
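
A minimal sketch of what such an ID type could look like. The Distributed module requires an actor system's ActorID to be Sendable and Hashable, which an enum gets cheaply. The `indexed` case is a hypothetical later extension for multiple workers per type, not something the singleton workaround needs.

```swift
/// Sketch of an actor ID for a one-worker-per-type transport.
enum WebWorkerActorID: Hashable, Sendable {
    /// The only instance of a given worker type.
    case singleton
    /// Hypothetical extension: the n-th worker of a given type.
    case indexed(Int)
}

// Because the ID is Hashable, it can key whatever table the actor system keeps,
// for example a (hypothetical) map from ID to a description of the worker.
var workerNames: [WebWorkerActorID: String] = [.singleton: "audio-worker"]
```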

Gerzer · October 11, 2022, 3:37pm

The ID that you pass to the resolution method should be one that has already been assigned (or that can be immediately assigned to a newly initialized actor instance); the resolution method then returns the actor instance that’s associated with that ID, which may be a proxy if the instance is located on a remote system.

It’s trivial to get the ID of a local actor instance that you’ve already initialized normally, but you’ll have to include some machinery in your transport system to communicate remote actor instance IDs across worker boundaries. There’s no universal method for advertising available remote actor instances that works well for every possible use-case, so this logic is left up to the transport system. Alternatively, you can have a worker spin up an actor instance on-demand when another worker tries to resolve a remote ID (“remote” with respect to the requesting worker, that is) that isn’t yet associated with a concrete actor instance.

Geordie_J · October 11, 2022, 5:40pm

That's kind of the issue I'm facing. I would love to just say: resolve me any actor of this kind, I don't care which one – if there's already an actor of that kind please hand it over, otherwise create one for me please. Importantly, this could be one of one, or one of many (i.e. I would like to allow a cluster of a specified size).

I could do that by arbitrarily making a random ID and expecting the DA System to make me an actor with that ID, but it feels more like a hack than a good architecture. And there would be no way to resolve an existing actor in that way unless I know which arbitrary IDs have already been assigned.

For now I will use the .singleton workaround, because for my personal use case I don't need more than one DA per type. But at this point I don't have a clear idea how I'd make this useful as a more general library, which may allow more than one actor.

edit: I guess I could periodically post the cluster's availability (and IDs) to a receptionist that is running on the "host" (i.e. on the JS main thread). I will think about it a bit more later, but for now will try to get the singleton variant working.
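
The "any available actor of this kind" idea could be sketched as a small registry living on the host side. Everything here is an assumption for illustration: `WorkerPoolRegistry`, the string IDs, and the policy itself (create workers until the pool is full, then hand out existing IDs round-robin) are one possible design, not an existing API.

```swift
/// Hypothetical host-side registry that hands out worker IDs per actor kind.
struct WorkerPoolRegistry {
    let poolSize: Int
    private var idsByKind: [String: [String]] = [:]
    private var nextIndex: [String: Int] = [:]

    init(poolSize: Int) {
        self.poolSize = max(1, poolSize)
    }

    /// Returns an ID for `kind`. `isNew` tells the caller that it still has to
    /// spawn a worker for this ID; otherwise an existing worker is reused.
    mutating func anyAvailable(kind: String) -> (id: String, isNew: Bool) {
        var ids = idsByKind[kind, default: []]
        if ids.count < poolSize {
            let id = "\(kind)-\(ids.count + 1)"
            ids.append(id)
            idsByKind[kind] = ids
            return (id, true)
        }
        // Pool is full: rotate through the known IDs.
        let index = nextIndex[kind, default: 0]
        nextIndex[kind] = (index + 1) % ids.count
        return (ids[index], false)
    }
}
```

With `poolSize` set to 1 this degenerates to the singleton behaviour, so the same call site could serve both the "one of one" and the "one of many" case.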

ktoso · October 13, 2022, 11:01am

Awesome work, really exciting

We caught up on the swift-server Slack quite a bit about this, but I figured the writeup I prepared there of how the create/resolve/remoteCall dance generally works might be useful more widely, so I'm posting it here for future reference as well:

spawning:

I don't know how much flexibility you have in API here, what web workers allow... but it'd be cool if:

let g: Greeter = actorSystem.spawnWebWorker(Greeter.self)

To do this... there's a few things (inside spawnWebWorker impl):

You'd spawn a webworker

keep whatever the API gives you to "send message to that web worker"

make some "connection" object and store it inside the actor system (parent)

<< we ignore the fact that there could be many actors in a worker technically, so let's just hardcode things a bit for now>>

given this simplification... basically make up an ID, say "b-1" (anything works), and store that "connection" as ["b-1": <connection>]

that's what'll allow you to implement remoteCall soon (!)

since remoteCall gets passed "an actor" but it is always remote; and how you'll implement remoteCall is:

actor.id -- aha, that's whom we're trying to send a message to...

guard let knownConnection = myConnections[actor.id] -- aha, we do have a connection to this, so we know "where" to send this message...

we're not done with spawnWebWorker yet though, we want to return a proxy here; "proxies" (remote distributed actor references) are obtained by Greeter.resolve(id: "b-1", using: self)

that'll "just work" since this eventually calls into DistributedActorSystem.resolve(id:as:), which we basically implement as "is this an actor in my local system?" (it is NOT), so we return nil; this will cause Swift to create a proxy object

now you've got a Greeter remote actor reference and its greeter.id is equal to "b-1" (that's how resolve works, it stores the "pointed at" ID inside the returned proxy actor)

we return this, we're done
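
The bookkeeping in the spawning steps above can be sketched without the Distributed module at all. This is a plain-Swift model under assumptions: `Connection`, `WorkerSystemSketch`, `spawnWorker` and `resolveLocal` are hypothetical names, and `Connection` only records messages instead of talking to a real Web Worker. The one detail mirrored from the real API is that returning nil from resolve means "not local, so a proxy should be created".

```swift
/// Stand-in for whatever handle the worker API gives you to post messages.
final class Connection {
    private(set) var sent: [String] = []
    func post(_ message: String) { sent.append(message) }
}

/// Plain-Swift model of the parent-side state; not a DistributedActorSystem.
final class WorkerSystemSketch {
    private var connections: [String: Connection] = [:]
    private var localActors: [String: AnyObject] = [:]
    private var counter = 0

    /// "Spawns" a worker: makes up an ID and remembers the connection under it.
    func spawnWorker() -> String {
        counter += 1
        let id = "b-\(counter)"
        connections[id] = Connection()
        return id
    }

    /// Same idea as resolve(id:as:): nil means the actor is not local.
    func resolveLocal(id: String) -> AnyObject? {
        localActors[id]
    }

    /// What a remoteCall implementation would look up before sending.
    func connection(for id: String) -> Connection? {
        connections[id]
    }
}
```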

Now the time comes to try await greeter.hello(). Since it is a remote reference, this will call into the remoteCall function of the DAS: https://github.com/apple/swift/blob/main/stdlib/public/Distributed/DistributedActorSystem.swift#L193, passing itself as the actor. The remoteCall implementation is as follows:

get the actor.id

find a connection for the actor.id ("b-1") -- aha, we have one, this is the connection we stored before!

generate a UUID for the remote call

serialize the invocation somehow (JSON is good enough)

I recommend making a MessageEnvelope type that'll have callID, recipientID ("b-1", this matters if you supported multiple distributed actors in the same target web worker) and the actual payload info.

send this message envelope to the identified target web worker, we're done!
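
A minimal sketch of the envelope and the sending half of these steps, using Foundation's JSONEncoder. The field names follow the description above; `target`, `payload` and the `encodeCall` helper are assumptions for illustration, and a real remoteCall would fill them from the invocation encoder rather than from a plain argument list.

```swift
import Foundation

/// Envelope as described above: who is called, which call this is, and the payload.
struct MessageEnvelope: Codable, Equatable {
    let callID: UUID
    let recipientID: String
    /// Assumed: some identifier of the distributed func being called.
    let target: String
    /// Assumed: the JSON-encoded arguments of the call.
    let payload: Data
}

/// Hypothetical helper: builds an envelope for one call and returns the bytes
/// that would be handed to the worker connection.
func encodeCall<Args: Encodable>(
    to recipientID: String,
    target: String,
    arguments: Args
) throws -> Data {
    let encoder = JSONEncoder()
    let envelope = MessageEnvelope(
        callID: UUID(),
        recipientID: recipientID,
        target: target,
        payload: try encoder.encode(arguments)
    )
    return try encoder.encode(envelope)
}
```

Keeping the arguments as an opaque `payload` means the envelope can be decoded and routed by `recipientID` before anything needs to know the argument types.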

The recipient process will have to do the inverse:

there's some API to "receive message", so you'll get that.

You know the message is MessageEnvelope so decode that and the remaining bits of the invocation.

get the envelope.recipientID and get the actor instance for it...

So when this process started basically you would have been passed this "b-1" and you'd just remember that... since we're doing that 1:1 worker to actor model...

TBH in this simplified model you don't need the recipientID at all... we just assume it is "that actor" I guess.

In either case, when this child web-worker spawned, it created the Greeter and the actor system must have stored it with strong reference inside it.

get the actor instance; it is "local" here

invoke executeDistributedTarget on it and pass the invocation decoder (that'll be some more preparation but you'll get it)
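
The receiving half can be sketched the same way. This is a simplified model, not the real runtime path: `WorkerInbox`, `ReplyEnvelope`, `DispatchError` and the plain `Greeter` class are hypothetical, and the hand-written `target == "hello"` check stands in for what executeDistributedTarget does in a real actor system. The envelope type is repeated so that the block stands alone.

```swift
import Foundation

struct MessageEnvelope: Codable {
    let callID: UUID
    let recipientID: String
    let target: String
    let payload: Data
}

/// Hypothetical reply shape: echoes the call ID so the caller can match it.
struct ReplyEnvelope: Codable, Equatable {
    let callID: UUID
    let result: String
}

enum DispatchError: Error, Equatable {
    case unknownRecipient(String)
    case unknownTarget(String)
}

/// Plain class standing in for the distributed actor living in the worker.
final class Greeter {
    func hello(name: String) -> String { "Hello, \(name)!" }
}

final class WorkerInbox {
    /// Strong references keep the local actors alive, as described above.
    private var actors: [String: Greeter] = [:]

    func register(_ actor: Greeter, id: String) { actors[id] = actor }

    /// Decodes an incoming message, finds the local actor, invokes the target,
    /// and returns the encoded reply to post back to the parent.
    func handle(_ wire: Data) throws -> Data {
        let decoder = JSONDecoder()
        let envelope = try decoder.decode(MessageEnvelope.self, from: wire)
        guard let actor = actors[envelope.recipientID] else {
            throw DispatchError.unknownRecipient(envelope.recipientID)
        }
        // Stand-in for executeDistributedTarget: manual dispatch on the target name.
        let arguments = try decoder.decode([String].self, from: envelope.payload)
        guard envelope.target == "hello", let name = arguments.first else {
            throw DispatchError.unknownTarget(envelope.target)
        }
        let reply = ReplyEnvelope(callID: envelope.callID, result: actor.hello(name: name))
        return try JSONEncoder().encode(reply)
    }
}
```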

ktoso · October 13, 2022, 11:03am

Yeah, I agree with just doing a simple 1 actor in 1 worker step for now, and I can help design a receptionist later on if you want -- there are some sync patterns we can utilize here.

First let's make sure we have messaging round-trips going though. I'm very excited about this work, thank you for your efforts here!