Terminology questions - behaviors, shells, and possibly reductions

Joseph_Heck · November 11, 2021, 12:12am

I'll caveat this with "Yep, these could be all implementation details". In starting to dig through the source for swift-distributed-actors, there are some things I knew the concepts for (basic actor concept, mailboxes, and messages). The distributed actors code steps it up a bit, and uses some terms and phrasing that I wasn't very familiar with. I could guess and infer quite a bit, but I thought it might be best to ask about the specific terms and how they're inter-related.

I started at ActorSystem, thinking that might be the "top of the tree" for the implementation details (based on its use in the Dining Philosopher's sample) and STYLE_GUIDE.md since it seemed to have a bit of glossary included within it.

From within the style guide, there's a reference to a distributed actor processing messages from its mailbox being called 'reduction' and talk about how that applies a behavior. So an Actor's behavior seems like a pretty key concept. So generally, what is a behavior, and how does it relate to actors?

From the code in Behaviors.swift, the abstract for it reads:

A _Behavior is what executes when an Actor handles messages.

Is a behavior something that a developer would provide, or is it more of an implementation detail of how distributed actors manages its own data models and execution? I suspect it's related - but just to verify - the 'shell' concept is for the system actors to react to messages from other distributed actors and defines some concrete ways that it handles the local ActorSystems state, is that correct? That the implementation of the local distributed system's internal actors handling messages was a 'shell'?

ActorSystem has a public method park that calls down into a transport to do its work. What's the implication of parking an actor system - or more generally what's that intended to do? In the sample, one of the ActorSystems is parked with what looks like a deadline concept - is that implying "run as you need to for at least X amount of time?" or am I misinterpreting that? Are there other implications to "parking" an ActorSystem?

The ActorSystem has a systemProvider and userProvider with a TODO note to converge into a single tree (which is noted to be different from what Akka does), and elsewhere there are references to trees of actors. What does organizing the actors into trees enable, and is that a pattern that someone writing a distributed actor should know (and potentially replicate), or is that an internal detail of enabling 'watchdogs'/'supervisors' within the swift distributed actor world? (I'm guessing purely an internal detail, given they're both private - but I was curious about the history there.) I also noted that a few of the tests referenced an actor path string, which I suspected might be a string representation of that tree concept in order to identify a specific actor.

The Cluster concept looks like it's a representation of the state of a set of distributed actors that are working together, exposed primarily through the ClusterSettings - after which it does its thing according to the implementations. Is that something with exposed hooks anywhere, or intended to be interacted with from a developer's implemented distributed actor? The sample was interesting in that it interrogated the various systems until it was happy with its status before continuing execution.

Scanning multiple source files, I saw several references to plugins - which looked like some (as yet unclear) places the code was explicitly written to be extended with implementation specific behavior - but I wasn't sure what was intended to be influenced with the plugins. What's the current thinking there, or is that a bit of legacy experimentation that's better left alone/ignored for now?

Last question - what does SACT stand for? I'm guessing "Swift Actors" but wanted to ask, since it's used as a fairly common prefix.

ktoso · November 11, 2021, 2:00am

That's exactly the case. All these things you mention are internals that predate Swift Concurrency, and we are in the process of porting them over to the language provided (distributed) actors.

Current state of the Distributed Actors (Cluster) internals

Short version: What you are looking at is runtime internals long predating Swift Concurrency, and it is effectively a complete "library only" actor runtime. We will be removing most of these internals, as they have either become obsolete in the face of the runtime now offered by the Concurrency library, or they are being replaced by the work-in-progress distributed actor feature.

So, why did we share this library early, while there still are those pieces of the old infrastructure in there...?

After careful consideration we decided to release this project as work-in-progress to help the public review process of the distributed actor feature (as we mention in the announcement blog). During the initial pitch of the distributed actor language feature it became clear to us that it would be very hard to explain and justify exact details of the language design without having a reference implementation associated with it, so we're able to explain exactly why and how those features are used.

Remember that the language feature basically "does nothing", it just is a bunch of hooks for runtimes such as this one to do their thing. So it can be pretty hard to grasp exactly how it's supposed to work, without having a reference implementation at hand.

Thankfully, we have had this runtime for quite a while, and it is a pretty complete distributed actor system - only pending a full "rebase" onto the language features, some of which are still evolving. Since this project is server-side focused, and does not carry any specific internal or product related secrets, we were able to open source early, allowing the community to participate in the design and evolution more actively -- avoiding weird situations where we would have to push against Swift Evolution feedback, without being able to specifically articulate why exactly. With this lib being open, we can point at specific implementations and how the language feature is used "in reality".

Summary

So, should you care about: _Behavior, _ActorRef, shells etc...? Not really, they are both internal and going away as soon as we're able to express everything we need using the Swift provided language features.

Having said that, it is an interesting case study of how actors would have looked if they were just a library. We did multiple implementation attempts, from functional Behaviors (the most true to the actor model described in computer science literature, however also quite verbose), to actors driven by source generation. I want to mention this because during the feature review it was mentioned a few times that we could "just" implement distributed actors as a library, and the reality is that we tried, many times, and the results were not satisfactory, leading to the current distributed actor Swift Evolution proposals.

We can nevertheless discuss the internal design, if you're curious. I'll post a follow up post answering your questions right away, but for anyone else reading -- none of these matter in order to understand or use distributed actors as we are proposing their way forward.

ktoso · November 11, 2021, 2:33am

Discussion of previous to-be-removed internals of the distributed runtime

I'll put this discussion into a "spoiler" block, because none of this really matters for where we're going with the distributed actor feature, and all the mentioned types and things you asked about are either internal, private or _Underscored and to be removed.

Appendix: Discussion of existing, to-be-removed distributed actor runtime *internals*:

You're right that ActorSystem is the "top" type. In the distributed actor proposals we call this the ActorTransport, but I think the "actor system" name will make a comeback, because "transport" was very confusing to some reviewers: it is more than that.

In upcoming swift evolution proposals I believe we'll rename the protocol ActorTransport to protocol DistributedActorSystem. The library's ActorSystem is an implementation of this protocol (from the _Distributed library in Swift).

Behaviors originate from the original definitions of actors in literature. In literature, an actor can only do three things:

create more actors
send messages
change its behavior in reaction to an incoming message

The "change its behavior" in Swift Actors is simply this:

actor Counter { 
  var counter: Int = 0
  func add() -> Int {
    counter += 1
    return counter
  }
}

It means that the actor is stateful.

The behavior model takes this very explicitly, and you literally "become a new behavior" every time you receive a message:

func behavior(counter: Int) -> _Behavior<Message> { 
  _Behavior.receiveMessage { message in 
    // for every message we receive, we just add 1 to the counter and become that behavior
    return behavior(counter: counter + 1)
  }
}
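
To make the "become" step concrete without depending on the library, here is a minimal self-contained sketch. ToyBehavior, CounterMessage, and counting(_:) are hypothetical names invented for this illustration and are not the library's _Behavior API; the callback in .report merely stands in for a replyTo reference.

```swift
// Toy stand-in for illustration only; this is NOT the library's _Behavior type.
struct ToyBehavior<Message> {
    // Handling one message yields the behavior to use for the next message.
    let receive: (Message) -> ToyBehavior<Message>
}

enum CounterMessage {
    case increment
    case report((Int) -> Void) // callback stands in for a replyTo reference
}

// The counter lives only in the function argument; there is no mutable state.
func counting(_ counter: Int) -> ToyBehavior<CounterMessage> {
    ToyBehavior { message in
        switch message {
        case .increment:
            return counting(counter + 1) // "become" the next behavior
        case .report(let reply):
            reply(counter)
            return counting(counter) // stay in the same state
        }
    }
}

// A trivial stand-in for a mailbox run loop.
var current = counting(0)
let inbox: [CounterMessage] = [
    .increment,
    .increment,
    .report({ print("counter is \($0)") }),
]
for message in inbox {
    current = current.receive(message)
}
// prints "counter is 2"
```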

So behavior actors are strictly state machines, while swift actors are more normal reference types where you just modify your state in the instance, and that is the "behavior change". Behaviors are pretty powerful in how they can be composed, but that's a whole different topic. You can imagine having a "door actor" that becomes a "closed behavior", rejecting attempts to close it, and then becoming the "open behavior" -- this is like swapping the implementation of functions in a Swift Actor, which we can't do, but instead would model it by doing a switch self.state { case .closed: ... case .open: ... } so we're able to express the same things, but the behavior model was really pushing you towards "think in state machines", though it becomes pretty verbose pretty quickly.
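
As a rough sketch of the door example in a language-level actor, the switch over stored state could look like the following. The Door type, its State enum, and the Bool return values are assumptions made for this illustration; top-level await assumes a recent Swift toolchain (5.7 or later).

```swift
actor Door {
    enum State { case open, closed }

    private var state: State = .closed

    /// Returns true if the door changed state, false if the request was rejected.
    func close() -> Bool {
        switch state {
        case .closed:
            return false // already closed: reject the attempt
        case .open:
            state = .closed
            return true
        }
    }

    /// Returns true if the door changed state, false if the request was rejected.
    func open() -> Bool {
        switch state {
        case .open:
            return false // already open: reject the attempt
        case .closed:
            state = .open
            return true
        }
    }
}

let door = Door()
let rejected = await door.close() // door starts closed, so this is rejected
let opened = await door.open()
print(rejected, opened) // prints "false true"
```

The same state machine is expressed, but the "current behavior" is a stored enum value rather than a swapped-out function.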

You're right that we have a few "shells"; they are just a pattern though, compensating for lacking language features (which we have solved now, thanks to distributed actor).

You'll notice that behaviors must define a message type they work with. Usually this is some enum like this:

enum Message { 
  case add(Int)
  case getStatus(replyTo: ...) // since it's messages, we can't just "return from a func"
}

and the actor is implemented in terms of receiving messages... so we need to write code like this:

.receive { message in 
  switch message { 
  case .add(let amount): <something>.add(amount: amount)
  case .getStatus(let replyTo): replyTo.tell(<something>.getStatus())
  }
}

so that logic has to be somewhere... and as you can see it's pretty boring "make messages into function calls". But, thankfully this is what actors built into the language already do for us! Instead of phrasing everything as such switch, we can "just call add(amount:)"!

The "...Shell" concept is the type that contains the actor behavior and the <something> that it delegates the calls into. It really is mostly boilerplate, compensating for the lack of actor isolation in a library only model.

All this is unnecessary now thanks to the introduction of actor and distributed actor and actor isolation in the language!
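
For comparison, the two message cases above can be sketched as plain methods on a language-level actor. The Accumulator name and the Int status value are assumptions for this illustration; top-level await assumes Swift 5.7 or later.

```swift
actor Accumulator {
    private var total = 0

    // Replaces `case add(Int)`: callers simply call the method.
    func add(amount: Int) {
        total += amount
    }

    // Replaces `case getStatus(replyTo:)`: the reply is just the return value.
    func getStatus() -> Int {
        total
    }
}

let accumulator = Accumulator()
await accumulator.add(amount: 2)
await accumulator.add(amount: 3)
let status = await accumulator.getStatus()
print(status) // prints "5"
```

No message enum, no switch, and no replyTo plumbing is needed; actor isolation takes care of serializing the calls.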

We just have not yet gotten around rewriting the big complex actors into this new world, since we're missing a few language features still.

Summing up:

| lib. only / old | language integrated actors |
| --- | --- |
| `_Behavior`, to hold dispatch logic | logic inside func in an actor |
| `_ActorRef<Message>`, to hide state from outside callers to enable thread safety | normal variables storing an actor; actor-isolation enforces thread safety |

There are a few more concepts but generally, the theme is that all kinds of "type dance" was necessary to hide and protect state from unsafe concurrent access, while now we're able to rely on the compiler to do this for us on actor types instead.

Parking the system is very "boring" and basically just blocks the calling thread until the system has been shutdown(). It is a way to write code like:

main () ... { 
  let system = ...
  // kick off many actors
  // eventually call shutdown()
  try system.park()

  // ok, system is shut down, proceed with shutting down the entire process
}

Needless to say, the blocking version should go away entirely, because we now have async/await and this would now become await system.park().
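
Purely as an illustration of what a non-blocking park could look like, here is a toy sketch built on a checked continuation. ToySystem and its members are hypothetical and are not the library's ActorSystem; top-level await assumes Swift 5.7 or later.

```swift
// Hypothetical toy type; NOT the library's ActorSystem.
actor ToySystem {
    private var isShutDown = false
    private var waiters: [CheckedContinuation<Void, Never>] = []

    /// Marks the system as shut down and wakes up everyone suspended in park().
    func shutdown() {
        isShutDown = true
        for waiter in waiters {
            waiter.resume()
        }
        waiters.removeAll()
    }

    /// Suspends the caller (without blocking a thread) until shutdown() has been called.
    func park() async {
        if isShutDown { return }
        await withCheckedContinuation { continuation in
            waiters.append(continuation)
        }
    }
}

let system = ToySystem()
// Kick off some work that eventually shuts the system down.
Task { await system.shutdown() }
await system.park()
print("system is shut down")
```

The caller suspends instead of blocking its thread, which is the main difference from the blocking park() shown above.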

Kind of, though not actors but nodes. You'll notice the Cluster is basically a special actor that handles all the other nodes connecting, disconnecting, ensuring we have connections alive and can talk to all other actors. It powers the system.cluster.events eventstream which provides you with events like joining/up/down about other nodes.

This piece of the design will remain unchanged in the distributed actor world. We offer the ClusterControl type which you can issue commands to, like cluster.join(<some node>) etc. So it is the type allowing end users to interact with the cluster directly on a "node level" rather than specific actors (which the receptionist is for).

We have one "plugin": the ActorSingletonPlugin. Plugins are something that starts as the system starts, and automatically can do some tasks.

The actor singleton is able to best-effort (long story about the details which we'll document) guarantee a single instance of a specific actor type is run in the entire cluster. If the node hosting the singleton dies, it is started on a different node, and other nodes automatically realize that the singleton is now hosted in a different location.

Plugins in general are just "start during system startup" and "stop during system shutdown". I'm not sure we'll need these anymore as we move towards the distributed actor world, we'll see.

Yes, Swift Actors -- we just needed some nice protocol name to use for our identities, and so sact:// was born. There isn't much meaning to it other than "our custom wire protocol that we use for the actor messaging, and wanted to have some short name for it." sdact:// didn't sound so nice.

Actor identities in the distributed actor system are URI-like "addresses" and therefore they also include the protocol over which they are exposed; in this case, they're all communicating over our custom cluster protocol (not stable yet), and we just called it "sact", not much more to read into it. If people wanted to use a similar identity scheme you could imagine ws:// or https:// identified actors, so it's the same idea.
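
To illustrate the "URI-like" shape, the sketch below takes such an address apart with Foundation's URLComponents. The concrete address string (system name, host, port, and path) is made up for this example; the exact layout the library uses is an assumption here.

```swift
import Foundation

// Hypothetical address, invented for illustration only.
let address = "sact://first@127.0.0.1:7337/user/greeter"

if let parts = URLComponents(string: address) {
    print(parts.scheme ?? "-")  // the protocol the actor is exposed over
    print(parts.user ?? "-")    // here used as a system name (assumption)
    print(parts.host ?? "-")    // node host
    print(parts.port ?? -1)     // node port
    print(parts.path)           // path identifying the actor on that node
}
```

Swapping the scheme for ws or https would parse the same way, which is the point of choosing a URI-like form.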

Hope this answers your questions!

But at the same time, don't worry too much about these, the focus truly should be on distributed actor and the ActorSystem being an implementation of a transport / distributed actor system.

Joseph_Heck · November 11, 2021, 3:20am

Thank you, I appreciate the time you took to explain this, especially as most of it is backwards looking.

It's brilliant for me to piece together the background/history lessons in this as it steps forward into distributed actors. The matching from the historical library version to the new language features made a lot of the pieces make MUCH more sense, and helped me clarify that ActorTransport isn't just the transport component (which I totally fell for earlier), but instead a more general thing that encompasses serialization, cluster integration, as well as the message handling details.

ktoso · November 11, 2021, 3:25am

Glad you're digging through and piecing things together! Thanks for the questions and I hope to hear from you during Swift Evolution proposals of these features. We'll be cutting it up into small step-by-step proposals very soon, in similar spirit to how Swift Concurrency was multiple proposals centered around one "big" feature.

Right! Now in retrospect I realize our naming there in the initial pitch must have been quite confusing.

I originally thought to avoid the phrase "actor system" since even the local actors are in colloquial terms an actor system -- it's simply a group of actors doing things together. You're not the only reviewer who got misled by this, so I think moving forward we'll be calling this component protocol DistributedActorSystem since that's what it is responsible for really, just the distributed actors.

Indeed it is the piece that contains transport, serialization, and anything it needs to resolve and manage those actors for the purpose of distribution.

hassila · November 11, 2021, 7:34am

+1 for that - I think many people (myself included) make the assumption that it would be a pure transport layer - naming does help in understanding (cf. the discussion about 'Date'...).