SE-0344: Distributed Actor Runtime

Joe_Groff · February 22, 2022, 8:14pm

The review of SE-0344: Distributed Actor Runtime begins now and runs through March 8, 2022.

Reviews are an important part of the Swift evolution process. All review feedback should either be on this forum thread or, if you would like to keep your feedback private, directly to the review manager by DM or email. When messaging the review manager directly, please keep the proposal link at the top of the message and the evolution identifier in the subject line.

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer in your review:

What is your evaluation of the proposal?
Is the problem being addressed significant enough to warrant a change to Swift?
Does this proposal fit well with the feel and direction of Swift?
If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available at swift-evolution/process.md at master · apple/swift-evolution · GitHub.

Thanks for helping review this proposal!

Joe Groff
Review Manager

John_McCall · February 24, 2022, 6:56pm

4 posts were split to a new topic: Specifying the controlling actor system with distributed actors

icanzilb · February 23, 2022, 1:41pm

Very excited to see things moving forward.

This is kind of minutiae, but I hoped the generics in the APIs could be made a little less ambiguous. For example, in "the full listing of the DistributedActorSystem protocol":

func remoteCall<Act, Err, Res>(...)

Act, Err, and Res aren't (common enough) abbreviations for what they represent here, and they aren't somewhat arbitrary assignments like T, S, etc. I'm not sure whether there is a higher-priority style guide for these than Swift.org - API Design Guidelines (which advises against abbreviations), but the APIs could be easier to discover if the generics were spelled out like "Actor", "Result", etc.

Just something that made an impression on me while reading through.

xedin · February 23, 2022, 6:47pm

Absolutely! Opened [SE-0344] Adjust names of Actor, Error, and Result generic parameters… by xedin · Pull Request #1559 · apple/swift-evolution · GitHub to address the issue.

ktoso · February 23, 2022, 9:35pm

Not sure I like “Actor”, “Error” and “Result” for generic names because all those names are actual types we’ve now shadowed… there is Error the protocol, Actor the protocol and Result the struct as well…
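
A minimal sketch of the shadowing concern. The functions `describe` and `describeFailure` and the type `Timeout` are hypothetical names made up for illustration; they are not part of the proposal.

```swift
// Hypothetical function: the generic parameter is deliberately named `Error`.
func describe<Error>(_ value: Error) -> String {
    // Inside this function `Error` means the generic parameter, not the
    // standard library protocol, so it can be bound to any type at all.
    return "\(Error.self)"
}

// With a non-colliding name such as `Failure`, `Error` still means the protocol.
func describeFailure<Failure: Error>(_ failure: Failure) -> String {
    return "\(Failure.self)"
}

// Hypothetical error type used only for this demonstration.
struct Timeout: Error {}

let shadowed = describe(42)                 // `Error` is bound to Int here
let unshadowed = describeFailure(Timeout()) // `Failure` is bound to Timeout
print(shadowed, unshadowed)
```

Within `describe`, the protocol would only be reachable through its qualified name, `Swift.Error`, which is the kind of friction the comment above seems to be pointing at.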

hisekaldma · February 23, 2022, 10:21pm

Perhaps we can take inspiration from the Result type and call the error type Failure and the return type Success?

Regardless, I think the precedent is pretty strong for not using abbreviations. Besides the naming guidelines, there’s this.

ktoso · February 23, 2022, 10:24pm

Yeah that might be better.

xedin · February 23, 2022, 11:28pm

Adjusted via [SE-0344] Rename Error, Result generic parameters to Failure, Success by xedin · Pull Request #1560 · apple/swift-evolution · GitHub

John_McCall · February 24, 2022, 6:57pm

2 posts were merged into an existing topic: Specifying the controlling actor system with distributed actors

masters3d · February 26, 2022, 2:50am

I know that regular actors are bound to ABI on Apple platforms. Does ABI also apply to the distributed actor runtime? It seems to me that one of the primary uses of the distributed actor runtime is in server applications, so I am wondering whether in those cases we have to worry about changes that can cause ABI changes.

ktoso · February 28, 2022, 1:00am

We are looking into this right now actually. I don't have details to share yet but we should have a better idea in the coming days.

In general these will be bound by ABI eventually, but we're looking into trying to get us some more runway before declaring them ABI stable, given how many moving pieces (system implementations + language having to align perfectly) are involved.

ktoso · February 28, 2022, 10:11pm

Relating some threads:

This thread has relevant discussion to the review: Mangled names in SE-0344: Distributed Actor Runtime

xedin · March 2, 2022, 7:36pm

Regarding the use of mangling, this is something I found myself thinking about recently. Since distributed thunks are denoted by a special symbol, we could augment the mangler and demangler to support a custom scheme, i.e. drop the distinction between class/struct/enum, remove parameter/result types, etc.

ktoso · March 2, 2022, 9:42pm

That's very true, we have a marker that a thing is a dist thunk after all... Primarily we'd want to drop the struct/class distinction. We still want the types there.

Doing the full distributed accessor lookup based just on the full name, and then selecting the right one based on the decoded param types, would be separate future work though, I think...

Hexley · March 3, 2022, 3:32pm

Exceptional work so far! I cannot wait to have distributed actors in the language. But I really want to dive in here on the ABI front.

Runtime ABI Requirement (Mangled Names)

TL;DR - Synthesized CodingKey-like protocol but for explicit method lookup on distributed actor types, regardless of version.

Problem Space

It seems we're looking at a few particular issues here:
1.) Usage of API that may bind a distributed actor implementation to an explicit ABI, and;
2.) Future desire for Versioning across Nodes/Clusters/DistributedActorSystem

Let's address these in reverse order, it is a lot but I promise we'll get somewhere.

2.)
As someone with an Xserve cluster still in active use, running OS X 10.11 El Capitan, I can say beyond a shadow of a doubt that Versioning is a key part of making Server/Client functionality future-proof. While thinking about it now serves the valuable purpose of planning ahead, it is beyond the scope of this proposal, so I suggest we not make an ABI decision based upon it.
With explicit ABI-specific Mangled Name usage (anywhere in distributed actors) we actually hurt a potential future where we want true Versioning. (@ktoso, I've actually been developing a Versioning & Version Migration system to share with you privately, for a potential future proposal after the distributed actor reviews are completed. My developed approach addresses Versioning for distributed actors and Codable. -- Two Swift birds, one protocol-based stone.)

1.)
Locking the implementation to an explicit ABI by Mangled Name is antithetical to the core tenets of Swift. These may be prudent or even necessary measures, but ABI-specific Mangled Names and their Stringly-Typed nature are to be avoided (where possible) in Swift. [Side Note: (Probably best for @xedin) - Should we consider more elaborate diagnostics that warn on usage of ABI-specific API in future?]

But that's actually Safety at the core of the language. It gives the compiler insight into developer intent, removes Stringly-Typed Dynamic Dispatch at Runtime, and moves compatibility issues out of the language and into the developer's source code. Will we be able to remove them all? Probably not. (Generics, I'm looking at you.) Can we remove them for distributed func lookup? Entirely.

Potential Solution

Codable uses a protocol, CodingKey, to essentially represent KeyPaths in Decoding. While this is not a one-to-one mapping, the reason for its existence is clear: how can you decode a blob of data if you don't have somewhere to start? A CodingKey is necessary for Type-Safe Decoding. Likewise, distributed actors need a key of some kind to look up the proper distributed func, and since (for a myriad of reasons, including the above) we should avoid the use of Mangled Names, we'll need a new type of key.

Let's call it DistributedMethodLookupKey. We'll need to make it at least Sendable, and the distributed actor will need its synthesis implemented for each distributed func:

public protocol DistributedMethodLookupKey /* Sendable */ {
    /// UInt8 gets us up to 256 methods we can look up
    /// plenty for our example
    var uintValue: UInt8 { get }
    init?(uintValue: UInt8)
}

Yielding:

distributed actor SimpleExample {
    
    // ~~~ Compiler Synthesized ~~~
    enum DistributedMethods: UInt8, DistributedMethodLookupKey {
        case accessible
    }
    
    // Internal Function Call Machinery
    func receivedDistributedCall(_ lookup: DistributedMethods) {
        switch lookup {
            case .accessible:
                self.accessible()
        }
    }
    // ~~~ Compiler Synthesized ~~~
    
    func notDistributed() {
        // ...
    }
    
    distributed func accessible() {
        // ...
    }
}
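
The idea above can be approximated in plain Swift today. This is only a sketch under stated assumptions: a regular class (`SimpleExampleModel`, a hypothetical name) stands in for the distributed actor, the "synthesized" parts are written by hand, and the `wireValue` parameter and `accessibleCallCount` counter are invented for the demonstration.

```swift
protocol DistributedMethodLookupKey {
    var uintValue: UInt8 { get }
    init?(uintValue: UInt8)
}

// Assumption: a plain class stands in for the distributed actor.
final class SimpleExampleModel {
    // Hand-written stand-in for what the compiler would synthesize.
    enum DistributedMethods: UInt8, DistributedMethodLookupKey {
        case accessible = 0

        var uintValue: UInt8 { rawValue }
        init?(uintValue: UInt8) { self.init(rawValue: uintValue) }
    }

    // Hypothetical counter so the effect of a call can be observed.
    private(set) var accessibleCallCount = 0

    func accessible() { accessibleCallCount += 1 }
    func notDistributed() {}

    // Maps a received wire value to a method; returns false for unknown keys.
    func receivedDistributedCall(wireValue: UInt8) -> Bool {
        guard let key = DistributedMethods(uintValue: wireValue) else {
            return false
        }
        switch key {
        case .accessible:
            accessible()
        }
        return true
    }
}

let model = SimpleExampleModel()
let handled = model.receivedDistributedCall(wireValue: 0)
let rejected = model.receivedDistributedCall(wireValue: 7)
print(handled, rejected, model.accessibleCallCount)
```

Because the `switch` is exhaustive over the enum, adding a new case without handling it becomes a compile-time error rather than a failed string lookup at runtime.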

We can continue down this rabbit-hole, dealing with computed-properties, known-arguments and even known-return-types:

distributed actor ContrivedExample {
    
    // ~~~ START Compiler Synthesized ~~~
    enum DistributedMethods: UInt8, DistributedMethodLookupKey {
        case accessible = 0
        case accessibleWithKnown_argumentType // 1
        case accessibleWithMultipleKnown_argumentType1_argumentType2 // 2
        case accessibleWithMultipleKnownReturnTypes // 3
        case computed // 4
    }
    
    // Internal Function Call Machinery
    func receivedDistributedCall(_ lookup: DistributedMethods, arguments data: UnsafeRawPointer? = nil) -> AggregateOutputTypes? {
        switch lookup {
            case .accessible:
                /// No `arguments`, so never use them
                self.accessible()
                return nil
                
            case .accessibleWithKnown_argumentType:
                // Call Deserialization/Decode on `arguments data` to vend back instance of type
                guard let sentBool = try unpack(data, toType: Bool.self) else { fatalError("FATAL or THROWS!") }
                self.accessibleWithKnown(argumentType: sentBool)
                return nil
                
            case .accessibleWithMultipleKnown_argumentType1_argumentType2:
                // Call Deserialization/Decode on `arguments data` to vend back instance of type
                guard let (sentString, sentDouble) = try unpack(data, toTypes: [String.self, Double.self]) else {
                    fatalError("FATAL or THROWS!")
                }
                self.accessibleWithMultipleKnown(argumentType1: sentString, argumentType2: sentDouble)
                return nil
                
            case .accessibleWithMultipleKnownReturnTypes:
                let functionCallResults = self.accessibleWithMultipleKnownReturnTypes()
                let output = AggregateOutputTypes(functionCallResults)
                return output
                
            case .computed:
                return AggregateOutputTypes(self.computed)
        }
    }
    
    struct AggregateOutputTypes /* Codable, Sendable */ {
        // Include the called method so the `Recipient` knows what this `returns` from
        let methodCalled: DistributedMethods
        
        // Accumulate all known output types
        // Could be any number of instances of these types (or we could explicitly count them)
        // Declared `var` so that each initializer can overwrite the empty defaults
        var knownType1: [String] = []
        var knownType2: [Int16] = []
        var knownType3: [UInt32] = []
        var knownType4: [Int64] = []
        
        // Synthesized Initializers
        // .accessibleWithMultipleKnownReturnTypes
        init(_ r0: (Int16, UInt32, Int64)) {
            self.methodCalled = .accessibleWithMultipleKnownReturnTypes
            self.knownType2 = [r0.0]
            self.knownType3 = [r0.1]
            self.knownType4 = [r0.2]
        }
        
        // .computed
        init(_ r0: String) {
            self.methodCalled = .computed
            self.knownType1 = [r0]
        }
    }
    // ~~~ END Compiler Synthesized ~~~
    
    
    distributed var computed: String { "" }
    
    distributed func accessible() { ... }
    distributed func accessibleWithKnown(argumentType: Bool) { ... }
    distributed func accessibleWithMultipleKnown(argumentType1: String, argumentType2: Double) { ... }
    distributed func accessibleWithMultipleKnownReturnTypes() -> (Int16, UInt32, Int64) { ... }
    
    func notDistributed() { ... }
}

Through this we've had the compiler unwind all of the possible known types of both our input and output arguments, eliminated all Stringly-Typed ABI-specific Mangled Names, and put distributed actors in line to benefit from the various improvements coming to the very similar Codable APIs, including Versioning and Delta-Updates/Diffing, which would be a boon to distributed actors. This synthesis even removes the recording of a number of Existentials and Generics, as it allows for the synthesized code to hand back those same Existential and Generic types.

If you've made it this far, congratulations and my condolences.

This has been in my head for weeks, so I hope everyone gets some value out of this. It isn't my intention to slow down work on distributed actors (to the contrary, I've been dying to really dive in since "Scale By The Bay"). We really, REALLY, shouldn't allow ABI-specific Mangled Names anywhere we can avoid them.

I didn't even touch on multi-year later broken APIs or the potential Security concerns about slinging around what Objective-C would call Selectors and the 37 CVEs of XPC's dynamic decoding. I'm happy to discuss those, if the forum thinks it's worth wading into, but for now I'll stop here.

It is my contention that we need to break away from any ABI usage, and to make distributed actors more like Codable: not in requiring it for Serialization, but rather in relying on more synthesis and staying at as high an abstraction level as possible.

If we go the more Codable route, we can even apply Swift's concept of progressive disclosure to future customization points in distributed actors, such as Codable's synthesized-by-default approach, which still allows power users to adapt the tooling to their needs.

Certainly, there is one issue with Codable that this approach would replicate. For lack of a better explanation, I've dubbed it The Needlepoint Problem; where a piece of code has function calls across a trace-only boundary. In the case of Codable it calls encode(_:) and moves back and forth between compiled language functions, synthesized data types and developer source code. Synthesizing DistributedMethods would inherit that minor issue.

xedin · March 3, 2022, 9:40pm

I think I understand what you are trying to say here, but consider that the "runtime accessible functions" mechanism that we have added to the runtime makes it possible to look up a callable function based on some name; it doesn't have to be a mangled name. We chose a mangled representation of a distributed target name because it has several benefits: it identifies targets unambiguously, has all of the information the remote side would need to reconstruct the generic environment and structure of the method, and accessor synthesis can derive it statically while generating code for distributed thunks. As I mentioned in my previous message, since we use a special symbol to encode names, it should be possible to adjust the mangling and drop all of the unnecessary information, or possibly even add new information to it.
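
Purely as a conceptual model of "look up a callable by some name" (this is not how the Swift runtime implements accessible functions), a name-keyed registry might look like the sketch below. `SketchFunctionRegistry`, its methods, and the string-only argument list are all hypothetical simplifications.

```swift
// Hypothetical registry: maps an arbitrary name to a callable.
struct SketchFunctionRegistry {
    private var handlers: [String: ([String]) -> String] = [:]

    // The name can be any string the two sides agree on, mangled or not.
    mutating func register(name: String, handler: @escaping ([String]) -> String) {
        handlers[name] = handler
    }

    // Returns nil when no function is registered under `name`.
    func invoke(name: String, arguments: [String]) -> String? {
        return handlers[name]?(arguments)
    }
}

var registry = SketchFunctionRegistry()
registry.register(name: "Greeter.hello(name:)") { arguments in
    let who = arguments.first ?? "stranger"
    return "Hello, \(who)!"
}

let reply = registry.invoke(name: "Greeter.hello(name:)", arguments: ["Caplin"])
let missing = registry.invoke(name: "Greeter.unknown()", arguments: [])
print(reply ?? "nil", missing ?? "nil")
```

A plain name like this cannot distinguish overloads or carry generic information, which is one of the benefits attributed to the mangled representation above.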

The possibility of versioning is something that we have discussed extensively with @ktoso while working on both the isolation proposal and this one. The difference between actors and, e.g., services is very evident: serialization of a service is effectively a standalone request/response API and could be versioned separately, but in the case of actors it seems the whole logical container (the distributed actor) would have to be versioned somehow to achieve similar results, because the request/response API is encapsulated in a logical unit. There don't seem to be any universal answers to this issue in existing implementations either; everybody tries to provide their own solution.

bobergj · March 8, 2022, 9:51am

Thanks ktoso for the in-depth answers in the mangled names thread. Now, here's my shot at an actual review.

What is your evaluation of the proposal?

Minus 0.5
Not because the feature isn't warranted, but because:
(1) I believe RemoteCallTarget mangledName property should not be public.
Instead I would suggest RemoteCallTarget be defined in the vein of:

public struct RemoteCallTarget {
   func withIdentifierBytes(_ body: (UnsafeBufferPointer<UInt8>) -> Void)
}

where the byte representation, which would contain the bytes of the mangled name, should perhaps have a header containing a version identifier, to allow for future evolution.
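
One way to read this suggestion, as a rough sketch only: `OpaqueCallTarget`, the single-byte version header, and the generic return type of `withIdentifierBytes` are assumptions made for illustration and differ from the signature given above.

```swift
// Hypothetical stand-in for a RemoteCallTarget that hides the mangled name.
struct OpaqueCallTarget {
    // Assumption: a one-byte version header precedes the identifier bytes.
    static let currentVersion: UInt8 = 1

    private let bytes: [UInt8]

    init(mangledName: String) {
        bytes = [Self.currentVersion] + Array(mangledName.utf8)
    }

    // Exposes the identifier only as opaque bytes, scoped to the closure.
    func withIdentifierBytes<R>(
        _ body: (UnsafeBufferPointer<UInt8>) throws -> R
    ) rethrows -> R {
        return try bytes.withUnsafeBufferPointer(body)
    }
}

// The string is a placeholder, not a real mangled name.
let target = OpaqueCallTarget(mangledName: "placeholder-name")

// An actor system would copy the bytes into its own wire format.
let wireBytes = target.withIdentifierBytes { Array($0) }
print(wireBytes.count, wireBytes[0])
```

The system implementation never sees a `String`, so the encoding behind the header could change between versions without altering the public API.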

(2) I think DistributedActorSystem and related protocols should not be in the stdlib module, but rather in a separate module, to not pollute the auto-imported namespace.

(3) I feel the proposal may be trying to do too much, and adds too much public API for the first revision of this feature - rather than focus on the core use-case of Swift to Swift distributed actors.
Specifically, my understanding is that DistributedTargetInvocationEncoder/Decoder was designed with distributed actors implemented in another language than Swift in mind (at least this was alluded to in the mangled names thread), and this may have complicated the design of these protocols. At the same time, in this iteration of the proposal it's not practical to implement a distributed actor peer in non-Swift, because there's nothing useful that can be done with mangledName in such an implementation - and that's the only machine-interpretable function identifier provided for the invocation.

Proposed simplification (warning, this is really a shot from the hip):
To successfully compile code that calls a distributed actor method, the call must type check against the locally available version of the distributed actor function definition. Then, rather than recording the generic substitution, return type etc and letting the DistributedTargetInvocationEncoder encode all of that, could the overload be resolved by the compiler at the caller end and encoded as part of the RemoteCallTarget identifier?
The compiler already "knows" how to encode these generic substitution and return types (the ABI mangled names).
DistributedTargetInvocationEncoder would then have just the method

mutating func recordArgument<Argument: SerializationRequirement>(_ argument: Argument) throws

which is required to encode user-defined argument types.
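
To make the simplified shape concrete, here is a sketch of an encoder with only that method. `SketchInvocationEncoder` is a hypothetical type that does not conform to the proposal's protocol, and fixing the serialization requirement to `Codable` with JSON is an assumption chosen for brevity.

```swift
import Foundation

// Hypothetical encoder exposing only `recordArgument`.
struct SketchInvocationEncoder {
    // Each recorded argument is stored as its own serialized blob.
    private(set) var recordedArguments: [Data] = []

    mutating func recordArgument<Argument: Codable>(_ argument: Argument) throws {
        // Wrapping in an array keeps every payload a JSON array, which avoids
        // relying on top-level fragment support in the JSON encoder.
        recordedArguments.append(try JSONEncoder().encode([argument]))
    }
}

var encoder = SketchInvocationEncoder()
try encoder.recordArgument("Caplin")
try encoder.recordArgument(42)

// The receiving side must already know the static type of each argument.
let firstArgument = try JSONDecoder()
    .decode([String].self, from: encoder.recordedArguments[0])[0]
print(encoder.recordedArguments.count, firstArgument)
```

In this shape the decoder learns the argument types from the resolved target rather than from recorded substitutions, which is the trade-off the simplification relies on.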

(4) The reference implementation linked from the proposal is not adequate to evaluate this proposal, because it's simply too complex (many thousands of lines of code). I know it's solving a lot of concerns that have to be solved for a production-quality implementation, but an additional minimal reference implementation would have been nice.

Is the problem being addressed significant enough to warrant a change to Swift?

Yes, it's very cool to have remote procedure calls (if we remove the actor buzzwords, that is essentially what this is </half-joke>) that can make use of the full Swift type system for the function definitions, including generic arguments!
Yes, for communication between actors that are compiled together but executed in, for example, separate local processes.
I am not yet sold on using this for communication between network services. The industry has essentially standardized on a model where the service interface is defined in an implementation-language-agnostic manner (gRPC, JSON Schema, etc.), and code is generated from there, for good reasons. Note that in the pitch thread most of the negative feedback was on this point.

Does this proposal fit well with the feel and direction of Swift?

It feels Swifty.
The compiler synthesis aspects (the synthesis of the invocation serialization and the remoteCall) are a bit too magic for my taste, although I understand why they're there.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

Very familiar with distributed Erlang. This takes a very different design path though, so I have nothing useful to say here.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Read the full proposal and took a look at the current implementation in the main branch. Browsed a bit in the swift-distributed-actors repo.

Hexley · March 9, 2022, 1:39am

Sorry to get back so late. (I’ve become daycare for a 1 year old nephew, he runs me ragged.)

Thanks @xedin, honestly seems the implementations both focus on the same goals:

Statically identify target actors and methods unambiguously
Secure, type-safe reconstruction of function arguments and return types

Regardless of whether this is achieved in C++, IRGen, or Swift, these goals are quintessential and I think done well (waiting for PR-41616 to land).

ktoso · March 9, 2022, 9:41am

I'll reply inline to a few things quickly before the review gets summarized by @Joe_Groff.

I don't think we'll be able to run away from this, but as I mentioned before this is designed to be extensible.

We'll explore alternatives some more though. And give it a more detailed writeup.

It is not in the stdlib, it is in a separate module with the WIP name _Distributed and it is likely we'd rename it to just Distributed as the proposals get accepted. We will consider other names etc, but this will be a separate module like this, and not imported automatically.

No, we're very much focused on Swift-to-Swift but we deeply care about using different serialization mechanisms. Early adopters have voiced this as a specific need and thus we have to support Codable nicely by default, and allow all kinds of different serialization mechanisms -- the Encoder approach allows for this.

To state it more explicitly: Swift-to-nonSwift should be possible but limited in ability e.g. no complex generic and this is fine. It is not a goal of this work to define a vast cross language call paradigm. We are focused on making the Swift user experience fantastic.

That's not true: if a recipient was "not Swift" then they don't need to call executeDistributedTarget; they're not Swift, after all. As I mentioned before, implementations can ship anything they want in envelopes to identify call targets, including just a string like "hello(name:)". This is supported today and would be how you'd integrate a non-Swift other side.

Having said that, we are not focused on a non-Swift "other side", but yeah, it is possible to implement because of what I mentioned.
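
As an illustration of "anything they want in envelopes", a system-defined envelope carrying a plain string identifier could be sketched as follows. `SketchEnvelope`, its fields, and the choice of JSON are hypothetical; the proposal does not prescribe an envelope format.

```swift
import Foundation

// Hypothetical wire envelope defined by an actor system implementation.
struct SketchEnvelope: Codable, Equatable {
    // Any identifier both sides agree on, e.g. a simple method name.
    var targetIdentifier: String
    // Arguments kept as strings here purely to keep the sketch small.
    var arguments: [String]
}

let outgoing = SketchEnvelope(targetIdentifier: "hello(name:)", arguments: ["Caplin"])

// Sender side: serialize the envelope for transport.
let wireData = try JSONEncoder().encode(outgoing)

// Receiver side: any language with a JSON parser could read this payload.
let incoming = try JSONDecoder().decode(SketchEnvelope.self, from: wireData)
print(incoming.targetIdentifier, incoming.arguments)
```

How the sending side obtains such an identifier from the call target is left open here; the sketch only covers the transport format.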

A non-Swift callee would also never be able to implement generics semantics in general terms, so the concerns about these are moot IMHO. Simple generics are implementable by storing the complete type information (or even just a type String or ID), or the parameters. Nothing here is tied to Swift per se. The generics are, of course, but that is specifically a Swift-calling-Swift thing.

I will say though that I'd be interested in improving the mangling used; we could trim the class/struct information from there... We'll see what we can do.

Before we open sourced the cluster, the concerns raised were "is it possible to write anything real with this?", so we open sourced the cluster as the reference implementation.

There is a simple example implementation that matches what you're asking for over here: GitHub - apple/swift-sample-distributed-actors-transport: Distributed actors transport example, for feature review but I've not updated it to the state of this proposal yet. I should be able to do so soon though, and as we're likely to do another review round, that sounds good timing wise.

That is great to hear -- thanks! The way we started out with all this work many years ago was very "like erlang" or "like akka" and we truly went through quite a transformation to make it all feel native and good in Swift -- I'm glad that it shows.

bobergj · March 9, 2022, 1:27pm

In the proposal that envelope would be constructed in the remoteCall method, wouldn't it?

func remoteCall<Actor, Failure, Success>(
      on actor: Actor,
      target: RemoteCallTarget,
      invocation: InvocationEncoder,
      throwing: Failure.Type,
      returning: Success.Type
  )

The only information this function has on the invoked function is RemoteCallTarget which is defined as:

public struct RemoteCallTarget: Hashable {
  /// The mangled name of the invoked distributed method.
  ///
  /// It contains all information necessary to lookup the method using `executeDistributedActorMethod(...)`
  var mangledName: String { ... }

  /// The human-readable "full name" of the invoked method, e.g. 'Greeter.hello(name:)'.
  var fullName: String { ... }
}

Here, mangledName is opaque to the remoteCall implementation - it can't usefully switch on that value and generate a custom identifier. fullName is documented as "human readable", and doesn't have a defined format.
So I don't see how the caller side can do any customization here in practice.

Furthermore, on the callee side, executeDistributedTarget takes the mangledName as parameter.
If we are not sending the mangledName "over the wire", there's no practical way to reconstruct it on the callee side. I am saying practical, because re-implementing the internal mangling scheme in the actor system implementation isn't really practical.

Edit: Regarding the callee side part of this, I am talking about customizing the function identifier in a Swift-to-Swift implementation here, obviously in a theoretical non-Swift implementation, we wouldn't call executeDistributedTarget.