Actor structs?

When the first concurrency proposals came out, I think I was too quick to assume that an actor without reference semantics was a nonsensical idea, but my recent discussions with Sean Parent suggest otherwise. (I realize that actors in the literature have always been reference types, but that was always true of arrays before Swift came along too, so unless someone has a better word, I'm going to continue to use “actor” with a lowercase a to mean something like “type with an attached serial queue”).

In particular, an actor with value semantics would be both deadlock-free and free of the logical race conditions that—even where data races have been ruled out—come with reentrant access to shared mutable state.

To see that the idea is potentially useful, imagine an editor application holding an Array<Document>, where Document has value semantics and exposes its editing operations as mutating methods. Now suppose we need to add an editing operation that can easily take longer than one would want to block the user interface from processing events. We could turn the Document into a (value-semantic) actor and solve the latency problem. Obviously providing a good user experience is more involved than simply slapping @actor on Document, but the point here is that whether a type is a useful actor is not inherently tied to its state being shared.
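As a minimal sketch of the starting point described above, before any actor is involved: the `text` field and the `append(_:)` operation are made-up names for illustration, not part of any proposal.

```swift
// Hypothetical value-semantic document; `text` and `append(_:)` are
// illustrative assumptions.
struct Document {
    var text = ""

    // Editing operations are exposed as mutating methods.
    mutating func append(_ more: String) {
        text += more
    }
}

// The editor holds its open documents by value.
var documents: [Document] = [Document(), Document()]

// Mutating one element in place does not affect any other element.
documents[0].append("Hello")

// A copy taken now is independent of any later edit to the original.
let snapshot = documents[0]
documents[0].append(", world")
```

Nothing here is shared, so the question of whether `Document` would make a useful actor comes down to how long `append(_:)` might take, not to who else can see the value.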

Given all this, it seems like a missed opportunity to have the only route to concurrency in Swift be through actors with reference semantics. I thought I should raise the idea here to see if anyone is interested in discussing the possibilities and implications of “actor structs.”

15 Likes

I don't think the ideas compose very well. If I have the following code:

var a = Document() // a struct
var b = a
async let aResult = a.mutate()
async let bResult = b.mutate()
await (aResult, bResult)

does b = a mean that a and b are two separate actors, or just one? That is, are these mutations serialized or not? If they are one actor, we have reintroduced some form of reference semantics. If they are two, providing serialization would violate the formal notion that mutations on value types are copy-in/copy-out (some day to be move-in/move-out).

EDIT: forgive my semicolons (now removed). I've been writing Rust.

EDIT 2: I blithely wrote "move-in/move-out" because it's a feature I want, but I realized that would resolve the contradiction. I still think the semantics become a little weird, though—what if one of the pending tasks at the time of a copy has side effects? We've seen this a bit already with lazy in structs, and I don't think we want more of that behavior.

3 Likes

The former, of course. It's value semantics: two distinct variables are always independent of one another. a and b are two separate documents.

That is, are these mutations serialized or not?

The mutations are serialized after any previously issued mutations to a, but not serialized with respect to one another.
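One way to picture this ordering in today's Swift, without any actor struct feature, is to hand each copy to its own child task and get the mutated value back. This is only a sketch: `Document`, `edits`, and `mutated(_:with:)` are hypothetical names, and the explicit copy-in/copy-out stands in for whatever a value-semantic actor would do implicitly.

```swift
struct Document: Sendable {
    var edits: [String] = []
    mutating func mutate(_ label: String) { edits.append(label) }
}

// Copy-in/copy-out: the task receives its own copy and returns the result.
func mutated(_ document: Document, with label: String) async -> Document {
    var copy = document
    copy.mutate(label)
    return copy
}

func demo() async -> (Document, Document) {
    var original = Document()
    original.mutate("earlier edit")   // issued before the copy is made

    let a = original
    let b = a                         // b starts out equal to a

    // The two mutations are unordered with respect to each other,
    // but both observe the earlier edit.
    async let aResult = mutated(a, with: "edit to a")
    async let bResult = mutated(b, with: "edit to b")
    return await (aResult, bResult)
}
```

Each result contains the earlier edit followed by its own edit, and neither ever sees the other's.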

If they are one actor, we have reintroduced some form of reference semantics. If they are two, providing serialization would violate the formal notion that mutations on value types are copy-in/copy-out (some day to be move-in/move-out).

I don't see any problem here.

what if one of the pending tasks at the time of a copy has side effects?

The semantics have to be “as-if” the copy is only made after all the pending tasks have executed. That could be implemented at least two ways I can think of: the current task is suspended at the point of the copy until a's queue is empty (crude) or the system makes it so b can't be read until a's queue has caught up to where it was copied.
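The first (crude) option can be approximated with an ordinary reference-type actor in current Swift. The sketch below is not an actor struct, and `DocumentBox`, `apply(_:)`, and `snapshot()` are hypothetical names. It only shows the "as-if" ordering: the copy is requested through the same serialized entry point as the edits, so the caller is suspended until the earlier edits it issued have run.

```swift
struct Document: Sendable {
    var edits: [String] = []
}

actor DocumentBox {
    private var document = Document()

    func apply(_ edit: String) {
        document.edits.append(edit)
    }

    // Copying is itself a call into the actor, so it runs only when the
    // actor is not busy with another call.
    func snapshot() -> Document {
        document
    }
}

func demo() async -> (copy: Document, original: Document) {
    let a = DocumentBox()
    await a.apply("first")
    await a.apply("second")

    let b = await a.snapshot()   // taken after both earlier edits
    await a.apply("third")       // does not affect the copy

    let latest = await a.snapshot()
    return (copy: b, original: latest)
}
```

The second option would presumably let the caller continue immediately and defer the wait until `b` is first read, which this sketch does not attempt.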

I don't see how this brings up any unique problems with side-effects.

7 Likes

…you're correct, I withdraw that part. Even without moves the "copy-out" takes care of any hypothetical formal concerns I'd have about "copy-in".

1 Like

This could be an interesting addition – your point about arrays traditionally being reference types but value types in Swift is right to the point. Furthermore, there is precedent for value type actors: Rust's Actix framework. Could anyone with experience with this framework comment on the pros and cons of having value type actors?

Also, I think it could compose nicely with other "keyword actors", such as the upcoming distributed actor: I would suggest naming it value actor.

Would we allow a distributed actor to also be a value actor? Would that even make sense? Or would we only allow a value actor to be a "pure" actor?

When I saw the title, I had a knee-jerk reaction, thinking, "but actors need identity, because they're mailboxes". Now after reading the post, I think it is reasonable to have actor structs. The "mailbox" metaphor is just one approach to the more important "island of serialization" metaphor. And if explicit identities are needed, there is always Identifiable.

If I understood correctly, it is actually "actor variables", not "actor structs". It makes no sense to be an actor for anonymous values - returned from functions, for example.

actor var a = Document()
actor var b = await a // You cannot simply read a without await

being equivalent to:

actor Box<T> {
    var value: T

    // Actors, like classes, get no memberwise initializer.
    init(value: T) {
        self.value = value
    }
}

let a = Box(value: Document())
let b = Box(value: await a.value)

4 Likes

Actor variables make more sense to me than actor structs. It's a bit like designating a variable to be protected by a mutex, or a queue, or... well, an actor. Pretty close to tagging the variable with @MainActor, except it wouldn't be a global actor but a special actor only for this particular variable.

Could this be implemented as a property wrapper?

1 Like

Extending @jrose's example, how does this interact with the law of exclusivity? When does an async mutating access to a value-typed actor begin and end? How is this enforced at runtime to prevent re-entrancy?

In the hypothetical example above:

I think the question of law of exclusivity matters a great deal here. To take the idea forward, let's presume the following API:

actor struct Document {
    // private fields

    mutating func modify() async throws {
         // Body is irrelevant
    }
}

During the call to modify, I cannot access the value of Document, either as a reader or a writer. That means I cannot redraw my UI without having taken a copy of Document before I began the modification. This seems a little unwieldy to me.

With that said, I don't inherently oppose the idea: it just seems to me like a somewhat awkward fit. I certainly think it's worth exploring the design space, though.

swift? really?
PROGRAM PASCAL;
VAR a: ARRAY[1..10] OF INTEGER;
VAR b: ARRAY[1..10] OF INTEGER;
BEGIN
  a[1] := 123;
  b := a;
  a[1] := 456;
  writeln(a[1]);
  writeln(b[1]);
END.

456
123

5 Likes

Thanks for the reference to Actix; that's very helpful! I don't really know anything about the upcoming distributed actor; is there a place I can read about it?

In fact it's a common misconception that values don't have identity. We give them names all the time. The difference between value types and reference types w.r.t. identity is that if you do not have (exclusive) ownership of a value, you can't express its identity in absolute terms. We can express the identity of a value relative to some other value of which it is a part, that we might not own, by using key paths.
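A small sketch of that last point, using hypothetical `Page` and `Document` types: the key path names a part relative to a whole, and it can be applied to any `Document`, whether or not the code holding the key path owns that value.

```swift
struct Page {
    var title: String
}

struct Document {
    var cover: Page
    var body: Page
}

// Names "the title of the cover" relative to whichever Document it is applied to.
let coverTitle: WritableKeyPath<Document, String> = \Document.cover.title

var a = Document(cover: Page(title: "Draft"), body: Page(title: "Chapter 1"))
let b = a   // an independent copy

// The same relative identity picks out a different part in each value.
a[keyPath: coverTitle] = "Final"
```

After the assignment, `a[keyPath: coverTitle]` is "Final" while `b[keyPath: coverTitle]` is still "Draft".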

No, I meant what I said.

It makes no sense to be an actor for anonymous values - returned from functions, for example.

I disagree that it makes no sense for actors to be returned from functions, and can't understand why you'd make that assertion. They stop being anonymous the moment they are bound to a variable or a parameter, and if they are discarded, well, no harm done.

That really depends on two things:

  1. The implementation of actor copying: it needn't be a suspending operation if implemented in the second way I suggested here
  2. The context: As far as I can tell, await itself has no meaning whatsoever inside an actor with value semantics, since the actor can't be re-entered. So it's not obvious to me that every async call needs an await even though it suspends.

This is simpler than it probably looks. Outside the actor, it is handled just as with any other value.

You're leaving out some context, since I don't see a call to modify here and I don't know which code “I” is…

Because the Document is a value, yes, the only code that can read it during a call to modify is code that is directly or indirectly called by modify, which exclusively owns the value while it is executing. That was always the case even with no asynchrony in the system. In fact, it seems to me that's effectively true even if Document were an actor with reference semantics, unless you don't care about the consistency of the various things you might read from it. Reentrant actors are not like ordinary reference types: they protect you from data races, but are much more prone to logical races, and there's no guarantee that consecutive reads without visibly intervening writes are observing the same state.
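To illustrate the kind of logical race meant here, consider a sketch with a hypothetical `Counter` actor in current Swift. There is no data race, but the method suspends between reading and writing its state, so a second call can interleave and one increment can be lost. The outcome depends on scheduling, so no particular result is claimed.

```swift
actor Counter {
    private(set) var value = 0

    func incrementSlowly() async {
        let seen = value
        // Suspension point: another call on this actor may run here.
        await Task.yield()
        // `value` may have changed since `seen` was read.
        value = seen + 1
    }
}

func demo() async -> Int {
    let counter = Counter()
    async let first: Void = counter.incrementSlowly()
    async let second: Void = counter.incrementSlowly()
    _ = await (first, second)
    // Either 1 or 2, depending on how the two calls interleaved.
    return await counter.value
}
```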

Note that with async, because modify returns Void, there's no need to suspend a caller that issues a call to modify.

Whether this arrangement is unwieldy is certainly a matter of opinion. Making a copy of Document is as simple as initializing a variable or appending it to an Array. All of Photoshop works this way; the undo history is just a series of document snapshots, and because it uses copy-on-write at multiple levels (just like most complex data structures in Swift), these copies are cheap. Any data that needs to be read while an editing operation is underway is read from something that isn't being edited.
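The snapshot approach can be sketched in a few lines. The `Editor` type and its `edit(_:)` and `undo()` methods are hypothetical, and a real document would use copy-on-write storage at several levels rather than a single array of strings.

```swift
struct Document {
    var layers: [String] = []

    mutating func addLayer(_ name: String) {
        layers.append(name)
    }
}

struct Editor {
    private(set) var current = Document()
    private var history: [Document] = []

    // Record a snapshot, then apply the edit to the live document.
    mutating func edit(_ change: (inout Document) -> Void) {
        history.append(current)   // a copy; cheap when storage is copy-on-write
        change(&current)
    }

    // Undo restores the most recent snapshot, if there is one.
    mutating func undo() {
        if let previous = history.popLast() {
            current = previous
        }
    }
}

var editor = Editor()
editor.edit { $0.addLayer("background") }
editor.edit { $0.addLayer("text") }
editor.undo()
```

After the `undo()`, the current document has only the "background" layer. Anything that needs to read the document while an edit is in flight can read the latest snapshot instead.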

With that said, I don't inherently oppose the idea: it just seems to me like a somewhat awkward fit. I certainly think it's worth exploring the design space, though.

Programming with value semantics definitely changes the way you approach problems, but the payoffs in terms of being able to understand what your code means, control its behavior, and even improve performance are huge. Most people never consider pushing value semantics to its limits, but the approach has proven itself in the applications with the strictest performance and user-experience demands.

6 Likes

Wow, @tera, thanks for the reference! It's been so long since I programmed in Pascal that I forgot about that… or maybe I never knew. I wonder why value semantics of Pascal arrays didn't get more traction in the literature… Regardless, this info is going to be super-useful in efforts I am participating in to correct that problem.

/cc @saeta @Alvae @shabalin @dan-zheng

4 Likes

And I wouldn't be surprised if Pascal copied that behaviour from its predecessors Algol and PL/I.

Interesting precedent. I'm curious, do you know whether it's possible to copy an Actix actor?

Reading the documentation, Actors are just a "trait" in Actix (a protocol in Rust), which a struct then implements. I don't have much experience with Rust, but AFAIK structs are value types in Rust much like in Swift, and so have the same copying behaviour.

Now whether an Actix actor should be copied is a different story. Looking at the code, it seems to me that Actix actors are more similar to Swift's upcoming distributed actor that @ktoso is working on. These kinds of actors are closer to the "real" actors created by Carl Hewitt and implemented by the likes of Erlang and Akka, which have unique addresses, and can be destroyed and created remotely. Copying these kinds of actors doesn't make much sense, since you'd end up with two identical remote endpoints with the same address.

You can find some more information about Swift's upcoming distributed actors in @Chris_Lattner3's Concurrency Manifesto, and by looking at the code in the "Distributed" directory of the standard library.

1 Like

structs are value types in Rust much like in Swift, and so have the same copying behaviour.

Structs in Rust have move semantics by default; a struct has copy semantics only if it implements the Copy trait (which in turn requires Clone).
In Swift, structs have copy semantics by default, but Swift currently has no move semantics for value types (structs, etc.).

4 Likes

I see, sorry for the mistake. I don't see anything in Actix's Actor trait implementation which implements the Copy or Clone traits – I guess you can't copy Actix Actors by default after all. It seems like that would be an anti-pattern anyways, since as mentioned previously Actix actors are fully distributed / addressable.

2 Likes

Near as I can tell, “distributed” means an actor's code may execute in a different process or on a remote machine, implying higher latency in messaging and the possibility of recovery from communication failures (thus throwing). I can see no reason why either property would be incompatible with value semantics.

1 Like

I wrote a lengthy response to the law of exclusivity stuff (blah blah re-entrancy blah blah cannot see the duration of the access from the source blah blah) but ultimately I think this is the place I came to on my own as well. I love Swift's emphasis on value semantics. I cannot mentally "see" how well an actor struct would work in practice, so all I have are vague concerns. But these vague concerns shouldn't prevent an exploration of the space: after all, maybe it'll be great.

So my position is "cautious hopefulness". I'd like to see someone with better compiler chops than me explore this space.

5 Likes

What you are describing here (aside from it being useful or not) are fundamentally not actors. One of the key properties of actors is that they can change their behavior, which can be expressed in different ways (e.g. changing the actual handler, mutating state etc.), but always means that the next message that is being processed by said actor will be handled with that new behavior. If you make them value types, an actor will never change its behavior. Instead it will spawn a new actor with different behavior, which is a separate property of actors.

If you look at other actor implementations out there, they typically separate the handle from the state. In Erlang you don't pass around that actual actor, but rather its PID. In Akka you pass around an ActorRef. The messages are forwarded to the actual instance and processed with the isolation guarantees. There is no way to access the actor state without sending a message and therefore it wouldn't be possible to copy it from the outside either.

This is not simply an effect of how they are implemented, but very much intentional. One of the main purposes of actors is to encapsulate mutable state.
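For comparison, here is a sketch of that behavior-changing property with a reference-type actor in current Swift. The `Greeter` actor and its `become(_:)` and `receive(_:)` methods are hypothetical names chosen for illustration. The `greeter` constant plays the role of the handle, and the behavior itself is only reachable by sending messages.

```swift
actor Greeter {
    // The current behavior; replaced wholesale by `become(_:)`.
    private var behavior: @Sendable (String) -> String = { "Hello, \($0)" }

    // Every message handled after this call uses the new behavior.
    func become(_ newBehavior: @escaping @Sendable (String) -> String) {
        behavior = newBehavior
    }

    func receive(_ name: String) -> String {
        behavior(name)
    }
}

func demo() async -> [String] {
    let greeter = Greeter()   // a handle; the state stays inside the actor
    let before = await greeter.receive("Ada")
    await greeter.become { "Goodbye, \($0)" }
    let after = await greeter.receive("Ada")
    return [before, after]
}
```

Every holder of the same handle observes the change, which is exactly the sharing that a value-semantic actor would give up.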

1 Like