Exploration: Type System Considerations for Actor Proposal

Chris_Lattner3 · February 7, 2021, 6:40pm

Hi all,

I'm thrilled to see the progress being made in the design of the actor feature, but I'm concerned about a few of the implications of some of the details -- notably that actors will not be able to benefit from the full power of Swift protocol-oriented programming and other key abstraction features. I think this is fixable with two minor type system extensions though, so I wrote up this document to explore the issues and suggest fixes.

Much of the writing is in the motivation section, which attempts to explain the concerns I have. I'd love thoughts and feedback on this!

-Chris

anandabits · February 7, 2021, 11:41pm

This approach looks really interesting at first glance. Would it be possible under this design for an actor to create an instance of another actor that shares its concurrency context and executor? For example:

actor SomeActor {
    // other actor instance can be interacted with sync because they are statically known
    // to share the same concurrency context
    let otherActor: @sync OtherActor = .init()
}

I can imagine this having some interesting use cases.

Chris_Lattner3 · February 8, 2021, 12:17am

Not without resorting to unsafe extensions, since the result of SomeActor.init() has SomeActor (aka @async SomeActor) type. That said, with the ability to explicitly model @sync actors, you could use an unsafe cast of some sort to do what you're indicating.

anandabits · February 8, 2021, 4:10pm

How would this be safe? They wouldn't be guaranteed to have the same concurrency context and executor would they?

What I'm asking about is whether we could have a way to specify that an actor should share the current context when initializing an instance. If that were possible then the initializer would return @sync SomeActor.

bjhomer · February 9, 2021, 4:50am

Under this approach, if an actor type wanted to conform to CustomStringConvertible in a way that was available to non-actor-isolated contexts, is it possible to do so? For example:

actor BankAccount {
  let accountID: String
  var balance: Int
}

extension BankAccount: CustomStringConvertible {
  var description: String { return "Bank Account \( accountID )" }
}

func printAccounts(_ accounts: [/* @async */ BankAccount]) {
  for acc in accounts {
    // error: printing @async BankAccount requires 'await' 
    print(acc)
  }
}

This implementation does not access any of the actor-isolated state of BankAccount, but is still nevertheless unavailable. Is there any way to synchronously print the accountID of each of these bank accounts under this proposal?

Douglas_Gregor · February 9, 2021, 8:29am

I have a bunch of comments throughout, but not a whole lot of structure to them---so I'll go linearly. This also turned out to be rather long, so I'll summarize here:

I don't consider duplication of protocols for sync/async to be a significant problem, and where it is a problem it's better solved by a reasync counterpart to rethrowing protocols.
Actor isolation is not and cannot be "type-directed", because it is a property of values, but it is reasonable to want to be able to tag a parameter other than self as being "the actor-isolated parameter"
nonisolated is a fundamental part of working with actors and cannot be removed or replaced by "unsafe" mechanisms
nonisolated(unsafe) would be better replaced by nonisolated (on the declaration) and your proposed withUnsafeActorInternals (in the implementation)

On to the details...

Motivation

In broad strokes, I disagree with much of the provided motivation:

"Async requirements require duplication of protocols": I think you're extrapolating from two examples (Iterator and Sequence) to "many" without any reason to do so. Asynchronous code isn't just synchronous code with await sprinkled around; there should be a reason to use asynchrony. If not for ABI constraints, Iterator and Sequence could be better handled by something like a reasync counterpart to rethrowing protocols. The key point is that, with something like "reasync" (as with "rethrows"), you implement your generic operation based on the more complex model (async/throws) and it collapses down to the simpler one (synchronous/non-throwing) when arguments dictate. You can't go the other way.
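The "collapses down to the simpler model" behavior can be seen with today's `rethrows`. This is only a minimal sketch: `applyTwice` and `NegativeInput` are made-up names for illustration, and since `reasync` is just an idea in this discussion, only the existing `rethrows` half is shown.

```swift
// Hypothetical helper, written once against the more complex (throwing) model.
func applyTwice(_ x: Int, _ f: (Int) throws -> Int) rethrows -> Int {
    return try f(try f(x))
}

// Hypothetical error type used only by this example.
struct NegativeInput: Error {}

// Non-throwing argument: the call collapses to a plain call, no `try` needed.
let doubled = applyTwice(3) { $0 * 2 }

// Throwing argument: the same function now requires `try` at the call site.
let checked = try? applyTwice(-1) { (n: Int) throws -> Int in
    if n < 0 { throw NegativeInput() }
    return n + 1
}
```

A `reasync` counterpart would presumably follow the same shape, with `await` in place of `try`.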

"nonisolated members are another "color" for your functions": I end up using nonisolated all the time when porting actor code. When one creates an instance of an actor, it's fairly common to have some bit of identifying and configuration state in the actor---the bank account number, the ID of a player, the address of a database, etc.---and that data is immutable by nature. It gets referenced a lot from places that don't want to be asynchronous: code wants to be able to access the bank account number to record it in a transaction, the ID of a player needs to be communicated with a server, and so on. Protocol conformances like Equatable and Hashable follow from this as well. This motivation section proposes that we don't need nonisolated, so long as (1) actors can fulfill sync protocol requirements and (2) we can unsafely poke into the mutable implementation. The solutions in this proposal don't really work safely for (1). And (2) is not good enough: it would mean throwing away the safety that comes with actors for something as simple as a computed property that derives its value from the aforementioned identity or configuration state.
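A minimal sketch of that pattern, written with the `nonisolated` spelling as it eventually shipped in Swift 5.5, which may differ in detail from the pitch under discussion. The `Player` actor and its `tag` property are hypothetical names chosen for illustration.

```swift
actor Player: Hashable {
    let id: Int        // immutable identity, fixed at creation
    var score = 0      // mutable state, protected by the actor

    init(id: Int) { self.id = id }

    // Derived only from immutable state, so it can safely be nonisolated.
    nonisolated var tag: String { "player-\(id)" }

    // Synchronous protocol requirements, satisfied without touching `score`.
    static func == (lhs: Player, rhs: Player) -> Bool { lhs.id == rhs.id }
    nonisolated func hash(into hasher: inout Hasher) { hasher.combine(id) }
}
```

Synchronous code can then read `tag`, compare players, or put them in a `Set` without `await`, while `score` still requires going through the actor.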

"Actor reference handling is syntax directed, not type directed": You are absolutely right that cases like (self as IntActor).doSyncThing() could be permitted but aren't, and we could consider extending the rules to anything that makes a copy of self without modifying it. However, the result would have to be value-directed, not type-directed. Type-directed implies that all values of the given type have the same behavior, but this cannot be true: a function such as
```
func f(a: @sync IntActor, b: @sync IntActor) { }
```
can only make sense if a === b or a and b share the same serial executor. That's not a type system property, unless you're willing to say that every instance of IntActor shares a single, global serial executor. I can't imagine that's what we want, so we don't have a type-based property, we have a value-based property. It's fine to want to have the ability to tag a non-self parameter as the actor-isolated parameter, but that has to be a property of the parameter itself---it's not part of the type, and there cannot be two of them.
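To make "a property of the parameter itself" concrete, here is a sketch using the `isolated` parameter modifier that later shipped in Swift (SE-0313), not the `@sync` spelling proposed in this thread. The function names `increment` and `demo` are assumptions made for the example.

```swift
actor IntActor {
    var x = 0
}

// Isolation is attached to the parameter `a`, not to the type `IntActor`.
func increment(_ a: isolated IntActor) {
    a.x += 1   // synchronous access: the body runs isolated to `a`
}

func demo() async -> Int {
    let a = IntActor()
    await increment(a)   // the caller is outside `a`, so the call needs `await`
    return await a.x
}
```

Other `IntActor` values reaching `increment` through ordinary parameters or captures would still need `await`, which is the value-based distinction described above.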

`@async` and `@sync` Actor types

@async actor types are just actor references, spelled IntActor or BankAccount or whatever. @sync actor types aren't really types; they need to be values. Ignoring that issue, it is not unreasonable for something like your useSyncActor(a2:) example to work with @sync as an indicator on the parameter that this is the isolated actor. I'll reproduce the example in part here:

func useSyncActor(a2: @sync IntActor) {
  print(a2.x) // okay, no `await` needed
  let a3 = a2
  print(a3.x) // okay, if we note that a3 is an exact copy of a2 by applying some flow-sensitive analysis
}

I don't find the above too motivating, but I think something like the proposed withUnsafeActorInternals would be a reasonable extension. Earlier versions of the actors proposal had a "run a closure on an actor" operation that looked like this:

extension Actor {
  func run<T>(body: () async throws -> T) async rethrows -> T
}

but we removed it, in part, because it would be a lot more useful if we could say statically that we're working on the actor instance, e.g., to use your proposed syntax:

extension Actor {
  func run<T>(body: (self: @sync Self) async throws -> T) async rethrows -> T
}

That's a genuinely useful operation we could add whenever. Right now, you have to extend the actor to write code that's within its isolation domain.

I like withUnsafeActorInternals a lot, and it seems like a good replacement for nonisolated(unsafe), but it does not eliminate the need for nonisolated. Wherever one would write something like:

nonisolated(unsafe) func doSomething() { /* poke at mutable actor state */ }

it would be replaced with

nonisolated func doSomething() {
  withUnsafeActorInternals(self) { self in 
    /* poke at mutable actor state */ 
  }
}

That has the benefit of moving the unsafe code into the body, so we can have fewer concepts at the declaration level.

Revising actor protocol conformance

This whole section rests on the assumption of "type-directed", but it doesn't work. This example shows Equatable:

extension MyIntActor : Equatable {
  // Go and synchronously poke into mutable state, perfectly safe.
  static func ==(lhs: @sync MyIntActor, rhs: @sync MyIntActor) -> Bool {
    (lhs.x, lhs.y) == (rhs.x, rhs.y)
  }
}

As I noted previously, this cannot be safe unless all MyIntActor instances share the same serial executor. You have a later example of @asyncPromoted Equatable being ill-formed; that same reasoning applies even without generics or existentials.

`@asyncPromoted` protocol types

This feature projects a protocol with synchronous requirements into a corresponding protocol with asynchronous requirements. As noted previously, I think you've over-extrapolated from Iterator and Sequence, and I think that something like reasync protocols are a lighter-weight way to address the (IMO relatively rare) set of cases where a single protocol needs no changes whatsoever to become asynchronous.

Doug

Zhu_Shengqi · February 18, 2021, 2:10pm

I have a question about this code snippet in the proposal:

func test(a1: @async MyDataActor, a2: @sync MyDataActor) async {
  // Error: cannot convert @async actor value to sync protocol.
  let existential1: DataProcessible = a1

  // sync -> sync is ok!
  let existential2: DataProcessible = a2

  // Ok, @async actors convert to @asyncPromoted generics.
  await genericExample(a1, a1)

  // Ok, @sync actors are also usable as @asyncPromoted generics,
  // because @sync actor values promote to @async actor type.
  await genericExample(a2, a2)
}

If existential2 is stored elsewhere and its methods are called from another thread, how could the compiler or runtime eliminate or detect potential data races?

Chris_Lattner3 · February 22, 2021, 5:10am

As I said above, this is only accessible with unsafe casts. Such an "unsafe" (in the "language guaranteeing safety" sense) cast can nonetheless be correct in practice when the developer knows something about the larger structure of the system, e.g. that two actors are scheduled on to the same serial queue.

bjhomer:

Under this approach, if an actor type wanted to conform to CustomStringConvertible in a way that was available to non-actor-isolated contexts, is it possible to do so? For example:
actor BankAccount {
  let accountID: String
  var balance: Int
}

extension BankAccount: CustomStringConvertible {
  var description: String { return "Bank Account \( accountID )" }
}

func printAccounts(_ accounts: [/* @async */ BankAccount]) {
  for acc in accounts {
    // error: printing @async BankAccount requires 'await' 
    print(acc)
  }
}
This implementation does not access any of the actor-isolated state of BankAccount, but is still nevertheless unavailable. Is there any way to synchronously print the accountID of each of these bank accounts under this proposal?

Nope, there is no way to do that - precisely because you can't allow code outside of an actor to reach into mutable state of another actor.

Great question: This isn't possible because existential2 isn't a ConcurrentValue, so it can't escape out to another thread. All of the proposals currently have a hole w.r.t. global state, which is unchecked. I have a proposal out for addressing that, and the other actor proposals have other suggestions on how to box these in. Assuming we take one or the other approach, we should be good.

-Chris

Chris_Lattner3 · February 22, 2021, 5:35am

Thanks Doug, no problem. I appreciate your time here. I'll likewise just do a quick point by point recap, but I agree that this is probably not the most conducive way to get to convergence.

That is a helpful step and I'm in favor of it or something like it, but the current approach is a big fork for the universe, and the subsequent AsyncSequence proposals are further unzipping the world, e.g. by adding Zipped, Enumerated, etc. async clones of the sync types. This all follows on from the first async clone of a protocol, and I'm concerned that this will play itself out across the ecosystem. We'll see though, there are many balls in the air so it is difficult to see how they all land.

But values have types. Encoding this behavior into a type (whether or not it is a first class type, or a parameter-only type like "inout") is strictly more powerful than being value driven. Perhaps I don't understand your meaning of "cannot".

I agree that the unsafe things are orthogonal, however, the nonisolated proposal does two things in the proposal: 1) it is the basis for a pretty problematic model for protocol conformance by actors, and 2) it is a sugary feature that allows eliminating some awaits in some cases.

I feel that we really need to address #1 somehow, and once we do that, I personally don't think that #2 is worth the complexity added to the language to support it.

Agreed, that would be a strict improvement.

As I mentioned, I'm generally in favor of something like reasync, but Slava has pointed out on other threads one major concern with this approach: unlike rethrows, there isn't a really good implementation strategy for reasync. Code with the "reasync" ABI has to support suspension, and calling into code that supports suspension from code that doesn't is... tricky. I'm not at all an expert on these codegen details though, so I hope this is solvable.

Yes, I can see your point about this being a helpful modeling tool in some cases where you're dealing with a lot of immutable state in an actor. Here are my major concerns with this:

With the ConcurrentValue proposal, we will have a safe way to share bags of immutable state across actors: you can just have a final class with a bunch of lets in it.
The protocol model proposed on top of nonisolated is very concerning to me, because it doesn't support protocol extensions in a consistent way with the rest of Swift.
There is a lot of conceptual complexity introduced by nonisolated that would be nice to avoid if possible.
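As a rough sketch of the first point: the ConcurrentValue idea later shipped under the name `Sendable`, so the example uses that spelling. The `AccountInfo` class, the `BankAccount` layout, and the `label(for:)` helper are hypothetical and only meant to illustrate the shape.

```swift
// `Sendable` is the name under which the ConcurrentValue idea later shipped.
final class AccountInfo: Sendable {
    let accountID: String
    let owner: String

    init(accountID: String, owner: String) {
        self.accountID = accountID
        self.owner = owner
    }
}

actor BankAccount {
    let info: AccountInfo   // immutable, shareable bag of state
    var balance = 0         // mutable state stays isolated to the actor

    init(info: AccountInfo) { self.info = info }

    func deposit(_ amount: Int) { balance += amount }
}

// Synchronous code can hold and read the shared info without `await`.
func label(for info: AccountInfo) -> String {
    "Bank Account \(info.accountID) (\(info.owner))"
}
```

Code that only needs the identifying state can be handed the `AccountInfo` value directly, while anything touching `balance` still goes through the actor.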

Personally, my priority is to get the protocol conformance situation squared away, that is my #2 goal and priority for the actor design now (#1 was memory safety with ConcurrentValue etc). My belief is that solving that problem has nothing to do with nonisolated. If that is the case, then nonisolated will just be a sugaring that allows avoiding some awaits in some narrow cases -- it seems like something that could be punted to a later proposal in that case.

Regardless, I'm very happy to see nonisolated(unsafe) go away (as you mention above) and for us to focus on the protocol design. If we can get to a reasonable design for protocol based abstraction, then we can see whether nonisolated holds its weight afterward.

While I agree that we need the ability to unsafely poke into mutable actor internals for the model to be fully baked, I don't think it would be acceptable to require its use to conform to general protocols. Said a different way, protocol conformance and abstraction shouldn't require unsafe code, even if we allow it for advanced uses.

I think you and I agree that "we cannot require unsafety for protocol conformance", which is great. I would love to learn more about why you don't think this approach provides safety though - it all composes and stacks up as far as I can see. If you have something specific in mind, I'd love to learn about it.

Douglas_Gregor:

Type-directed implies that all values of the given type have the same behavior, but this cannot be true: a function such as
func f(a: @sync IntActor, b: @sync IntActor) { }
can only make sense if a === b or a and b share the same serial executor.

Perhaps we're talking across each other here, but "type directed" doesn't say anything about the number of values that conform to the type. I agree with you that the above can only make sense if a === b or a and b share the same serial executor.

The reason this is useful is explained in other parts of the paper. The one case where "this example" is conceptually useful is if you (as a Swift programmer) know that you have two actors that share the same serial executor and do an unsafe cast to a sync reference. Such a property is absolutely possible with John's custom executor proposal, and having a way to work with such situations seems useful.

Right.

I don't understand what this means. Can you please explain with an example instead of just stating this?

Right, I agree that this is a very important operation and that it isn't supported without something like @sync actors. I feel like you're agreeing here, but I'm not sure given how you're wording this and arguing against specific examples as "not too motivating" above. For clarity, my rationale for giving many different examples was to illustrate how the type system composes and works out: it wasn't to prove all the small examples are "motivating" or "useful".

Agreed, I think this would be a strict progression.

The white paper also mentions that the Equatable example isn't actually very useful. However, your statement above shows exactly why such a thing "theoretically could" be useful: we allow actor instances to be pinned to the same serial executor, and should support unsafe casts so the programmer can tell the language about that. In such a world, you'd have multiple instances of a @sync actor type floating around. I don't think this would be widely utilized, but for code that does this, we should provide the ability to work with this in a memory safe way.

That said, to reiterate, I included the Equatable example to show how the type system mechanics work; I don't think this is a generally useful thing to do, for reasons stated in the paper.

Coming back to the top of the writeup, I would see it as good progress if nonisolated(unsafe) were to go away. I would really love to know your thoughts about the protocol conformance and abstraction problems discussed in the motivation section. I consider them to be a showstopper for the actors design.

Once that gets addressed, my objection to nonisolated is just that it adds a bunch of complexity for incremental expressive value. It seems like something that could be punted to a future proposal or a future release of Swift when we have more experience with the actor model and agree that this problem is worth addressing by adding more complexity to the language.

-Chris
