[Pitch #6] Actors

Douglas_Gregor · March 10, 2021, 6:43am

Hello again, everyone,

After yet more interesting discussion in pitch #4 (and 5), we've revised the actor proposal for this, pitch #6, addressing more feedback.

Changes in the sixth pitch:

Make the instance requirements of Actor protocols actor-isolated to self, and allow actor types to conform to such protocols using actor-isolated witnesses.
Reflow the "Proposed Solution" section to get the bigger ideas out earlier.
Remove nonisolated(unsafe).
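
To make the first change above concrete, here is a minimal sketch of a protocol that refines Actor and an actor type that satisfies its requirement with an ordinary actor-isolated method. The names Counter, TallyCounter, and bump are hypothetical, and the sketch assumes the behavior of toolchains that ship actors (Swift 5.5 or later), which may differ in detail from this pitch.

```swift
// Hypothetical protocol: because it refines Actor, its instance
// requirements are treated as isolated to `self`.
protocol Counter: Actor {
    func increment() -> Int
}

// The witness is a plain actor-isolated method; no `nonisolated` is needed.
actor TallyCounter: Counter {
    private var value = 0

    func increment() -> Int {
        value += 1
        return value
    }
}

// Generic code that is not isolated to `counter` has to go through `await`.
func bump<C: Counter>(_ counter: C) async -> Int {
    await counter.increment()
}
```

Calling bump twice on the same TallyCounter instance returns 1 and then 2, since both calls are serialized through the actor.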

Changes in the fifth pitch:

Drop the prohibition on having multiple isolated parameters. We don't need to ban it.
Add the Actor protocol back, as an empty protocol whose details will be filled in with a subsequent proposal for custom executors.
Replace ConcurrentValue with Sendable and @concurrent with @Sendable to track the evolution of SE-0302.
Clarify the presentation of actor isolation checking.
Add more examples for non-isolated declarations.
Added a section on isolated or "sync" actor types.
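
As a rough illustration of the Sendable / @Sendable spelling mentioned in the list above, the following sketch marks a value type as Sendable and passes a @Sendable closure to a detached task. Reading, runDetached, and fahrenheit are hypothetical names, and the code assumes the spelling that SE-0302 settled on in shipping toolchains.

```swift
// A value type whose stored properties are all Sendable can declare Sendable.
struct Reading: Sendable {
    let celsius: Double
}

// `@Sendable` on the parameter states that the closure may be run concurrently.
func runDetached(_ work: @escaping @Sendable () -> Double) async -> Double {
    await Task.detached { work() }.value
}

func fahrenheit(of reading: Reading) async -> Double {
    // The closure captures only `reading`, which is Sendable.
    await runDetached { reading.celsius * 9 / 5 + 32 }
}
```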

As always, comments welcome!

Doug

michelf · March 10, 2021, 12:13pm

From the Pitch #4 thread:

I agree isolated is confusing here. While direct isn't bad, I'd suggest sync:

func f(a: sync YourActor)

It's easy to explain: it removes the implied async of isolated members.

BigSur · March 10, 2021, 12:24pm

Ditto

anandabits · March 10, 2021, 2:07pm

Can you explain this part? I thought it had to be banned because a function can’t run in two concurrency contexts at the same time.

BigSur · March 10, 2021, 2:39pm

After re-reading the latest proposal, I think it would be better to merge or simplify the isolated/nonisolated notions. They are basically the same sync/@sync semantics seen from two perspectives on an actor: inside and outside.

From outside the actor, isolated Actor means switching to the actor's executor to access its state synchronously.
From inside the actor, nonisolated Actor.member means conforming to non-isolated protocol requirements in the traditional synchronous way, or allowing synchronous access from anywhere.

So all in all, outside isolated Actor == inside nonisolated Actor.member == sync / @sync semantically; there's no need to use two antonyms to express the same thing.

Both just mean plain sync semantics.

bjhomer · March 10, 2021, 4:02pm

I'm somewhat concerned about the restriction that "a key path cannot involve a reference to an actor-isolated declaration." The rationale is that such a key path would permit accesses to the actor's protected state from outside of the actor isolation domain. However, this restriction also means that actors cannot use key paths at all, even internally.

It seems to me that an actor-isolated key path is only unsafe if used on a non-isolated actor instance. For example, is there anything unsafe about the following code?

actor Account {
  var accountID: String
  var balance: Int

  func access<T>(path: KeyPath<Account, T>) -> T {
    return self[keyPath: path]
  }
}

let balance = await account.access(path: \.balance)

For that matter, is there anything unsafe about this?

let balance = await account[keyPath: \.balance]

It seems to me that using a keypath (with thing[keyPath:]) requires actor isolation, but it's not clear to me from the proposal why simply forming such a key path should require isolation.

Douglas_Gregor · March 10, 2021, 5:50pm

With custom executors, it will be possible to make different actors share the same executor, putting them both in the same concurrency context.

Doug

anandabits · March 10, 2021, 5:58pm

Oh, that's awesome! How will the compiler know whether two isolated parameters share the same context or not?

Chris_Lattner3 · March 10, 2021, 6:16pm

Thanks Doug, I'm glad to see nonisolated(unsafe) get subsetted out of this round of the proposal. Much of my feedback from the 5th draft last night stands, including:

Actor protocol should be named AnyActor for consistency with AnyClass and other type erased things in Swift.
isolated self doesn't really work for the same reason we don't use mutating self in structs.
nonisolated still makes the proposal much larger and more complicated than it should be, and is an additive feature that can compose on top of the basic actor model. nonisolated also undercuts key future directions for actors like Distributed actors.

After sleeping on it, I have a meta concern about your new approach with "Actor protocols". To restate common ground: I think that we generally agree that actors force a dual nature: there is the "outside" and "inside" the actor viewpoint of things, often seen by the "client" and "implementation" logic of the actor. We need to represent this complexity somehow, and your proposal is to model this division in the protocol definitions. There are other proposals, including modeling this as part of the conformance (isolated MyProtocol vs nonisolated MyProtocol), and modeling this as part of the actor type (var x : @sync MyProtocol), which are all different ways to factor the complexity with different tradeoffs.

I have a high-level concern about this new approach, for two reasons:

It is basically putting the complexity into library and API world instead of modeling it in the language/type system. This is a concern to me because the library ecosystem is FAR larger than the language ecosystem, and is the bulk of the complexity that Swift programmers have to learn in practice. Given that this is fundamental to the nature of actors, it seems appropriate to model it in the language, instead of requiring protocols to get duplicated across the library ecosystem.
The existing protocols in Swift serve a dual purpose: both composing in implementation behavior as well as providing public interface logic for types (I understand this may be regretted by a few, but it is indisputably the way Swift works and there are no proposals to change this). Breaking this duality for actors seems like it will require duplicating protocols to model this dual nature in the cases where we're composing in public behavior. This is going to drive a lot of boilerplate and Foo vs FooActor protocol patterns.

More generally, your proposal describes a new approach, but I haven't seen an argument or rationale for why you think it is better than the other two approaches. Can you share your thoughts and elaborate on that?

I haven't had time to do a detailed read through of your new draft, but I do see a potential problem in a quick skim:

protocol Actor : AnyObject, Sendable { }
protocol DataProcessible: Actor { ... }

Doesn't this imply that existentials of DataProcessible will be Sendable and allow clients to poke at sync members? This isn't memory safe.

The solution to this (which seems consistent with your approach) is to untie these protocols from the Actor protocol, and make them be "actor protocol DataProcessible". It is the actor pointers that are sendable, not the existentials of this weird "actor protocol" thing. Your approach is to provide a "different color to protocols".

Tying this into the AnyActor protocol is also weird for other reasons as mentioned last night: actors have a dual nature, so calling either one of them "the actor protocol behavior" is biasing to one at the expense of the other. I'd recommend using a word like "within" or "nonmailbox" if you were to go with this approach.

In any case, I appreciate the ongoing iteration on this proposal. It is a lot of work, but this is an essential part of the entire design for Swift, so I'm happy to see the iteration.

-Chris

Douglas_Gregor · March 10, 2021, 8:50pm

bjhomer:

For that matter, is there anything unsafe about this?
let balance = await account[keyPath: \.balance]
It seems to me that using a keypath (with thing[keyPath:]) requires actor isolation, but it's not clear to me from the proposal why simply forming such a key path should require isolation.

The fundamental problem here is that key paths, once formed, can be used from anywhere. SE-0302 specifies that key paths conform to the Sendable protocol (per https://github.com/apple/swift-evolution/blob/main/proposals/0302-concurrent-value-and-concurrent-closures.md#key-path-literals), so they can be copied into another concurrency domain. We could choose differently by not making key paths Sendable, or trying to conditionalize parts of the hierarchy based on the root.
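
A small sketch of the "can be used from anywhere" point, using a plain class rather than an actor so that it compiles regardless of the key path restriction under discussion. Ledger, readValue, and makeBalancePath are hypothetical names. The thing to notice is that forming the key path and applying it are separate steps, and nothing in the key path value records where it may be applied.

```swift
// Hypothetical non-actor type, used so the example compiles today.
final class Ledger {
    var balance: Int
    init(balance: Int) { self.balance = balance }
}

// Forming the key path happens in one place...
func makeBalancePath() -> KeyPath<Ledger, Int> {
    \Ledger.balance
}

// ...and any code holding both the key path and an instance can apply it
// synchronously, with no `await` at the point of use.
func readValue<Value>(_ path: KeyPath<Ledger, Value>, from ledger: Ledger) -> Value {
    ledger[keyPath: path]
}
```

If Ledger were an actor and such a key path could be formed and sent to another concurrency domain, a synchronous read like the one in readValue is the kind of access the restriction appears intended to rule out.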

Doug

bjhomer · March 10, 2021, 8:54pm

Sorry, I guess I'm still not seeing it. A key path on its own does not access any data; it just specifies which data to access. Why is it problematic if another concurrency domain has access to the key path, if they never access any isolated data with it?

If the subscript(keyPath:) call is actor-isolated, how does passing around a key path cause problems? Could you give me an example of how it would be problematic?

Douglas_Gregor · March 10, 2021, 9:38pm

This proposal does model it in the language, using the same notion of isolated parameters that is used everywhere else.

It is better because it acknowledges the fundamental nature of actors, that they are synchronous internally but asynchronous externally, and naturally abstracts over different actor types. The fact that this approach falls out of the existing actor semantic model with one generalization---that "anything that conforms to Actor is an actor type"---is a strong indicator that this is the natural semantics.

In contrast, every solution proposed to try to re-use synchronous protocols for actor types is trying to paper over this fundamental nature of actors, and they immediately descend into complicated type systems. Both isolated protocol conformances and sync actor types (also see "problems with sync actor types") are massive complications to the type system, and we're not even sure that they're sound ones. Sometimes, the inability to express something cleanly in the type system is an indication that you're working against the nature of the language.

Chris_Lattner3:

I haven't had time to do a detailed read through of your new draft, but I do see a potential problem in a quick skim:
protocol Actor : AnyObject, Sendable { }
protocol DataProcessible: Actor { ... }
Doesn't this imply that existentials of DataProcessible will be Sendable and allow clients to poke at sync members? This isn't memory safe.

This is safe, and you are misunderstanding the model. Let's give DataProcessible a synchronous member:

protocol DataProcessible: Actor {
  func f()    // type of this member is (isolated Self) -> () -> Void
}

func g(a: DataProcessible, b: isolated DataProcessible) async {
  a.f() // error: a is not isolated, you have to use async
  b.f() // okay: b is isolated
  await a.f() // okay
}

The key is that the "self" parameter of any instance member in an Actor-derived protocol is isolated, just like the "self" parameter of an instance member of an actor type is isolated. Same principle, minor generalization.

I think the actor protocol vs. : Actor syntax is a distraction here. My approach is to let the instance members of protocols have isolated self parameters.
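
For readers who want to try the rule described above, here is a compilable variant of the example, assuming the isolated parameter spelling as it works in shipping toolchains. Doubler is a hypothetical conforming actor, f is given a return value so the result can be observed, and generic parameters are used in place of the existential parameters in the original.

```swift
protocol DataProcessible: Actor {
    func f() -> Int
}

// Hypothetical conforming actor for illustration.
actor Doubler: DataProcessible {
    private var calls = 0

    func f() -> Int {
        calls += 1
        return calls * 2
    }
}

func g<A: DataProcessible, B: DataProcessible>(a: A, b: isolated B) async -> Int {
    let fromB = b.f()        // synchronous: this function runs isolated to `b`
    let fromA = await a.f()  // asynchronous: `a` is not the isolated parameter
    return fromA + fromB
}
```

A caller that is not already isolated to the argument passed as b has to write await g(a:b:), which is where the hop onto b's executor happens.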

Doug

Douglas_Gregor · March 10, 2021, 9:42pm

Unclear! We'd need to statically describe the relationship between actors somewhere. The delegateActor notion is one such approach.

Doug

Douglas_Gregor · March 10, 2021, 9:47pm

I made a similar argument in favor of @concurrent over @sendable over in the thread on SE-0302, and Joe Groff made an excellent point:

On the other hand, I think it'd be good to fight back against the urge to wordsmith and keep the names aligned. Other unnecessarily different wordings in the language for the same concept, like mutating / inout and consuming / owned for self vs. other arguments, or class-in-protocols-and-classes-but-static-everywhere-else for type-level methods, create barriers for learning the language.

We're not just talking about isolated here. We're talking about isolated, and nonisolated, and all of the terminology around actors isolating their state. We should either revisit it all or leave this one be; it's better than introducing more inconsistency for the sake of a slightly-better meaning for one (relatively obscure) feature.

Doug

michelf · March 10, 2021, 10:06pm

It's different than mutating vs. inout meaning the same thing though. One "isolated" applies to a type while the other "isolated" applies to a method declaration. An "isolated" method means the method needs to be called with await, whereas an "isolated" actor type means methods on that value do not need to be called with await. It's incoherent.

I guess the intent is that the two "isolated" negate each other, but I'd rather use a different word for that negation.

~~Using nonisolated instead of isolated on the type would make it more coherent with what it does, in my opinion, assuming we absolutely need to use the same wording at both places.~~

Edit: I might be a bit confused with that last part. It wouldn't really work for self inside an isolated method to be of type nonisolated Self. Edit 2: or does it not? Once you're inside the function you've basically peeled the isolation layer, right?

abbottmg · March 10, 2021, 10:11pm

This isn't necessarily true. It means caller-synchronous access for a nonisolated data member, but an actor could have the following two function members:

extension A {
  nonisolated func f(t: T) -> R {}
  nonisolated func g(t: U) async -> R {}
}

Maybe f was written to fulfill a synchronous protocol requirement but g still cannot. Perhaps A has some immutable reference to another actor and is simply forwarding it along. Perhaps it's going to bundle some nonisolated data members and send that over to U, which is another actor, then await self.otherIsolatedMethod(results), minimizing the size of the critical section that will eventually need to run isolated on A.

All we really know from g's signature is that it is okay starting execution with the limited interface available "in-sync from anywhere".
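
A compilable sketch of the two shapes described above: one nonisolated member that is synchronous and witnesses a synchronous protocol requirement, and one nonisolated member that is async and only hops onto the actor for the part that needs it. Inventory and its members are hypothetical names, and the code assumes nonisolated as it behaves in shipping toolchains.

```swift
actor Inventory: CustomStringConvertible {
    let name: String            // immutable, so nonisolated code may read it
    private var count = 0

    init(name: String) { self.name = name }

    // Actor-isolated: mutates protected state.
    func add(_ amount: Int) -> Int {
        count += amount
        return count
    }

    // Synchronous and nonisolated: can witness a synchronous requirement.
    nonisolated var description: String { "Inventory(\(name))" }

    // Asynchronous and nonisolated: starts outside the actor and only
    // hops onto it for the call to `add`.
    nonisolated func addAndDescribe(_ amount: Int) async -> String {
        let total = await add(amount)
        return "\(description): \(total)"
    }
}
```

Both members are nonisolated, yet only one of them can be called without await, which is why sync does not line up cleanly with either isolated or nonisolated.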

Since both isolated and nonisolated functions and methods can also be async, if we try to reduce one of these antonyms down to sync (presumably leaving the other to be implicit), I don't see a way around ending up with one or both of the following:

extension A {
  sync func h() async -> R
}
func m(sync A) async -> R

Which is just another antonym problem, except sync isn't actually an antonym of async, because neither isolated nor nonisolated maps 100% to "will or won't be async".

anandabits · March 10, 2021, 10:39pm

Will this be established in the custom executor proposal when it is updated again? I'd like to see this detail worked out. It isn't clear to me how delegateActor would establish the relationship statically...

Douglas_Gregor · March 10, 2021, 11:22pm

It's the same meaning. "Isolated" means it runs within an actor and has access to that actor's isolated state. An isolated declaration specifies which parameter is the isolated instance---usually self, but a different parameter can be isolated. When you are accessing an isolated declaration, either you need to have an isolated argument to provide to that isolated parameter, or you need to perform asynchronous access.
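
The two ways of reaching an isolated declaration can be sketched as follows, assuming the isolated parameter syntax as it works in shipping toolchains. BankAccount, depositTwice, and depositFromOutside are hypothetical names.

```swift
actor BankAccount {
    var balance = 0

    // Isolated to `self`, the default for actor instance methods.
    func deposit(_ amount: Int) {
        balance += amount
    }
}

// Isolated to `account` rather than to `self`: the body can use the
// actor's isolated members synchronously.
func depositTwice(_ amount: Int, into account: isolated BankAccount) -> Int {
    account.deposit(amount)
    account.deposit(amount)
    return account.balance
}

// Not isolated to anything: every isolated declaration needs `await`.
func depositFromOutside(_ amount: Int, into account: BankAccount) async -> Int {
    await account.deposit(amount)
    return await depositTwice(amount, into: account)
}
```

In depositFromOutside the caller has no isolated argument to supply, so both the method call and the call to depositTwice are asynchronous. Inside depositTwice the isolated parameter is in hand, so the same calls are synchronous.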

Doug

Douglas_Gregor · March 10, 2021, 11:29pm

That proposal is going to need revision, and I'd rather dive into the "how" of establishing relationships among different actors in the context of the custom executors proposal. I think there's a bit of exploration to do here.

Doug

anandabits · March 10, 2021, 11:35pm

Makes sense. Do you expect a new draft of that proposal to be available before this proposal is reviewed so we have more context around this issue?