SE-0420: Inheritance of actor isolation

xwu · January 26, 2024, 2:40am

Hello Swift community,

The review of SE-0420: Inheritance of actor isolation begins now and runs through February 6, 2024.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to me as the review manager via the forum messaging feature. When contacting the review manager directly, please put "SE-0420" in the subject line.

Trying it out

If you'd like to try this proposal out, you can download a toolchain supporting it and enable the experimental feature flag OptionalIsolatedParameters:

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer in your review:

What is your evaluation of the proposal?
Is the problem being addressed significant enough to warrant a change to Swift?
Does this proposal fit well with the feel and direction of Swift?
If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available at

https://github.com/apple/swift-evolution/blob/main/process.md

Thank you,

Xiaodi Wu
Review Manager

tgoyne · January 26, 2024, 5:14am

The proposal sounds like it is exactly the thing we have wanted. I attempted to test it to verify and ran into a compiler crash before even attempting to use the new functionality.

mattie · January 26, 2024, 5:25pm

This proposal directly addresses a problem that I have struggled with frequently across many code bases. I agree that the syntax is a little unwieldy, but I'll take it. I cannot wait for this!

I have no doubt that lots of thought and debate has gone into this. But, I just want to comment about the "Future directions" section. It discusses a possible easy opt-in. I'll take that too! But are you sure that shouldn't be an opt-out instead? I think that would make things a lot more intuitive for many, but I'll admit that I do not fully understand the implications.

tgoyne · January 26, 2024, 7:09pm

func foo(isolation: (any Actor)? = #isolation) async is giving me the error "Non-built-in macro cannot be used as default argument". Is there another feature flag required in addition to OptionalIsolatedParameters or am I just doing something wrong?

kabiroberai · January 26, 2024, 8:51pm

Major +1, always a fan of proposals that enable greater expressiveness at compile-time. I have a request for a tweak in the semantics of this proposal, but first a motivating use case:

Motivation

I've found Concurrency to be massively useful in expressing API contracts for my NodeJS-Swift bridge:

Specifically, the above bridge is built on top of Node-API. Node allows you to have multiple "NodeJS threads" running in a process, each corresponding to a single instance of the Node interpreter. Any Node-API call must occur on the appropriate Node thread. I'm currently expressing this with a @globalActor actor NodeActor which has a custom executor that dispatches to the right Node thread. This allows for APIs along the lines of

final class NodeNumber: Sendable {
  ...
  @NodeActor func doubleValue() -> Double
}

Unfortunately this doesn't have quite the right semantics, since

It's non-trivial to determine which NodeJS thread the executor needs to dispatch a call to. Currently we're using a heuristic for this, namely a task-local that stores the "active" Node thread. But this is unreliable and is a source of Spooky Action at a Distance.
A "global" NodeActor implies that all calls are serialized, even those occurring on different Node interpreter threads.

This proposal would pave the way to alleviate both of these problems. I would love to drop the @globalActor annotation on NodeActor and instead declare APIs like

final class NodeNumber: Sendable {
  ...
  func doubleValue(in context: NodeActor = #isolation)
}

This would solve both of the aforementioned issues, while improving performance at the same time due to 1) being able to run jobs on different NodeActors in parallel and 2) not having to reach into a global/TaskLocal to determine the isolation context.

Suggestion

From what I can tell, the #isolation parameter can only be used with the static type (any Actor)? — even though the macro is able to resolve to a more specific type per this test.

It would be ideal if it were possible to narrow the #isolation parameter to a more specific type, such that if the user weren't in the expected context it would simply lead to a substitution failure. This would be great for the above use case, and IMO really any model that involves a thread pool. That request aside, I'm excited to see this proposal land!

xwu · January 26, 2024, 9:20pm

My understanding is that this part of the proposal (i.e., default arguments) is missing from the toolchain implementation.

hborla · January 26, 2024, 9:20pm

No, you're not doing anything wrong; the implementation doesn't support using #isolation as a default argument right now. This will fallout from the implementation of Expression macro as caller side default argument by ApolloZhu · Pull Request #2279 · apple/swift-evolution · GitHub if accepted so I figured I'd wait, but I can implement the special case for #isolation if it's useful to verify that the proposed behavior satisfies real world use cases.

tgoyne · January 27, 2024, 12:15am

The concrete things I'm hoping this proposal would help with are:

1: Replace _unsafeInheritExecutor to enable copying non-sendable data before leaving the caller's executor:

// before
@_unsafeInheritExecutor
public func foo(value: Nonsendable) async {
    let sendable = value.sendableCopy()
    await foo(sendableValue: sendable)
}

// after
public func foo(value: Nonsendable, _isolated: (any Actor)? = #isolated) async {
    let sendable = value.sendableCopy()
    await foo(sendableValue: sendable)
}

The fact that actor isolation is passed as a parameter is awkward here as no argument other than what #isolated produces can ever be valid. The only idea I have for improving this would be a way to mark a parameter as default-only and not explicitly settable, which could potentially be useful for other expression macros as well. This would be a purely additive thing so it could be done later if people do turn out to be often confused by this.

With the current toolchain this produces warnings when I try to use it:

public class Nonsendable {}
public func foo(value: Nonsendable, _isolated: isolated (any Actor)? /*= #isolated*/) async -> Nonsendable {
    value
}

// Non-sendable type 'Nonsendable' returned by implicitly asynchronous call to actor-isolated function cannot cross actor boundary
// Passing argument of non-sendable type 'Nonsendable' into actor-isolated context may introduce data races
@MainActor func fromMainActor() async -> Nonsendable {
    await foo(value: Nonsendable(), _isolated: MainActor.shared)
}
// no warnings
func fromIsolated(isolated: isolated (any Actor)?) async -> Nonsendable {
    await foo(value: Nonsendable(), _isolated: isolated)
}
// same warnings as fromMainActor
func fromNonisolated() async -> Nonsendable {
    await foo(value: Nonsendable(), _isolated: nil)
}

My understanding is that none of these actually cross an actor boundary, since foo is always inheriting the caller's actor. Is this just a temporary shortcoming of the current implementation, something that requires #isolated, or something that's actually expected not to work?

2: Creating object instances isolated to the current actor:

Realm objects need to be confined to some sort of isolation context which ensures that the objects will only be used from one thread at a time and allows us to schedule work to be performed in the isolation context from outside of it. Actors satisfy both of these, so we have:

struct Realm { // not Sendable
  init<A: Actor>(configuration: Realm.Configuration, actor: A) async throws
}

Since it's non-Sendable, normal sendability checking ensures it doesn't leave the isolation context it's created in (until RBI breaks that?), and the actor instance is stored and used to schedule work inside the isolation context. The problems is this API are that it's a bit clunky (since you have to explicitly pass in the current actor), and more importantly it kinda doesn't work. The actor parameter isn't isolated because isolated parameters for initializers were never implemented, which is something I discovered while first implementing this and then apparently completely forgot about without ever filing a bug report. The actual implementation immediately dispatches to a function which does have an isolated parameter, and this whole thing produces several sendability warnings.

From reading the proposal I think I should be able to do init(configuration: Realm.Configuration, actor: any Actor = #isolated) async throws and everything will work exactly as desired.

3: Passing isolation parameters to obj-c async functions

We have a Swift wrapper around a c++/obj-c library, and so some of the Swift async functions end up making called to bridged obj-c functions. In Swift 5.9 these started producing sendability warnings which we made go away by marking the obj-c functions with __attribute__((swift_attr("@_unsafeInheritExecutor"))) (which I somewhat suspect may not be something that intentionally works).

If it's possible to implement it'd be nice to have something along the lines of __attribute__((swift_isolated(1))) to indicate that the first parameter is an isolated actor parameter.

My other thought on this is that withCheckedContinuation() should use #isolated. We switched from it to using obj-c async function bridging because on the backdeployed runtime it failed to inherit the caller's executor and so didn't call the block synchronously on the caller's thread as the documentation claims it does. Switching from using _unsafeInheritExecutor to an official language feature for it should make it more reliable and less weird?

hborla · January 27, 2024, 1:27am

I think this excerpt from the motivation of SE-0338: Clarify the Execution of Non-Actor-Isolated Async Functions is still relevant justification for making inheriting actor isolation an opt-in behavior rather than opt-out:

It can lead to unexpected "overhang", where an actor's executor continues to be tied up long after it was last truly needed. An actor's executor can be surprisingly inherited by tasks created during this overhang, leading to unnecessary serialization and contention for the actor.

I also think the ramifications to making such a change in an existing ecosystem of async code probably makes inheriting isolation by default a nonstarter. Safely inheriting actor isolation is an ABI change, because the function needs to accept the actor as a parameter.

kabiroberai · January 27, 2024, 1:30am

FYI the fact that #isolation produces an optional Actor, i.e. (any Actor)?, means that this may not work the way you hope. This is another situation where the ability to produce a subtype from #isolation (cf my suggestion) would be really powerful. Add SFINAE-style overloading to that and you could achieve something like

struct Realm {
  // guaranteed to be isolated
  init<A: Actor>(configuration: Realm.Configuration, actor: A = #isolation) async throws

  // overload selected if we can't match the previous overload,
  // i.e. if actor == nil
  @available(*, unavailable, message: "Realm.init must be called from an actor.")
  @_disfavoredOverload
  init(configuration: Realm.Configuration, actor: (any Actor)? = #isolation) async throws
}

tgoyne · January 27, 2024, 1:35am

My reading of the proposal is that #isolation can be used with a non-optional parameter and it simply results in a compilation error if it expands to nil.

kabiroberai · January 27, 2024, 1:39am

Doesn't appear to pass type-checking unfortunately

func doSomething(with act: isolated any Actor) {}

@MainActor func testSomething() {
    doSomething(with: #isolation)
    // ^ error: Value of optional type '(any Actor)?' must be unwrapped to a value of type 'any Actor'
}

hborla · January 27, 2024, 1:42am

The implementation is not perfect! The proposal is correct:

The type of #isolation depends on the type annotation provided in the context of the expression, similar to other builtin macros such as #file and #line, with a default type of (any Actor)? if no contextual type is provided. When type-checking considers a candidate function for a call that would use #isolation as an argument for a parameter, it assumes that the notional argument expression above can be coerced to the parameter type. If the call is actually resolved to use that candidate, the coercion must succeed or the call is ill-formed. This rule is necessary in order to avoid the need to decide the isolation of the calling context before resolving calls from it.

This matches the behavior of other builtin macros, where the type depends on the type annotation you provide.

ktoso · January 27, 2024, 1:44am

We should probably clarify if both those could be supported:

func k(isolatedTo actor: any Actor = #isolated)
func k(isolatedTo actor: any Worker = #isolated)

The latter is quite interesting actually. I wonder if we could put this to use in APIs dedicated to not escaping values out of some actor hm.

kabiroberai · January 27, 2024, 1:45am

aha, I must have missed that. Sounds like exactly what I was hoping for. Thanks for clearing that up!

John_McCall · January 27, 2024, 6:15pm

I’d like to understand your concern about region-based isolation better in case there’s something we need to do. Is there some sort of implicit thread/task-local sharing such that these values really just can never be safely moved out of the concurrent context they’re created in?

If this needs a really elaborate answer, we should probably take it to another thread.

tgoyne · January 27, 2024, 7:09pm

Currently yes, there's a few things that tie them to a specific context.

We cache database handles per context so that all reads within a single context see a consistent view of the database. As long as you don't mix different kinds of confinement (we support thread confinement, GCD queue confinement, and actor confinement), two different database-backed objects from different sources will always share the same read transaction if they are both legal to use. I have mixed feelings on if this was a good decision, but it's very baked into how people use the API at this point. Transferring objects between contexts thus requires some sort of active step that interacts with the handle cache.

Read transactions are automatically refreshed when write transactions are committed on other threads (or processes, etc.). This was originally done by posting messages onto runloops, and with actor-confined Realms is done by invoking a function which is isolated to that actor. It's important that this be done in a push manner rather than something like checking if the transaction is stale and refreshing it if so on each read for two reasons. The first is that it means that implicit refreshes never happen in the middle of a synchronous function. The data you read may be different in different callouts from the runloop, but within a single event handler you have a transactionally isolated view of the database even if writes are happening concurrently. In the case of async functions implicit refreshes can happen at each await which is a bit less likely to coincidentally give correct results for someone who isn't thinking about the problem of consistent reads, but does still give useful guarantees. Secondly, it's important that we refresh read transactions even if the developer is just holding onto a transaction and not using it. Read transactions pin all of the data they require and don't allow us to overwrite it, so holding old read transactions open result in the database file growing over time.

Finally, we have change notifications which are by default non-Sendable callbacks invoked within the context which the Realm instance was constructed in.

Having said all that, it seems like the simple fact that the Realm struct holds a pointer to a non-sendable obj-c object (or in the future possibly a non-sendable c++ object instead) should be reason enough for RBI to be unable to prove anything about the scope of objects and not actually change anything.

John_McCall · January 28, 2024, 12:55am

Yeah, region-based isolation will always enforce (a conservative version of) the non-Sendable data flow, so if you're storing a common non-Sendable value in two different values, there should never be any situation in which region-based isolation analysis can decide that those values are in different regions.

In your example, as I understand it, the user has this non-Sendable database handle, either directly or in some other non-Sendable value, and they pass it to a function that gives them a Realm back.

The default region assumption for functions is that the regions for all the non-Sendable parameters are now interconnected and must be thought of as the same region, and any non-Sendable values in the result are part of that same region. That's what the region-based isolation analysis will enforce on both sides of the call. So in the caller, that conservative assumption is strict: if the callee tries to transfer the Realm value away by itself, that's ill-formed, because that value is still associated with the region that the handle came from. In the callee, of course, the conservative assumption is lax: the region analysis will see that non-Sendable data (the database handle) is indeed flowing from the parameter to the result, and it'll say that's fine because they're in the same region.

Eventually, you'll be able to use annotations to override that default assumption, e.g. to say that the function returns a Realm in a disconnected region. If you do that, things will get more lax in the caller, and users will now be able to transfer the Realm away by itself. In your case, of course, that's bad and not something you'd want to do. But that laxness in the caller turns into strictness for the callee: when you type-check the function itself, the compiler will see the non-Sendable data flow from the parameter into the supposedly-disconnected result, and it'll say that that's a region violation. So everything's still sound overall; messing up an annotation (at least in pure Swift code) doesn't by itself admit a million race conditions.

The only places where region-based isolation can get you into new trouble are (1) annotations that the compiler just has to accept without enforcement, like on ObjC methods, and (2) things that violate the basic data flow assumptions, like if non-Sendable values can get smuggled around in invisible ways. For example, if you have one C function that takes a non-Sendable value and stashes it in something like a thread-local variable, and then another C function that reads that value out and returns it, those functions are going to completely mess up region-based analysis without the right annotations. But anything that just comes down to normal local data flow should never be a soundness hole. And of course C/C++ thread-local variables were already a way to violate sendability rules even without region-based isolation, because many concurrent contexts in Swift can share and/or switch threads.

I hope that helps address your concern. This is probably as much as we should put in this review thread, though, so if you have further questions, we should really fork off a new thread to talk about it.

robert.ryan · February 2, 2024, 10:12pm

I appreciate (and have struggled with) the problem that SE-0420 is trying to solve. This is a problem that deserves our attention. However, I have reservations regarding the proposed solution.

It seems vaguely dissatisfying to change the concurrency context of a function based upon the presence of an isolated parameter (especially one that you never actually use in the called function). I can imagine why it was implemented this way, but it seems like the function, itself, deserves some qualifier to designate that its context will be dictated by some new rule.

Maybe we reconsider one of the “Future Directions” contemplated in SE-0338 (e.g., reasync).

John_McCall · February 2, 2024, 10:23pm

The isolated parameter modifier was added in SE-0313. I agree that it's a little unfortunate that the modifier is applied to a specific parameter rather than to the function as a whole, but I think we can usefully separate these concerns: we can generalize the modifier in a few ways in this proposal, and we can consider later whether we ought to alter the syntax to move it to a leading, function-wide position.