[Pitch] `transferring` isolation regions of parameter and result values

hborla · February 23, 2024, 9:49pm

Hello, Swift evolution!

I wrote up a pitch for extending region isolation to enable explicit transferring annotations on parameter and result types to indicate that argument and result values must be in a disconnected region at the function boundary. This enables controlling which parameter and result values are in a disconnected isolation region, which is applicable to a number of APIs in the Concurrency library and beyond.

You can view the proposal draft on GitHub at swift-evolution/proposals/NNNN-transferring-parameters-and-results.md at transferring-parameters-and-results · hborla/swift-evolution · GitHub

Please leave editorial feedback on the swift-evolution PR at Add draft proposal for transferring parameters and results. by hborla · Pull Request #2339 · apple/swift-evolution · GitHub.

I also plan to include a section on adopting transferring in the Concurrency library, e.g. in the CheckedContinuation APIs mentioned in the motivation section. I'm still working on fleshing out that section.

I welcome your questions, thoughts, and other constructive feedback!

-Holly

Alejandro · February 27, 2024, 1:56am

Awesome! One question I had was how this relates to Mutex. As you know, a mutex's value is in its own isolation domain so transferring makes a lot of sense for the initialization of a mutex to indicate we're moving a value from one domain into the mutex. However, mutex needs to provide exclusive access via an inout to some acquired thread. Does transferring inout Value make sense for the closure based API withLock? The proposal states:

A transferring parameter requires the argument value to be in a disconnected region. At the point of the call, the disconnected region is transferred to the isolation domain of the callee, and cannot be used in the caller's isolation domain after the transfer:

Our value should be in a disconnected region, so that should be fine. Conceptually, we'd like to disallow the following in the withLock closure:

// where mutex = Mutex<NonSendableReference>
mutex.withLock { ref in
  someCapturedClass.property = ref
}

The above could be allowed if the compiler forced the transferring inout reference closure parameter to be reassigned once the callee has essentially copied the reference outside the mutex's disconnected region. This essentially means we've given up the non-Sendable reference value that the mutex was protecting and are replacing it with a new one. It could be the case that transferring as a concept does not make sense for this particular case and we need fully fleshed-out disconnected regions to model this idea for the closure API.
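For readers who want the API shape in front of them, here is a minimal sketch of a closure-based lock in today's Swift. `SimpleMutex` is a hypothetical stand-in rather than the actual Mutex type, it is built on Foundation's `NSLock` purely for illustration, and it carries no `transferring` annotations because that syntax is only pitched.

```swift
import Foundation

// Hypothetical stand-in for the Mutex type under discussion; not the real API.
final class SimpleMutex<Value>: @unchecked Sendable {
    private let lock = NSLock()
    private var value: Value

    // In the pitch, this parameter is where `transferring` would be written.
    init(_ initialValue: Value) {
        self.value = initialValue
    }

    // The closure receives exclusive inout access while the lock is held.
    func withLock<R>(_ body: (inout Value) throws -> R) rethrows -> R {
        lock.lock()
        defer { lock.unlock() }
        return try body(&value)
    }
}

let counter = SimpleMutex(0)
counter.withLock { $0 += 1 }
let current = counter.withLock { $0 }
```

Nothing in this sketch stops the closure from copying a non-Sendable reference out of the protected storage, which is exactly the gap the question is about.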

Curious to hear your thoughts and @Michael_Gottesman's

hborla · March 1, 2024, 8:42pm

I struggled to see what transferring inout would mean because transferring indicates that the callee is free to do whatever it wants with the value, e.g. transfer it away to another isolation domain, but inout indicates that the value is needed again in the caller. Your suggestion here could definitely work:

But I wonder if this rule feels too bespoke and we'd be better off modeling this with disconnected types from the future directions section?

What do others think?

Michael_Gottesman · March 2, 2024, 10:46pm

I thought a little about this in terms of modeling... would love all of your thoughts. I first have a section called the model and then below that I talk about implications in another small section

A Potential Model

Generally the idea with an inout parameter is that it can be viewed as an amalgamation of an @in and an @out parameter... creating our two conditions: the value must be initialized on entrance and on return must be initialized as well... one can reinitialize the inout by moving its contents out, reinitializing, moving out again, etc as long as one has an initialized value of the appropriate type in the inout binding upon function exit.
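This in/out view of a plain inout can already be observed in today's Swift. A minimal sketch, assuming Swift 5.9 or later for the `consume` operator; `appendOne` and `numbers` are illustrative names only:

```swift
// Requires Swift 5.9 or later for the `consume` operator.
func appendOne(_ values: inout [Int]) {
    // Move the current contents out; `values` is uninitialized from here on.
    let old = consume values
    // Reinitialize before returning; leaving this out is a compile-time error.
    values = old + [1]
}

var numbers = [0]
appendOne(&numbers)
```

The compiler only requires that the binding holds an initialized value at function exit; it does not care how many times the value was moved out and replaced along the way.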

If one follows that model for transferring inout, that would say to me that the inout must not only have a value at exit, but that value must also be disconnected, just like a transferring return value must be. Inside the function one would be able to merge the transferring parameter with actor isolated callees, send it off to other functions, reinitialize it with a newly constructed value (reinitializing it to a disconnected state) as long as one has a disconnected initialized value in the inout binding upon function exit. Implementing this would require a small tweak to the checker to check that a transferring inout value is not actor isolated at function exits.

In contrast, a disconnected inout value would be significantly stricter. Assigning to a disconnected inout value in the function body would act as a transfer to ensure that we preserve the disconnected property. To merge it into an actor region would act as a transfer instead of a merge:

actor MyActor {
  var field: NonSendable
  func passDisconnected(_ x: disconnected inout NonSendable) async {
    let combine = (x, field) // This would have to be a transfer
    x = NonSendable() // Reinitialize x since its value was transferred.
  }
}

Of course upon exit, the disconnected inout value would need to have the same property as a transferring inout... namely that the value in the inout must be in a disconnected region. The difference between the two comes down to what restrictions we want to place upon it in the function body.

Interestingly even with transferring inout, using the model I laid out above, one has the property that passing a value as inout transferring ensures that the passed in var is still disconnected in the caller upon function return... even if the inout transferring value is passed into an actor method!:

actor MyActor {
  var field: NonSendable

  func useInOutTransferring(_ x: transferring inout NonSendable) async {
    // Transfer the value into the actor.
    field = x

    // Must reinitialize x with a disconnected value before return.
    x = NonSendable()
  }
}

func performWork() async {
  let a = MyActor()
  let b = OtherActor()
  var x = NonSendable()

  // Without transferring, x would be transferred into actor's region since its value
  // could be placed into the actor storage. One would have to reinitialize it before
  // using it again. But since it is transferring, we can use x later and even transfer
  // x later again.
  await a.useInOutTransferring(&x)
  await b.otherActor(&x)
}

Implications

My main thought here is it depends on what semantics we want to have here. I think both models could work. I haven't thought about which one would be the most ergonomic/etc (just trying to explore the state space) or if the state space is larger than what I put out here.

Nickolas_Pohilets · March 3, 2024, 12:58pm

This paragraph is confusing:

At the point of the call, the disconnected region is transferred to the isolation domain of the callee, and cannot be used in the caller's isolation domain after the transfer:

I think at the point of the call the disconnected region should be transferred to a fresh opaque isolation domain, which is conservatively considered to be distinct from all other domains.

This probably was not intended, just a matter of wording, but current phrasing looks as if the following is possible:

func mergeRegions(_ ns1: NonSendable, _ ns2: NonSendable) {}

@MainActor
func acceptTransfer(_: transferring NonSendable) {}

@MainActor
func keepUsing(_ ns: NonSendable) {}

func transferToMain() async {
  let ns1 = NonSendable()
  let ns2 = NonSendable()
  // [(ns1), (ns2)]
  mergeRegions(ns1, ns2)
  // [(ns1, ns2)]
  await acceptTransfer(ns1)
  // Is it now [(ns1, ns2, @MainActor)] or [(ns1, ns2, α)]?
  await keepUsing(ns2) // ok or error?
}

If a function has several transferring parameters, are they all assumed to be in distinct isolation regions?

Is transferring part of the type? Given

func g<T>(_ x: T, _ f: (T) -> T) -> T { f(x) }
func h(_ x: transferring NonSendable) -> transferring NonSendable { x }
let y = g(NonSendable(), h)

what would be the region of y?

Nickolas_Pohilets · March 3, 2024, 1:08pm

I understand the idea for transferring inout, it makes sense to me, and I see demand for it, but see no difference between transferring inout and disconnected inout.

Both let combine = (x, field) and field = x have the same effect of merging the region of x with the region of field, and force x to be reinitialised before exit. Both examples look like the same thing to me. Am I missing something? Could you please elaborate on the difference between transferring inout and disconnected inout?

Michael_Gottesman · March 4, 2024, 1:36am

The main difference is that transferring inout in the body of the function is allowed to be merged into an actor isolation domain. A disconnected inout is not allowed to do that so we must transfer the value from the inout as part of putting a value into another region.

Consider the following:

actor MyActor {
  var ns = NonSendable()
  func test(_ d: disconnected inout NonSendable) async {
    // Once we assign d into ns, we have transferred d's value into MyActor's isolation
    // domain...
    ns = d

    // So at this point, if we were to try to use d in any way
    // we would get an error since d was transferred into MyActor
    // and the value in d is no longer disconnected (the invariant that
    // we are attempting to enforce).
    use(d) // Error!

    // If we reinitialize d though with a disconnected value, we can use it again.
    d = NonSendable()
    use(d) // Ok!
  }

  func test(_ t: transferring inout NonSendable) async {
    // In contrast, when we assign t into ns...
    ns = t
    // We can use t here safely since it is safe to use values from
    // MyActor's isolation domain here.
    use(t)
  }
}

Nickolas_Pohilets · March 4, 2024, 9:22am

Why is it an error in the first example? Does use() require its argument to be disconnected as well? If it does, why is use(t) ok?

Michael_Gottesman · March 5, 2024, 12:10am

In the first example, we are assigning the disconnected variable into the actor isolated region. This creates a predicament in the language since a disconnected variable is a variable that must always remain in a disconnected region. This means that unlike other normal variables that are in a disconnected region (perhaps due to just being constructed), we cannot just merge such a disconnected variable into an actor isolated region. To solve our predicament, off the top of my head there are only two options:

1. We could emit an error saying that one cannot use a disconnected variable in this way.
2. We could transfer the disconnected value. This makes the disconnected value unavailable in the current task and ensures that one cannot use it again until it is reinitialized with a new disconnected value. The only exception to this is that if a disconnected value is used by an async let, it can become available again (untransferred is what I call it) after the async let value is invoked. This models (to me) nicely that we are moving all values in the disconnected region into the other region.

use is intended to be a nonisolated use. Since the use is nonisolated, there is no way for the disconnected value to be transferred (since function parameters cannot be further transferred since they could have uses in the function caller) which implies that use's parameter will always be disconnected. This implies that our invariant of the value always being disconnected must be preserved by the function.

Nickolas_Pohilets · March 5, 2024, 3:16pm

If the region of the value is not allowed to connect to actors (1), that’s already how non-transferring parameters work. But they do this because the region may be connected in the caller, so the callee cannot know that connecting them to an actor is safe. The disconnected keyword maintains the same restrictions for the caller, but now adds the information that the region is actually not connected. I don’t see how the callee can make any use of this information. If this extra information is not helpful for the callee, then existing non-transferring params are sufficient to model this case.

In (2), if I understood you correctly, you are describing exactly what transferring does. After transferring to the actor region value is not accessible to the task-isolated code, but the test function is actor-isolated, so it may continue using the value. Once variable is reassigned a new value in disconnected region, it becomes accessible to the task-isolated code. And requirement to reassign the value before leaving the function comes from combination of transferring and inout.

actor MyActor {
  var ns = NonSendable()

  nonisolated func use1(_ x: NonSendable) async {}
  nonisolated func use2(_ x: NonSendable) {}
  func use3(_ x: NonSendable) async {}

  func test(_ t: transferring inout NonSendable) async {
    // [(t), {self}]
    ns = t
    // [{t, self}]

    await use1(t) // error
    use2(t) // ok
    await use3(t) // ok

    t = NonSendable()
    // [(t), {self}]
    await use1(t) // ok
    use2(t) // ok
    // [(t), {self}]
    await use3(t) // ok
    // [{t, self}]
    await use1(t) // error

    t = NonSendable()
  }
}

John_McCall · March 8, 2024, 1:18am

I've been thinking about the spelling for this feature, and as part of that, I've had this sense that we're not able to express all the things we need to express. I often find that building a formalism for a feature helps me explore the possibilities better and have confidence that I've done so exhaustively. I talked through this with @hborla, and I think we've come up with a formalization that I feel fairly good about. This isn't a concrete suggestion at this time, but I think it's a good starting point.

Here's the key idea. We know from region isolation that every non-Sendable value is in a specific region. We can make those regions explicit in a type signature. So let's say that 'r is a region qualifier which identifies a notional region r. Saying that a value has type 'r T means that it has type T and is in region r. Every non-Sendable type carries a region qualifier this way.

We can use that to understand the existing region-isolation rules:

Converting 'r1 T to 'r2 T merges region r1 with region r2; this requires at least one of the regions to be mergeable.
A non-isolated function with non-Sendable parameters or results is polymorphic over a single region qualifier, and all the types are qualified with that:
```
// <'r> ('r A, 'r B) -> 'r C
func foo(x: A, y: B) -> C
```
Within the function, r is not a mergeable region. The caller effectively merges all of the regions of the arguments and then expects the result to be in that merged region.
An isolated function with non-Sendable parameters or results is not polymorphic over a region. Instead, the types are all qualified with the region corresponding to the function's isolation:
```
// ('MainActor A, 'MainActor B) -> 'MainActor C
@MainActor func bar(x: A, y: B) -> C
```
These innate actor-isolated regions are never mergeable.
As a special case, a function that returns a non-Sendable value but doesn't have access to any other regions (it's Sendable itself, it's non-isolated, and it has no non-Sendable parameters) is assumed to return a disconnected value:
```
// () -> 'disconnected C
func baz() -> C
```
You can consume a 'disconnected C to "open the box" and get a value of a new mergeable region, as long as you only actually try to merge it once; copying the value or opening it multiple times downgrades the region to an opaque, non-mergeable region.
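The merge rule in the first item can be pictured with a toy model. This is only a sketch of the bookkeeping, not how the compiler represents regions; `Region` and `merge` are hypothetical names invented for this example.

```swift
// Toy model only: `Region` and `merge` are hypothetical names, not compiler API.
enum Region: Equatable {
    // A region that may still be merged, e.g. a freshly opened disconnected value.
    case mergeable(String)
    // A region that can never be merged away, e.g. 'MainActor or an opaque 'r.
    case fixed(String)
}

// Models converting a value from region `a` to region `b`.
// Returns the surviving region, or nil if neither side is mergeable.
func merge(_ a: Region, into b: Region) -> Region? {
    if a == b { return b }
    switch (a, b) {
    case (.mergeable, _): return b   // a is absorbed into b
    case (_, .mergeable): return a   // b is absorbed into a
    default: return nil              // two distinct fixed regions cannot merge
    }
}

let intoActor = merge(.mergeable("fresh"), into: .fixed("MainActor"))
let rejected = merge(.fixed("MainActor"), into: .fixed("r"))
```

Here `intoActor` ends up in the actor's region, while `rejected` is nil because neither an actor region nor the opaque region of a non-isolated function is mergeable.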

Okay, so we've formalized the rules for how we interpret function signatures under region isolation. What could we express in the formalism that that interpretation doesn't cover? I think it's these things:

'disconnected C cannot be a value parameter.
'disconnected C cannot be the type of a stored property.
'disconnected C cannot be an inout parameter.
'disconnected C cannot be the result of function with access to other regions, such as an isolated function or a function with non-Sendable parameters.
'disconnected C cannot appear in other nested positions.
Functions can have at most one region-qualifier parameter.
Other region-qualified types cannot appear in other nested positions.

Many of these seem quite important:

Allowing a disconnected value argument lets you safely and repeatedly transfer non-Sendable values between concurrent contexts, which unlocks a lot of basic expressivity with regions.
Allowing a disconnected stored property lets you safely store non-Sendable values and then transfer them later. Note, however, that the containing type would have to be non-Copyable.
Allowing a disconnected inout argument lets you safely abstract over a disconnected stored property, which is very important for types like mutexes with a need to guard access to the actual storage.
Allowing functions with access to other regions to return a disconnected value lets you safely express APIs that e.g. do a deep copy of a non-Sendable value as long as no non-Sendable data flows into the result.
Allowing functions to have multiple region-qualifier parameters (<'r1, 'r2>) lets you build abstractions that work with values in multiple regions without forcing them to be merged, preserving the ability to transfer those regions independently later.

Allowing 'disconnected T to appear in arbitrary positions (say, as the generic argument) poses some interesting questions about how exactly it works in the language. I've talked about the core operations of boxing and opening a disconnected value, but we really want those to be implicit in source. If disconnected is just a builtin modifier, then it's straightforward to remember it in the conventions around (say) parameter-passing, and then region-based isolation can use that information. If it's actually a type, then those operations are implicit conversions, which creates a number of novel challenges. I think it would be sufficient to have a standard library Disconnected<T> type that could be used in these nested positions when necessary.

Allowing region-qualified types in general to appear in arbitrary positions would let you write things like a function that takes a function that's known to return a value in the same region as another argument. In practice, the most useful examples of this are likely to be expressible in other ways (such as just using the default rule), and it would create a significant syntactic mess.