[Pitch] Region Based Isolation

Hey everyone!

Attached is a pitch for a new language feature called Region Based Isolation. It builds on the earlier SendNonSendable pitch. I would love everyone's thoughts and feedback. Thank you for your time!


Abstract

Swift Concurrency assigns values to isolation domains determined by actor and
task boundaries. Code running in distinct isolation domains can execute
concurrently, and Sendable checking defines away concurrent access to
shared mutable state by preventing non-Sendable values from being passed
across isolation boundaries full stop. In practice, this is a significant
semantic restriction, because it forbids natural programming patterns that are
free of data races.

In this document, we propose loosening these rules by introducing a
new control flow sensitive diagnostic that determines whether a non-Sendable
value can safely be transferred over an isolation boundary. This is done by
introducing the concept of isolation regions, which allows the compiler to
reason conservatively about whether two values can affect each other. Through the usage of
isolation regions, the language can prove that transferring a non-Sendable
value over an isolation boundary cannot result in races because the value (and
any other value that might reference it) is not used in the caller after the
point of transfer.

NOTE: While this proposal contains rigorous details that enable the compiler to
prove the absence of data races, programmers will not have to reason about
regions. The compiler will allow transfers of non-Sendable values between
isolation domains where it can prove they are safe and will emit diagnostics
when it cannot at potential concurrent access points so that programmers don't
have to reason through the data flow themselves.

Throughout the text, the proposed rules are illustrated with non-Sendable reference types. Supposing I have a non-Sendable value type, though, is it required that x and y share an isolation region when they are independent values?

Concretely, given the following—

func assigningIsolationDomainsToIsolationRegions() async {
  // Regions: []
  let x = NonSendable()
  // Regions: [(x)]
  let y = x
  // Regions: [(x, y)]
  await transferToMainActor(x)
  // Regions: [{(x, y), @MainActor}]
  print(y) // Error!
}

...if NonSendable() is a struct, then let y = x would behave differently from let y = NonSendable() even as both statements create (notionally) independent values.

It is required for x and y to be in the same region.

The reason is that the struct must contain at least one non-Sendable field; otherwise the struct itself would be Sendable. Since both copies could, conservatively, reach that same non-Sendable field, the two values have to be in the same region. As an example, consider the following:

class NonSendable {}
struct ContainsNonSendable {
  var field = NonSendable()
}

func example() {
  let x = ContainsNonSendable()
  let y = x
  // y.field aliases x.field
}

If it were not so, I could transfer x and still access the non-Sendable state of x's field via y, which would be a race.
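To make the aliasing concrete, here is a minimal runnable sketch. The `count` property and the `aliasingDemo` function are hypothetical additions for illustration; they are not part of the pitch. It only shows that copying a struct copies the class reference stored inside it, so both copies reach the same object:

```swift
// A class with mutable state, so it is not Sendable.
final class NonSendable {
    var count = 0  // hypothetical state, added for illustration
}

struct ContainsNonSendable {
    var field = NonSendable()
}

// Copies the struct, mutates through the copy, and reports what the
// original observes.
func aliasingDemo() -> (sameObject: Bool, countSeenThroughX: Int) {
    let x = ContainsNonSendable()
    let y = x            // copies the struct, but not the object it references
    y.field.count += 1   // mutation performed through y ...
    return (x.field === y.field, x.field.count)  // ... is visible through x
}
```

Since the mutation made through `y` is visible through `x`, sending `x` to another isolation domain while keeping `y` would leave two domains with access to the same mutable object.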

This looks great! Just a couple comments on the proposal text:

  • When discussing the weak transfer convention, I got confused and stuck for a minute with the nonisolated function example, because I couldn't see how it mattered what the regions were while suspended in an await. I think pedagogically, it would flow better if the async let example was before this one, where it matters that the region is associated with the task because of the gap between the async let and the await. Then the nonisolated function example trivially follows.

  • Under isolated closures: "In the future, we may be able to accept this code in the future"

+1 This seems like magic yet you managed to explain it so clearly that I think I actually understood it all.

This seems like a hugely positive step toward making it easier to use Swift async. That said, I imagine it might sometimes be difficult to reason about why you hit an issue without understanding this document.

Overall, I'm really happy to see this move forward. This will definitely remove a bunch of workarounds where we currently rely on @unchecked Sendable to transfer a value from one isolation domain to another, which the compiler is not able to diagnose correctly right now.

I have one open use case in mind that I think will also work with this proposal, but I just want to make sure it does.

actor Foo {
  func run() async {
    let stream = AsyncStream<Void>.makeStream().stream
    var iterator = stream.makeAsyncIterator()

    await iterator.next() // This currently produces a Sendable warning in strict mode
  }
}

While the iterator here is transferred across isolation domains, due to the law of exclusivity there should not be any concurrent access to the iterator, since the next method is mutating.

Re: disconnected fields and the disconnect operator

I assume the update of the disconnected state (x in the examples) must not only happen along all control flow paths but in order to be safe it must also happen before any potential suspension point (an await in the following code). Otherwise the actor could be re-entered and the old x accessed by some other task. Is that correct? If so, that should probably be explicitly stated.

Your contrived example will technically compile under this proposal, but remember that iterator is in the same region as stream and all of its results that are non-Sendable. If you do anything to merge the stream or the results into the actor's region, such as using a parameter in the expression that initializes stream or storing one of the results into a variable that's already in the actor's region, then the compiler must assume that the actor can reference iterator. In order to fully solve the Sendable issues with AsyncSequence iterators, I believe we need the future direction of this proposal to specify that a result value is disconnected, and then annotate both makeAsyncIterator() and next() with that attribute.

Yes, that's correct!

Overall, the proposal makes sense, but I think too much of the behavior is implicit. I think that by taking advantage of the consume keyword, everything could be made more explicit and potentially easier for the compiler to check and easier for humans to read.

The proposal is already using similar semantics as a “move”. Let me illustrate with your own examples modified:

Here’s the motivating example:

// Not Sendable
class Client {
  init(name: String, initialBalance: Double) { ... }
}

actor ClientStore {
  var clients: [Client] = []

  static let shared = ClientStore()

  func addClient(_ c: Client) {
    clients.append(c)
  }
}

func openNewAccount(name: String, initialBalance: Double) async {
  let client = Client(name: name, initialBalance: initialBalance)
  await ClientStore.shared.addClient(client) // Error! 'Client' is non-`Sendable`!
}

Instead of this pattern working as-is, I propose the following modification be required to make this example work:

func openNewAccount(name: String, initialBalance: Double) async {
  let client = Client(name: name, initialBalance: initialBalance)
  await ClientStore.shared.addClient(consume client)
}

This has the same result as your proposal, but the fact that the reference has been transferred over to ClientStore is explicit with the consume keyword. Further, the compiler will already ensure that the reference cannot be used within this scope any longer. The reference has “moved”.

func openNewAccount(name: String, initialBalance: Double) async {
  let client = Client(name: name, initialBalance: initialBalance)
  await ClientStore.shared.addClient(consume client)
  client.logToAuditStream() // Error! `client` has moved and cannot be used.
}

The consume keyword can keep solving these problems. For example, the rule could be that a reference can be sent across isolation domains only if all of its properties are either Sendable (the current requirement) or consumed.

let john = Client(name: "John", initialBalance: 0)
let joanna = Client(name: "Joanna", initialBalance: 0)

await ClientStore.shared.addClient(consume john) // OK
await ClientStore.shared.addClient(consume joanna) // OK

Here, the two references are independent and can be consumed and sent to the actor independently.

However, for the “friend” case:

let john = Client(name: "John", initialBalance: 0)
let joanna = Client(name: "Joanna", initialBalance: 0)

john.friend = joanna // (1)

await ClientStore.shared.addClient(consume john) // Error: `john.friend` must consume!

This can only work when you do this:

class Client {
  // …
  private var _friend: Client?
  var friend: Client? {
    get { _friend }
    // All properties must consume if we want to consume `Client` itself
    // for sending across domains.
    // Hypothetical syntax: a setter that consumes its argument.
    set { (newFriend: consume Client) in
      self._friend = newFriend
    }
  }
}

let john = Client(name: "John", initialBalance: 0)
let joanna = Client(name: "Joanna", initialBalance: 0)

john.friend = consume joanna // OK!

await ClientStore.shared.addClient(consume john) // OK!
await ClientStore.shared.addClient(consume joanna) // ERROR! `joanna` no longer exists.

I like this modification as it leans on functionality that has already been added to Swift (consume) and doesn’t introduce a brand new concept to understand, but it works pretty much the same as your original proposal.

Finally, let me provide one additional example for when client is not created in the local scope:

func openNewAccount(client: Client) async {
  await ClientStore.shared.addClient(consume client) // Error! 'Client' is not owned
}

can be fixed by doing this:

func openNewAccount(client: consuming Client) async {
  await ClientStore.shared.addClient(consume client) 
  // OK! if Client properties are all sendable or `consume`ing types.
}

Update:

My proposed solution cannot work with function arguments like in the last example. This is because of how consume works for class types.

This is from the consume keyword examples:

func useX(_ x: SomeClassType) -> () {}

func f() {
  let x = ...
  useX(x)
  let other = x
  _ = consume x
  useX(consume other)
  useX(other) // error: 'other' used after being consumed
  useX(x) // error: 'x' used after being consumed
}

Sadly, consume does not ensure you have a unique reference to a class type.
This raises the question of whether any non-Sendable class type can be considered safe to send across isolation domains.
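As a small runnable sketch of that limitation (assuming a Swift 5.9 or later toolchain, where the `consume` operator is available; the `value` property and the function name are hypothetical, added only for illustration): consuming one binding ends that binding, but any other reference to the same object stays usable.

```swift
final class SomeClassType {
    var value = 0  // hypothetical state, added for illustration
}

func consumeDoesNotImplyUniqueness() -> Int {
    let x = SomeClassType()
    let other = x          // a second reference to the same object
    let moved = consume x  // ends the lifetime of the binding `x` only
    moved.value = 42
    // `x` can no longer be named here, but `other` still reaches the object
    // and observes the write made through `moved`.
    return other.value
}
```

So `consume` tracks individual bindings, not the set of references to an object, which is why it cannot by itself prove that a class instance is safe to hand to another isolation domain.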

What if the initializer itself stores a reference every time a new instance is constructed?


// Not Sendable
class Client {
  static var instances: [Client] = []
  init(name: String, initialBalance: Double) { 
    self.name = name;
    self.balance = initialBalance;
    Client.instances.append(self); // Uh Oh! 
    // This makes this class unsafe to ever transfer across isolation domains.
  }
}

There are a number of issues with reusing consume. The largest is that consume is not region aware. Consider the following assuming that we just required consume:

func nonRegionIsolated() async {
  let x = NonSendable()
  let y = x
  await ClientStore.shared.addClient(consume x)
  useValue(y)
}

In this case, when I "consume" x, I am just consuming the binding x. Since y was assigned x's value before addClient was called, as far as consume is concerned it is still valid to use y. In contrast, when one transfers a value, its entire region can no longer be used. So by transferring x, I would also be transferring y, preventing it from being used later in the function.

It is safe to pass any non-Sendable class type across isolation boundaries, as long as the class instance and its entire region are passed together. This means the instance, any bindings that reference it, its fields, and so on. This prevents any part of the region from being accessible in the original isolation domain and thus eliminates the possibility of races.
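Here is a minimal sketch of that rule, reusing the Client and ClientStore shapes from earlier in the thread. The `friend` property, the `count` accessor, and the `openNewAccounts(in:)` function are hypothetical names for illustration, and the comments describe how I would expect the pitched rules to apply; they are not compiler output.

```swift
// Not Sendable: a class with mutable state.
class Client {
    var name: String
    var friend: Client?
    init(name: String) { self.name = name }
}

actor ClientStore {
    private var clients: [Client] = []
    func addClient(_ client: Client) { clients.append(client) }
    var count: Int { clients.count }
}

func openNewAccounts(in store: ClientStore) async {
    let john = Client(name: "John")
    let joanna = Client(name: "Joanna")
    john.friend = joanna         // john and joanna now share one region
    await store.addClient(john)  // the whole region moves to the actor
    // Under the pitched rules, using `john` or `joanna` after this point
    // would be diagnosed, because both belong to the transferred region.
}
```

Nothing in the caller touches either object after the call, so no part of the region remains reachable from the original isolation domain.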

In the example that you posted above, even with region based isolation one would receive the following warning:

example.swift:7:12: warning: reference to static property 'instances' is not concurrency-safe because it involves shared mutable state
    Client.instances.append(self); // Uh Oh! 
           ^
example.swift:3:14: note: static property declared here
  static var instances: [Client] = []
             ^

which will be an error in Swift 6 mode. The issue is that one is escaping the actor instance region into a nonisolated Sendable global.

Yes, I had some misconceptions about how consume works with class types and so a lot of my explanations are flawed.

I think classes are a deficiency in the ownership system and there should be a way to clearly declare when you have an exclusive reference to a class. I think the correct solution is to still have explicit keywords like in my example. We might need different keywords since consume doesn’t work.

My only concern with the original proposal is that it’s too implicit. The semantics work fine.

Even though consume didn’t work, I agree with @nmn that this does feel very magical. I could see it going two ways:

  1. Everything “just works” and we all never think about this again.
  2. Every once in a while, you hit something confusing that the compiler can reason about but we cannot because it’s, say, four levels deep into isolation. The example I think of is tweaking something somewhere, after which a “consume” happens somewhere down the line and you don’t understand why it’s breaking.

My worry about #2 makes me want there to be a keyword like “await” but maybe that’s unfounded? Maybe since the consuming has to happen in the same function the only time it would be confusing in the example above would be if you had a giant function?

Just to follow up.

I disagree that this is implicit, since the APIs in question that can transfer are already explicitly annotated as being a point of concurrency via async, await, or Task. So one already has an explicit marking in the source that says a transfer /could/ occur here.

The only additional information an explicit marker would give the programmer is that a transfer /will/ occur here, without having to read the API surface; that information can also be ascertained by just reading the source. In the big picture, there is no possibility of an implicit surprise (since one already has an explicit marker), and requiring a marker at /all/ of the relevant transferring call sites would be a large annotation burden (and, I think, a source break). So being forced to annotate fails to pass muster, in my opinion.

Something I find myself often wanting to do is provide a guarantee that a non-Sendable type is isolated to some actor, but it doesn't actually matter which. Today, this is very easily and conveniently achieved by just slapping a global actor on the type. Defining a whole new global actor rarely feels right, especially for a library, so I just sigh, use MainActor, and move on.

@MainActor // <- I don't want to do this but all other options feel worse
class MyClass {
	private var internalState = 0

	func doSomeStuff() {
		// must be on creating actor here, because this is a non-Sendable type
		Task {
			let value = await someAsyncFunction()

			// now can capture self safely here
			self.internalState += value
		}
	}
}

I've been reading the proposal for inheriting a caller's actor isolation, and it seems really powerful. It seems similar in spirit, but doesn't cover my issue (I think).

Would this proposal handle this scenario? I'm not sure if I understand it well enough to say, but am very curious.

Okay, so the idea is that the object is temporary within some larger operation, but it involves kicking off some asynchronous work, and that asynchronous work needs to access some state that’s shared with the original context? In some way that’s more than just setting the result of the operation, because otherwise you could just use an async let?