[Pitch] `Disconnected` type for modeling disconnected values

I've seen a couple of posts now arguing against Disconnected as the name. I'd like to argue in favor of it:

Fundamentally, this is about preserving region-based isolation in situations where the language/compiler can't. The region-based isolation proposal refers to this case of "a value eligible for sending" as "being in a disconnected region". The Disconnected type can be viewed as preserving this property of "being in a disconnected region".

So far the most promising alternate name seems to be Sending<...>. But I find this confusing: Sending<CopyableNonsendable> is Sendable (but not Copyable). Yes, you are sending the value in and sending the value back out again, but the point of the type is to trade that sending for a more composable behavior. At rest, as Mutex<UniqueArray<Sending<CopyableNonsendable>>>, the name makes little sense to me. The thing is being stored, not sent.

I've definitely encountered situations where such a type would be useful. Not often, but it has happened. I'm pretty sure I've used nonisolated(unsafe) to side-step the issue. It would be nice to not need to do that.

Up to now, I have thought of region-based isolation as an implementation detail. It is an internal mechanism the compiler uses to reason about the concurrent behavior of values. But it is not (to my knowledge) part of the abstraction that the language presents to the programmer.

"What is a "connected" type?"
"What is it connected to?"
"What is a region?"
"Are isolation and regions different?"

It seems to me like these are some of the questions that would necessarily need answering, at a minimum for the documentation of this type. These feel like entirely new concepts that users would have to internalize. And I think it is tempting to believe that this is already the case. But I don't think that this is actually necessary to reason about the programming model. In fact, I think it is more demanding. Its (admittedly non-trivial) advantage is that it matches the current implementation of the static analysis.

I'm not arguing against the name. I just wanted to articulate some implications that were not obvious to me when I first saw this proposal.

I think Disconnected would make more sense as a name if Swift exposed any notion of connected isolation regions in any of its tooling. If the user could see the connections, the notion of a disconnected region would make a lot more sense. Perhaps @jamieQ's investigations would lead to something to help with this.

I think these are all answered by the RBI proposal, and all of them are required understanding to know when a non-Sendable value can be passed to a sending parameter (or why previously-working code that does so might stop compiling after adding another statement to the end of the block).
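
To make the "adding another statement" case concrete, here is a minimal sketch, assuming Swift 6 language mode. Counter, finish, and handOff are hypothetical names made up for illustration and are not from the proposal.

```swift
// Hypothetical types and functions, for illustration only.
final class Counter {              // a plain class, so it is non-Sendable
    var value = 0
}

// `sending` means the callee takes over the argument's whole region.
func finish(_ counter: sending Counter) -> Int {
    counter.value += 1
    return counter.value
}

func handOff() -> Int {
    let counter = Counter()        // freshly created, so nothing else can reach it
    counter.value = 41
    let result = finish(counter)   // accepted: `counter` is not used after this call
    // Uncommenting the next line is expected to turn the call above into an
    // error, because `counter` would then be used after it was sent:
    // counter.value += 1
    return result
}
```

The call site never changes, yet whether it compiles depends on what comes after it, which is the behavior described above.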

To be clear: I don't think that RBI was a good idea, and I don't think that being able to send disconnected non-Sendable values was a good idea. But that ship has sailed, and these concepts are now fundamental to understanding how Swift concurrency works.

As for the utility of this type: absolutely, it's marginal. It comes up mostly as a building-block for new abstractions, allowing you to create API that accepts sending instead of Sendable in more cases. From my perspective, it'd make sense to put this into the Synchronization module so it's not an automatic import — this isn't a tool most people will ever need to reach for.

I do not agree that they are required to understand what's going on, because I myself do not understand them but I do feel like I can follow along here.

I think all that is missing from the current implementation is sufficient diagnostic information to help trace back to the last point where a transfer is no longer possible. But overall, I think it was a tremendously useful addition.

If the compiler exposes the necessary information here, that would be great! I'm up for any and all expanded feedback mechanisms.

But I'm extremely wary of surfacing RBI-specific concepts in user-facing diagnostics. The conceptual burden is already high, and I think it is possible to communicate these states using existing terminology. At the very least, I think we should make a real attempt first.

Correct me if I’m wrong, but as far as I’m aware, this section of the Swift Migration Guide is the only public-facing documentation we have for RBI. (I don’t consider evolution proposals nor diagnostic groups to be public-facing documentation.) And it doesn’t even mention “Disconnected” at all. Not that the proposed type shouldn’t be named “Disconnected,” since that is the correct term.

P.S. Can we stop hiding the rather excellent Swift Migration Guide and promote it more? Perhaps with more descriptive names than simply “Migrating to Swift 6” and “Data Race Safety,” since it’s our most in-depth documentation on Swift concurrency.

Thanks for the healthy discussion so far. I wanna share my perspective on a few of the raised points.

I personally think that region-based analysis cannot be understood as just an implementation detail of the compiler. The goal is definitely to make writing concurrent code as easy as possible. In particular, when working with non-Sendable types. Region-based analysis eliminates a lot of unnecessary Sendable conformances or usage of synchronization primitives, but ultimately there is code that is not data race safe, and even region-based analysis won't help. Once a developer writes such code, they need to understand what went wrong. Moreover, they need to be able to reason about what the compiler thinks about which value is tied to which region. Over the past couple of years, I have helped numerous people to work through their data race errors, and I found that the only way to reason through these errors often is to think "like the compiler" about the values and their regions. Having said that, I personally fully agree that we should:

  • Improve the documentation around Concurrency and region-based isolation
  • Improve source tooling to surface region information
  • Improve the diagnostics to clearly articulate when a value is being tied to a region and why we need a disconnected value
  • Introduce missing language features such as call-once closures that hinder the effectiveness of RBI

Even with all of the above, the Disconnected type is still a necessary abstraction when working with generic code, especially asynchronous data structures and algorithms. The fact that this same concept has been copied around in multiple packages across the Swift ecosystem, and that multiple people in this thread alone have shared their own version of this type, provides a pretty strong signal that this is a missing piece.

I share the same opinion about the alternative proposed names. Neither Sending, Sent, Isolated, nor IsolatedRegion strike me as particularly good names for this concept. I understand that we haven't used Disconnected much in documentation yet, and it has mostly been used in evolution proposals or forum threads, but in my opinion, this is the term-of-art for this concept. Some alternatives that use the disconnected term but make it more specific could be DisconnectedRegion or DisconnectedIsolationRegion. However, I feel like they become too wordy.

I like the idea of putting this type into the Synchronization module since it is similar to Mutex in the way that Mutex is also modeling its own isolation region with its usage of inout sending in the withLock method.
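
As a rough illustration of that similarity, the sketch below stores a non-Sendable class inside a Mutex from the Synchronization module. Tally and Stats are hypothetical names, and the example assumes a toolchain and deployment target where Mutex is available (Swift 6, and macOS 15 or later on Apple platforms).

```swift
import Synchronization

// Hypothetical non-Sendable payload.
final class Tally {
    var hits = 0
}

// Stats can be Sendable because Mutex is Sendable even when its Value is not.
final class Stats: Sendable {
    private let tally = Mutex(Tally())   // the initializer takes the value as `sending`

    func record() -> Int {
        // The closure receives the protected value as `inout sending`.
        tally.withLock { state in
            state.hits += 1
            return state.hits
        }
    }
}
```

The Mutex owns the region of the value it protects, and the lock is the only way in or out, which is the same shape of guarantee a Disconnected wrapper is meant to provide without the locking.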

Unfortunately, the compiler already implicitly surfaces region information and produces region-based errors, so it's too late. @FranzBusch already covered the general why, but to give a more precise example (from PointFree's recent videos on this very subject):

func regions() {
    let first = Account()
    let second = Account()
    
    takeBoth(first, second)
    
    Task {
        first.balance += 5
    }
    
    second.balance += 10
}

final class Account {
    var balance = 0
}

func takeBoth(_ first: Account, _ second: Account) {
    print(first, second)
}

The diagnostics produced in this code are ultimately accurate, given the current limitations of region analysis, but are actively misleading.

First, it highlights the Task init and produces:

Sending value of non-Sendable type '() async -> ()' risks causing data races

This is rather confusing as it is, since the base error message doesn't include the name of the value it thinks I'm sending. Even assuming it's talking about first, it's easy to confuse with my other, similar usage, where passing a reference type to a Task works just fine.

In Xcode and the command line you can get a bit more info:

Tester.swift:19:5: error: sending value of non-Sendable type '() async -> ()' risks causing data races
Task {
^~~~~~
Tester.swift:19:5: note: Passing value of non-Sendable type '() async -> ()' as a 'sending' argument to initializer 'init(name:priority:operation:)' risks causing races in between local and caller code
Task {
^
Tester.swift:23:5: note: access can happen concurrently
second.balance += 10

But this is even more confusing, as somehow the compiler thinks I'm sending second to the Task?

Xcode's "Show" visualization makes this even worse. (Pardon the screen shot.)


It highlights second, instead of first, which is actually captured by the Task that produces the error. What the heck is going on?

Thankfully, the latest nightly has better diagnostics, though still not quite right. (via Godbolt)

10 |     
11 |     Task {
12 |         first.balance += 5
   |         `- error: closure passed as an argument to a 'sending' parameter captures 'first' which is accessed later by code in the current task [#RegionIsolation::SendingRisksDataRace]
13 |     }
14 |     
15 |     second.balance += 10
   |     `- note: access can happen concurrently
16 | }
17 | 

[#RegionIsolation]: <https://docs.swift.org/compiler/documentation/diagnostics/region-isolation>
[#SendingRisksDataRace]: <https://docs.swift.org/compiler/documentation/diagnostics/sending-risks-data-race>

Even this new diagnostic isn't right. The message on first says it's "accessed later by code in the current task", but that's clearly not true, even in the diagnostic itself. Instead, it later points to the use of second and says "access can happen concurrently", which also isn't true, as second is only accessed outside the Task. But at least there are some links to region isolation that might help point to the real issue.

For those more familiar with region based isolation and its various rules, the actual issue in this code, which is never mentioned in any of the diagnostics, is that takeBoth(first, second) merges the isolation regions of first and second, making subsequent sending captures of one act like both were sent instead. The critical notion of region merging is never mentioned in the diagnostics! (Unfortunately there doesn't seem to be a great way, even in the nightlies, to tell the compiler this is safe.)
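
For contrast, here is a sketch of the same shape without the merging call. Under my understanding of the RBI rules, and assuming Swift 6 language mode, this variant is expected to compile, because first and second never end up in the same region. separateRegions is a hypothetical name, and Account is repeated only to keep the block self-contained.

```swift
final class Account {
    var balance = 0
}

func separateRegions() -> Int {
    let first = Account()
    let second = Account()
    // No call takes both values, so each stays in its own region.

    Task {
        first.balance += 5     // only the region containing `first` is sent
    }

    second.balance += 10       // `second` was never sent, so this use is fine
    return second.balance
}
```

The only difference from the failing version is the missing takeBoth(first, second) call, which is why a diagnostic that never mentions merging is so hard to act on.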

So not only is region isolation already exposed by the compiler, we need to expose it more to make errors clear and help the user come up with different solutions.

I feel like DisconnectedRegion is a good compromise. It immediately gives less experienced users a hook to search for and prevents even more experienced users from having to remind themselves exactly what's being disconnected. And if this is going to be a less used API that lives in Synchronization, a more verbose name seems fine.

I have to apologize here. I think I have not been clear and have managed to confuse the issue considerably.

What I intended on saying was that the terminology currently in use by both documentation and diagnostics is "sending", "send", "Sendable", "non-Sendable", and "isolation". These are the terms I think in.

What I was trying to express concern about was the terms "connected", "disconnected", "merge", and "region". I didn't realize that #RegionIsolation appears in the diagnostic note there, so there does at least appear to be some precedent.

Edit: I forgot! I think it is particularly notable that the term "domain" is also used in documentation. I do not look forward to explaining the differences between a "region" and a "domain".

But I really do want to be clear that I am not objecting in the slightest to surfacing these problems more clearly. That would be extremely welcome.

I have been using a Disconnected type based on @KeithBauerANZ's type for some time. I have found it very useful, and I think that its implementation is non-obvious enough that it should be included in Swift. It appears to be very similar to the UniqueBox proposal (SE-0517: UniqueBox). But it has a consuming sending initializer and returns sending values. These are different axes in the Swift ownership model, so I think that they both deserve to be included in Swift.

I think that the type should not use @unchecked Sendable or nonisolated(unsafe). I have tested it and found that withMutableValue needs to reinitialize self if the body throws.

// It should *not* be `@unchecked Sendable`, because then it could be stored in a Sendable reference type and accessed concurrently.
public nonisolated
struct Disconnected<Value: ~Copyable>: ~Copyable //, @unchecked Sendable
{
	//nonisolated(unsafe)
	private var value: Value
	
	public init(_ value: consuming sending Value) {
		self.value = value // `consume` a disconnected value. It is proven to dominate its subgraph at this point
	}
	
	public mutating func exchange(_ v: consuming sending Value) -> sending Value {
		let old: Value = self.value // partial consumption
		self = .init(v) // full reinit
		return old // return disconnected
	}
	
	public consuming func consume() -> sending Value {
		nonisolated(unsafe) let value = self.value
		return value
	}
	/// I think this is safe because Disconnected is non-Copyable and withMutableValue is `mutating`. `mutating` is an exclusive mutable borrow of self, granting exclusive access to the memory for the duration of the call. withMutableValue is synchronous, and so is body. Disconnected cannot be accessed concurrently and the access is exclusive, so it should be data-race safe and memory safe.
	/// the parameter to body is `sending` so it is assumed to dominate its subgraph at the time it is passed in and it must be proven to dominate its subgraph inside of the body call for body to be valid.
	 public mutating func withMutableValue<R: ~Copyable, E: Error>(_ body: (inout sending Value) throws(E) -> sending R) throws(E) -> sending R {
		nonisolated(unsafe) var v: Value = consume self.value // partial consume
		do {
			let r: R = try body(&v) // inout sending
			self = .init(v) // full reinit; consume v
			return r
		} catch {
			self = .init(v) // must reinit self
			throw error
		}
	}
	
	// Edit: I would include this too.
	public borrowing func withValue<R: ~Copyable, E: Error>(_ body: (borrowing Value) throws(E) -> sending R) throws(E) -> sending R {
		try body(value)
	}
	
	/*
	/// DEBUGGING CONCLUSION: It must reinitialize self if body throws an error.
	/// nonisolated(unsafe) overrides data race safety checks for `value`.
	 
	public mutating func withMutableValue<R: ~Copyable, E: Error>(_ body: (inout sending Value) throws(E) -> sending R) throws(E) -> sending R {
			let r: R = try body(&self.value)
			return r
	}
	*/
	public mutating func take<U: ~Copyable>() -> sending Value where Value == U? {
		let old: Value = self.value
		self = .init(nil)
		return old
	}
	public mutating func tryTake<U: ~Copyable>() throws(DisconnectedError) -> sending U where Value == U? {
		switch consume value {
		case .none: self = .init(nil); throw DisconnectedError.optionalIsNil
		case .some(let unwrapped): self = .init(nil); return unwrapped
		}
	}
	func isEmpty<U: ~Copyable>() -> Bool where Value == U? {
		value == nil
	}
}
public enum DisconnectedError: Error {
	case optionalIsNil
}


All the methods that access the value are mutating or consuming, so although it can be stored in a Sendable reference type, nothing useful can be done with it whilst it's there. I think it's still safe to be Sendable.

withValue needing an explicit reinitialization I'm less sure about. Since withValue passes the value inout sending, I think that that closure should be forced to reinitialize the value even on the throws path, or inout sending + throws would be generally unsafe. But maybe that is a language bug/RBI oversight? Regardless, it seems like a language-level fix? If withValue compiles when the property isn't marked nonisolated(unsafe), it should be safe when it is.

You might be right that it could be safe with @unchecked Sendable. I looked at my old tests and now they have errors in the 6.3.2 toolchain.

var dis2 = Disconnected(NonsendableCopyableStruct())
let c2: (inout sending NonsendableCopyableStruct) -> sending NonsendableCopyableStruct = { v in
	let r = v // it copies `let r = copy v`
	//v = .init() // it should make me reinitialize because 'v' is `sending`; but it doesn't
	return r // swift 6.3 error: Returning 'r' risks concurrent access to 'inout sending' parameter 'v' as caller assumes 'v' and result can be sent to different isolation domains
}
let escaped2 = dis2.withMutableValue(c2)

var dis3 = Disconnected(NonsendableClass())
let c3: (inout sending NonsendableClass) -> sending NonsendableClass = { v in
	let r = v // it copies the reference
	//v = .init() // it should make me reinitialize because 'v' is `sending`; but it doesn't
	return r // swift 6.3 error: Returning 'r' risks concurrent access to 'inout sending' parameter 'v' as caller assumes 'v' and result can be sent to different isolation domains
}
let escaped3 = dis3.withMutableValue(c3)

But I am still skeptical. To me, the question is "why would I want it to be Sendable if it doesn't need to be?"

If you were to store a Disconnected in a Sendable type, the Disconnected would have to be immutable. So for most kinds of objects, there wouldn't be all that much that you could do with it that you couldn't do with a snapshot of the object. You are making it non-Copyable by wrapping it in Disconnected, and you are making it immutable by making it a let property (required by Sendable). So the object is only accessible through a borrowing withValue method. It is a frozen object.

But consider if I had a resource which is non-Sendable and doesn't outwardly expose anything mutable. It could expose the resource. Without @unchecked Sendable it won't let me misuse the resource in this way. Here is why I think it would be better to not make it Sendable:

// if it is Sendable
public nonisolated
struct Disconnected<Value: ~Copyable>: ~Copyable, @unchecked Sendable
{
...
	public borrowing func withValue<R: ~Copyable, E: Error>(_ body: (borrowing Value) throws(E) -> sending R) throws(E) -> sending R {
		try body(value)
	}
}

final class SendableClass: Sendable {
	let sendableNonsendable: Disconnected<OpaquePointer?>

	init(nonsendable: consuming sending Disconnected<OpaquePointer?>) {
		self.sendableNonsendable = nonsendable
	}

	func doSomething() {
		let result_code = sendableNonsendable.withValue{ sqlite3_close_v2($0) }
	}
}

actor A {
	func useNS(_ obj: SendableClass) {
		Task {
			var stmt: OpaquePointer?
			var resultCode = obj.sendableNonsendable.withValue{sqlite3_prepare($0, "SELECT 1;", -1, &stmt, nil)}
			resultCode = sqlite3_step(stmt)
		}
	}
}

func test() {
	var ptr: OpaquePointer?
	let res = sqlite3_open_v2("", &ptr, -1, nil)
	let ref = SendableClass(nonsendable: .init(ptr))
	let a = A()
	Task {
		var stmt2: OpaquePointer?
		let resultCode = ref.sendableNonsendable.withValue{sqlite3_prepare($0, "SELECT 2;", -1, &stmt2, nil)}
		await a.useNS(ref) // prepares and executes a statement on a Task spawned by the actor
		ref.doSomething() // closes the connection. data race!
	}
}

To me, the Sendability is the entire purpose. It's a vehicle to smuggle something non-Sendable (but provably sending) through generic code, potentially through some storage or a buffer, and retrieve it (still provably sending) out the other side.

Earlier in the thread I used Mutex<UniqueArray<Disconnected<Nonsendable>>> as an example of what I want to do — I can use this to create a concurrent async stream of sending values, not just Sendable values.

That's a fair point. I think we want different things though. If you want it for smuggling objects through legacy APIs, then I would think that is probably not something that should be part of Swift. You should be free to make that, but I wouldn't argue that Swift should provide that for you. On the other hand I would argue that the non-Sendable version should be provided by Swift. I think it is safe, and it fills a necessary role, one which was actually specified in the original proposal for Region Based Isolation (although the document said it should be Sendable).

I haven't tried using it for making a custom async stream, so I don't really know how you are using it. It sounds to me like you want something that is explicitly unsafe and is used to bypass a Sendability check.

Whereas I want something that enforces safety while still letting me store non-Sendable values and move them around afterward. I have used it to create unique capabilities which must be leased.

"Regardless, it seems like a language-level fix?"

I think that the specific problem of needing to reinitialize self after throw could be a language level bug. There certainly are some bugs in this area still. But I think that the compiler may be enforcing something when you don't make it nonisolated(unsafe). I'm particularly wary of how it handles closures.

"If withValue compiles when the property isn't marked nonisolated(unsafe), it should be safe when it is."

This is probably true. But I think it is hard to prove. I still used nonisolated(unsafe) var v = consume self.value. Ultimately Disconnected is a proof object, it proves that its value is disconnected. I feel like limiting the scope of the unsafeness to a local variable makes it a stronger proof, since I can't rigorously prove every edge case. I want it to be difficult to misuse since I can't prove that it can't be misused.

I think that your difference of opinion about this is very revealing about the current state of Swift. It's very much in an intermediate stage where things are starting to come together in a way that makes cool new functionality possible, but it is still rough around the edges and there are many gaps. I often find myself wanting something other than AsyncStream, to avoid its locks. It is a multi-consumer multi-producer stream, so it has locks. But I haven't been very successful at making a single-consumer multi-producer stream which doesn't use an underlying async stream or locks. I think it should be possible, and it would use Disconnected, but I haven't tried again recently.

I want to say to you again, I am very grateful to you for sharing your code with me. Thanks!

I changed my mind, I would actually argue for both Sendable and non-Sendable implementations. They are both useful for different purposes. The Sendable one should explicitly be called 'Unsafe'.

I was thinking about the name, and I thought that I had to go digging very deep through the Swift Evolution documents to understand the concept of “disconnectedness”. The compiler errors usually refer to isolation domains, and sending.

I tried to think of some alternative names that might make it easier to find:

  • SendingProof and UnsafeSendingProof
  • Iso and UnsafeIso
  • or Isolated, but I think that might be confusing to read
  • RegionProof
  • Nonisolated

I don't get this; the Sendable version can be used anywhere the non-Sendable one can.

But it's not unsafe?

You're right it's not unsafe. If the Sendable version doesn't have a borrowing withValue then it should be safe. I introduced the hole by borrowing. I didn't quite realize the importance of that at first.

You can either have Sendable or borrowing, but not both at once; or else it becomes unsafe in the sense that it's up to you to not store it in a Sendable reference type which is accessed concurrently. mutating and consuming should be safe so for most uses both Sendable and non-Sendable should be equivalent. If you need borrowing, you need non-Sendable.

I still don't understand why you need Sendable for your Mutex<UniqueArray<Disconnected<Nonsendable>>>. Could you please explain?

I tried making something a bit like what you are talking about. I think it should be an optional Nonsendable? and use take(). I used another Disconnected in the push method to move the item into the mutex closure, then unwrapped the item to prove it is disconnected from the closure, and wrapped it in another Disconnected again to store it. It doesn't need a Sendable Disconnected.
I'm not quite sure what you are doing with an array, it's probably more complex, but I think the same principle should work even if you want to access slices.

final class Channel: Sendable {
	private let mutex: Mutex<MiniDeque<Disconnected<NonSendable?>>>
	
	init(capacity: Int) {
		self.mutex = Mutex(MiniDeque(capacity: capacity))
	}

	func push(_ value: consuming sending NonSendable) -> Bool {
		var staged = Disconnected(Optional(value)) // wrap it to move it into the mutex disconnected

		return mutex.withLock { buffer in // staged is captured by the withLock closure,
			// so it is isolated (connected) to the closure.
			let fresh: NonSendable? = staged.take() // unwrap to disconnect the value from the closure,
			let wrapped = Disconnected(fresh) // then wrap it again for the buffer
			
			return buffer.append(wrapped)
		}
	}

	func pop() -> sending NonSendable? {
		mutex.withLock { buffer in
			var dis = buffer.popFirst()
			return dis?.take()
		}
	}
}

struct MiniDeque<Element: ~Copyable>: ~Copyable {
	private let storage: UnsafeMutableBufferPointer<Element>
	private var head: Int = 0
	private var count: Int = 0

	init(capacity: Int) {
		precondition(capacity > 0)
		self.storage = UnsafeMutableBufferPointer<Element>.allocate(
			capacity: capacity
		)
	}

	deinit {
		
		for i in 0..<count {
			let j = (head + i) % storage.count
			storage.deinitializeElement(at: j)
		}
		storage.deallocate()
	}

	var isEmpty: Bool { count == 0 }
	var isFull: Bool { count == storage.count }

	mutating func append(_ value: consuming Element) -> Bool {
		guard !isFull else { return false }

		let tail = (head + count) % storage.count
		storage.initializeElement(at: tail, to: value)

		count += 1
		
		print(status())
		return true
	}

	mutating func popFirst() -> Element? {
		guard !isEmpty else { return nil }

		let result = storage.moveElement(from: head)

		head = (head + 1) % storage.count
		count -= 1

		print(status())
		return result
	}
	
	func status() -> (head: Int, tail: Int, count: Int, capacity: Int) {
		(head: head, tail: (head + count) % storage.count, count: count, capacity: storage.count)
	}
}

final class NonSendable {
  var x = 0
}

Unless I’m misunderstanding, a borrowing withValue is still unsafe, even if Disconnected is non-sendable, because you could attach isolated values to the value inside the closure, which only binds the Disconnected to the region, but since consume returns sending it doesn't matter which region the Disconnected is bound to.

Here is an example that compiles:

struct Disconnected<Value>: ~Copyable {
  
  init(value: Value) {
    self.value = value
  }
  
  borrowing func withValue(
    _ body: (borrowing Value) -> Void
  ) -> Void {
    body(value)
  }
  
  func consume() -> sending Value {
     /// not sure why, but the compiler gives an error here even though value is nonisolated
    fatalError()
  }
  
  private nonisolated(unsafe) let value: Value
  
}

final class NonSendable {
  var x = 0
}

final class Container {
  var nonSendable: NonSendable?
}

let isolatedToThisRegion = NonSendable()
isolatedToThisRegion.x += 1

let disconnected = Disconnected(value: Container())
disconnected.withValue { container in
  container.nonSendable = isolatedToThisRegion
}

// `sent` is `sending` regardless of `disconnected`'s isolation
let sent = disconnected.consume()
Task {
  sent.nonSendable?.x += 1
}

You can plug this hole by making the withValue closure @Sendable, which makes it so it can't capture isolated values and thus cannot have one assigned to a property.
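
A minimal sketch of that fix, assuming Swift 6 language mode. BorrowBox and Bag are hypothetical names, and the wrapper is cut down to just the borrowing accessor, so it is not the proposed type.

```swift
// Hypothetical non-Sendable payload.
final class Bag {
    var items: [Int] = []
}

struct BorrowBox<Value>: ~Copyable {
    private let value: Value

    init(_ value: sending Value) {
        self.value = value
    }

    // `@Sendable` prevents the closure from capturing non-Sendable state,
    // so nothing from the caller's region can be attached to `value`.
    borrowing func withValue<R: Sendable>(
        _ body: @Sendable (borrowing Value) -> R
    ) -> R {
        body(value)
    }
}

func demo() -> Int {
    let box = BorrowBox(Bag())
    // A closure that captured a local non-Sendable object here would be
    // rejected, because captures of a `@Sendable` closure must be Sendable.
    return box.withValue { bag in
        bag.items.append(7)        // touching the boxed value itself is still allowed
        return bag.items.count
    }
}
```

Constraining the result to Sendable is an extra assumption of this sketch. It keeps the closure from handing out a reference into the box, which would reopen the same hole from the other direction.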
