How to make a copy-on-write struct Sendable and thread-safe?

cal · July 30, 2024, 5:58pm

What's the best way to make a copy-on-write struct Sendable and thread-safe?

I know copy-on-write standard library types like Array are Sendable. This is marked with an @unchecked Sendable conformance in the standard library, which makes sense.

Take a copy-on-write struct like this implemented using the common isKnownUniquelyReferenced pattern:

public struct MyCopyOnWriteStruct {
  public init() {}

  public var value: Int {
    get {
      storage.value
    }
    set {
      copyStorageIfNecessary()
      storage.value = newValue
    }
  }

  private var storage = Storage()

  private mutating func copyStorageIfNecessary() {
    if !isKnownUniquelyReferenced(&storage) {
      storage = storage.copy()
    }
  }
}

fileprivate final class Storage {
  init() {}
  var value = 0

  func copy() -> Storage {
    let copy = Storage()
    copy.value = value
    return copy
  }
}

Is MyCopyOnWriteStruct thread-safe? If not, is there an alternative approach for implementing a copy-on-write type that is thread-safe?

If the copy-on-write implementation is thread-safe, then it should be reasonable to mark the copy-on-write type as @unchecked Sendable. (Assuming the values stored within the type are also Sendable).
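
A minimal sketch of what that conformance could look like, assuming every mutation goes through the uniqueness check. The names `SendableCounter` and `CounterStorage` are hypothetical stand-ins for the types above, and `@unchecked` means the compiler takes the claim on trust rather than verifying it.

```swift
// Hypothetical variant of the struct above with the conformance added.
// `@unchecked` tells the compiler to trust the Sendable claim.
public struct SendableCounter: @unchecked Sendable {
  public init() {}

  public var value: Int {
    get { storage.value }
    set {
      // Copy first if another struct value shares this storage object.
      if !isKnownUniquelyReferenced(&storage) {
        storage = storage.copy()
      }
      storage.value = newValue
    }
  }

  private var storage = CounterStorage()
}

fileprivate final class CounterStorage {
  var value = 0

  func copy() -> CounterStorage {
    let copy = CounterStorage()
    copy.value = value
    return copy
  }
}

// Value semantics: mutating the copy leaves the original untouched.
let original = SendableCounter()
var copy = original
copy.value = 42
```

The conformance only holds as long as the stored values are themselves Sendable, which nothing in this sketch enforces.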

vanvoorden · July 30, 2024, 7:32pm

https://mjtsai.com/blog/2019/02/06/why-swifts-copy-on-write-is-safe/

Here are some thoughts from Michael Tsai on the thread-safety of isKnownUniquelyReferenced. This implies that the isUnique check can be robust when performed across multiple threads… but another question on top of that is will one instance be thread-safe when multiple threads are attempting to mutate the same copy. If we take it for granted that we pass CoW struct instances across concurrency domains with a copy operation… then yes… that copy operation will increment the reference count (which triggers a new storage reference on the next mutation).

If we had some way to pass this across concurrency domains without a copy (like passing as an inout)… there would need to be built-in protections to the CoW type to serialize reads and writes. It's possible that Swift 6 concurrency might have extra warnings or errors if an engineer tries to pass around an instance like that… but there might be unsafe workarounds where it is still possible.

To put it another way… if we pass collections like Array (or a custom CoW type) across concurrency domains with a pass-by-value copy then AFAIK we can trust isKnownUniquelyReferenced to do the correct thing for us. If we try to pass a collection like Array (or a custom CoW type) across concurrency domains with a pass-by-reference (like inout) then isKnownUniquelyReferenced does not protect us anymore.
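
The reference-count behavior described here can be observed directly. This is a small sketch with a hypothetical `Buffer` class standing in for a CoW type's storage; `withExtendedLifetime` is used only to make sure the second reference is still alive when the check runs.

```swift
// Hypothetical storage class, for illustration only.
final class Buffer {
  var value = 0
}

func uniquenessBeforeAndAfterSharing() -> (alone: Bool, shared: Bool) {
  var first = Buffer()

  // Only `first` refers to the object at this point.
  let alone = isKnownUniquelyReferenced(&first)

  // A second strong reference, like the one created when a CoW struct
  // value is copied into another concurrency domain.
  let second = first
  let shared = withExtendedLifetime(second) {
    isKnownUniquelyReferenced(&first)
  }

  return (alone, shared)
}

let result = uniquenessBeforeAndAfterSharing()
```

A pass-by-value copy always creates that second reference, so the next mutation on either side sees non-unique storage and copies it.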

cal · July 30, 2024, 8:18pm

Thanks! This makes sense to me!

So, isKnownUniquelyReferenced is thread-safe except in races on an individual variable. For example:

var x = [ 1, 2, 3 ]
q.async { x.append(4) }
q.async { x.append(5) }

However, simultaneous access to the same variable is completely disallowed in Swift 6 with strict concurrency checking:

var x = [ 1, 2, 3 ]
q.async { x.append(4) } // error: mutation of captured var 'x' in concurrently-executing code
q.async { x.append(5) } // error: mutation of captured var 'x' in concurrently-executing code

This means that it is not possible to have a data race in a copy-on-write type implemented with isKnownUniquelyReferenced when using strict concurrency checking. Or, in other words, that a copy-on-write type like this is Sendable.
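
For contrast, here is a sketch of the by-value form that strict concurrency checking is expected to accept. The function name and queue label are made up for illustration; the capture list `[x]` copies the array value into each closure, so each closure mutates its own copy.

```swift
import Dispatch

// Hypothetical helper: runs two mutations on copies of `x` and returns
// the caller's own array afterwards.
func appendOnQueueByValue() -> [Int] {
  let q = DispatchQueue(label: "example.concurrent", attributes: .concurrent)
  let group = DispatchGroup()
  var x = [1, 2, 3]

  // `[x]` captures a copy of the value at closure-creation time.
  q.async(group: group) { [x] in
    var mine = x
    mine.append(4)  // storage is shared, so this mutation copies first
    precondition(mine == [1, 2, 3, 4])
  }
  q.async(group: group) { [x] in
    var mine = x
    mine.append(5)
    precondition(mine == [1, 2, 3, 5])
  }

  x.append(6)  // mutates only the caller's variable
  group.wait()
  return x
}
```

Neither closure ever touches the variable `x` itself, only the value it held when the closure was created.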

Passing a value across concurrency domains without a copy (like passing as an inout) isn't possible today, right?

I see strict concurrency checking prevents this for inout today like you would expect:

import Foundation
let otherQueue = DispatchQueue(label: "test")

@MainActor
func testNonIsolated() {
  var array = [10]

  otherQueue.async {
    test(&array) // error: mutation of captured var 'array' in concurrently-executing code
  }
}

func test(_ array: inout [Int]) {
  array.append(1)
  print(array)
}

vanvoorden · July 30, 2024, 9:00pm

Well… yes (and no). When we talk about whether or not an arbitrary CoW type is (or is not) "thread-safe" I believe we are talking about the run-time behavior of the type (does the type have a way to serialize reads and writes). This is (often) closely related to (but may be orthogonal to) the compile-time behavior of the type.

If the Swift 6 Strict Concurrency compiler (in its default behavior) prevents this type from being used in an unsafe way… does that make it thread-safe? I think that might depend on how you semantically define thread-safety. If an engineer is blocked on using a type in an unsafe way… is the inherent "thread-unsafety" of the type a moot point?

My opinion here is that thread-safety of a type is something more primitive than the compile-time guardrails modern Swift gives to us. If an engineer shuts down (or works around) those guardrails and then mutates the type across threads… is our type thread-safe then? For example:

@MainActor
func testNonIsolated() {
  nonisolated(unsafe) var array = [10]

  otherQueue.async {
    test(&array)
  }
}

An engineer would usually have the option to mark any CoW value-type variable as nonisolated(unsafe) and then pass it by reference in a way that could lead to a data race. If Array was "canonically" thread-safe… these mutations across threads would also be safe.

vns · July 30, 2024, 9:24pm

I'd say yes. The concept of thread-safety is quite broad and is not limited to a type's ability to protect its state internally. Swift is kind of unique in that way as well, bringing value types to the table, which remove the shared state that is so common with classes (passing a class between threads vs passing a struct has two completely different meanings). And being Sendable actually means the type is thread-safe.

Given that, IMO the example compares different concurrency issues. If instead of a variable of Array type you had a variable of an actor type, you'd still have concurrency issues on modification. Would that mean the actor type isn't thread-safe?

jrose · July 30, 2024, 9:28pm

The other piece is being careful about consistently using isKnownUniquelyReferenced, and correctly handling the false case. If there are bugs in that, all bets are off! You might accidentally have shared mutable state then. (The rule here is that every mutation can only happen on known-uniquely-referenced storage, and a subtle part of guaranteeing that is ensuring that the storage class instance does not ever escape its container to gain weak references.)
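
One way to follow that rule is to funnel every mutation through a single helper, so the uniqueness check cannot be forgotten in an individual setter. This is only a sketch: `Settings`, `SettingsStorage`, and `withUniqueStorage` are hypothetical names, not anything from the standard library.

```swift
// Hypothetical storage class; kept fileprivate so it cannot escape.
fileprivate final class SettingsStorage {
  var value = 0
  var label = ""

  func copy() -> SettingsStorage {
    let copy = SettingsStorage()
    copy.value = value
    copy.label = label
    return copy
  }
}

struct Settings {
  private var storage = SettingsStorage()

  // The only path to mutation: the closure always receives storage that
  // was uniquely referenced at the time of the check.
  private mutating func withUniqueStorage(_ body: (SettingsStorage) -> Void) {
    if !isKnownUniquelyReferenced(&storage) {
      storage = storage.copy()
    }
    body(storage)
  }

  var value: Int {
    get { storage.value }
    set { withUniqueStorage { $0.value = newValue } }
  }

  var label: String {
    get { storage.label }
    set { withUniqueStorage { $0.label = newValue } }
  }
}

var a = Settings()
a.value = 1
a.label = "a"
var b = a
b.value = 2
```

The storage object is handed only to a non-escaping closure, which keeps it from leaking out of its container.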

Today, the compiler doesn’t give you the tools to guarantee safety when building a CoW type (short of stepping outside the guards as vns mentioned). People have made some nifty wrappers before to help with it, and Thread Sanitizer will probably catch most mistakes you might make if your test suite is filled out to race mutations from multiple threads.
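
A test that races mutations might look roughly like the sketch below, which could then be run with Thread Sanitizer enabled (for example via `swift test --sanitize=thread` in a package). `Cell`, `CellStorage`, and `raceMutationsOfCopies` are hypothetical names; each iteration mutates its own copy of one shared value.

```swift
import Dispatch

fileprivate final class CellStorage {
  var value = 0

  func copy() -> CellStorage {
    let copy = CellStorage()
    copy.value = value
    return copy
  }
}

// Hypothetical CoW type under test, marked @unchecked Sendable as in the thread.
struct Cell: @unchecked Sendable {
  private var storage = CellStorage()

  var value: Int {
    get { storage.value }
    set {
      if !isKnownUniquelyReferenced(&storage) {
        storage = storage.copy()
      }
      storage.value = newValue
    }
  }
}

func raceMutationsOfCopies(iterations: Int) -> Int {
  let shared = Cell()  // every iteration starts from a copy of this value

  DispatchQueue.concurrentPerform(iterations: iterations) { i in
    var local = shared   // by-value copy; storage is shared until mutated
    local.value = i + 1  // must trigger a copy, never write shared storage
    precondition(local.value == i + 1)
  }

  return shared.value  // stays 0 if no iteration wrote through
}
```

If a setter skipped the uniqueness check, the iterations would write to the same storage object, which is the kind of mistake Thread Sanitizer is designed to report.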

vanvoorden · July 31, 2024, 6:35am

This actually got me thinking… I have a similar pattern for my own CoW types… but I'm seeing a (potential) missing piece here WRT Sendable.

Suppose you start with your example and then try to make the CoW Sendable. Here is the error:

public struct MyCopyOnWriteStruct : Sendable {
  public init() {}

  public var value: Int {
    get {
      storage.value
    }
    set {
      copyStorageIfNecessary()
      storage.value = newValue
    }
  }

  private var storage = Storage()
  //          `- error: stored property 'storage' of 'Sendable'-conforming struct 'MyCopyOnWriteStruct' has non-sendable type 'Storage'

  private mutating func copyStorageIfNecessary() {
    if !isKnownUniquelyReferenced(&storage) {
      storage = storage.copy()
    }
  }
}

fileprivate final class Storage {
//                      `- note: class 'Storage' does not conform to the 'Sendable' protocol

  init() {}
  var value = 0

  func copy() -> Storage {
    let copy = Storage()
    copy.value = value
    return copy
  }
}

Ok… what if we mark Storage as Sendable?

fileprivate final class Storage : Sendable {
  init() {}
  var value = 0
  //  `- error: stored property 'value' of 'Sendable'-conforming class 'Storage' is mutable

  func copy() -> Storage {
    let copy = Storage()
    copy.value = value
    return copy
  }
}

Almost there… one more step:

fileprivate final class Storage : Sendable {
  init() {}
  nonisolated(unsafe) var value = 0

  func copy() -> Storage {
    let copy = Storage()
    copy.value = value
    return copy
  }
}

Ok… no errors. All good? What if we go on vacation and a new engineer comes here to add a new value to our CoW (trying to follow the same pattern we just did)?

final public class Item {
  var value: Int = 0
}

public struct MyCopyOnWriteStruct : Sendable {
  public init() {}

  public var value: Int {
    get { storage.value }
    set {
      copyStorageIfNecessary()
      storage.value = newValue
    }
  }
  
  public var item: Item {
    get { storage.item }
    set {
      copyStorageIfNecessary()
      storage.item = newValue
    }
  }

  private var storage = Storage()

  private mutating func copyStorageIfNecessary() {
    if !isKnownUniquelyReferenced(&storage) {
      storage = storage.copy()
    }
  }
}

fileprivate final class Storage : Sendable {
  init() {}
  nonisolated(unsafe) var value = 0
  nonisolated(unsafe) var item = Item()

  func copy() -> Storage {
    let copy = Storage()
    copy.value = value
    copy.item = item
    return copy
  }
}

Hmm… no errors… what if we put these same properties in a type that is not CoW?

public struct MyStruct : Sendable {
  public var value = 0
  public var item = Item()
  //         `- error: stored property 'item' of 'Sendable'-conforming struct 'MyStruct' has non-sendable type 'Item'
}

Ahh… there it is. By marking our Storage instance variables as nonisolated(unsafe)… we opt-out of strict concurrency checking and enable our Storage to be Sendable. But this is a big hammer… we would actually prefer something with a little more control behind it.

We want our instance variable to be unsafe WRT our own access (so we opt out of Strict Concurrency checking).
We want our instance variable to be safe WRT the underlying Sendable conformance of the type of the instance variable itself (so we do not opt out of Strict Concurrency checking).

Which gives us a dilemma… do we make this type compatible with Sendable (and let the engineer manage the risk of losing Strict Concurrency errors when passing in a type that is not Sendable)?

As of right now… I'm not sure we can get one without the other. It feels like we are missing one additional dimension of nonisolated specificity. Hmm… any ideas about how else to work around this in 5.10?

vns · July 31, 2024, 7:07am

For that case there should be a set of test cases that covers sendability and ensures that the type has stayed thread-safe, since there is an unsafe opt-out from concurrency checks.

Not sure if it is possible to diagnose with Swift (it might be, as it is possible to write Swift code that diagnoses it), but I have had an idea of a SendableGuard type for such cases, as it boils down to whether the underlying type is Sendable (inspired by Array, which relies on that):

struct SendableGuard<T>: Sendable where T: Sendable {
    var value: T
}

fileprivate final class Storage : Sendable {
  init() {
  }

  // that's fine, `Int` is `Sendable`
  var value: Int { 
    get { valueGuard.value } 
    set { valueGuard.value = newValue } 
  }
  private nonisolated(unsafe) var valueGuard = SendableGuard<Int>(value: 0)

  // both of these properties raise an error in Swift 6 because `Item` is not `Sendable`
  var item: Item { 
    get { itemGuard.value }
    set { itemGuard.value = newValue }
  } 
  private nonisolated(unsafe) var itemGuard = SendableGuard<Item>(value: Item())

  func copy() -> Storage {
    let copy = Storage()
    copy.value = value
    return copy
  }
}
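
The Array-style approach mentioned above can also be sketched as a generic container whose `@unchecked Sendable` conformance is conditional on the wrapped type. `CoWBox`, `BoxStorage`, and `requireSendable` are hypothetical names, and the rejected call is left commented out on the assumption that the code is built in Swift 6 mode (earlier modes may only warn).

```swift
fileprivate final class BoxStorage<Value> {
  var value: Value
  init(_ value: Value) { self.value = value }
}

// Hypothetical generic CoW container.
public struct CoWBox<Value> {
  private var storage: BoxStorage<Value>

  public init(_ value: Value) {
    storage = BoxStorage(value)
  }

  public var value: Value {
    get { storage.value }
    set {
      if isKnownUniquelyReferenced(&storage) {
        storage.value = newValue
      } else {
        // Shared storage: replace it instead of writing through.
        storage = BoxStorage(newValue)
      }
    }
  }
}

// Sendable only when the wrapped type is, mirroring how Array is declared.
extension CoWBox: @unchecked Sendable where Value: Sendable {}

// Compiles only for Sendable arguments.
func requireSendable<T: Sendable>(_ value: T) -> T { value }

final class Item { var value = 0 }

let accepted = requireSendable(CoWBox(1))
// requireSendable(CoWBox(Item()))  // expected to be rejected: Item is not Sendable

var box = CoWBox(1)
let snapshot = box
box.value = 2
```

The unchecked part covers only the container's own CoW discipline, while the `where` clause keeps the compiler checking the wrapped type.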

vanvoorden · July 31, 2024, 7:59am

Unit tests would be good… but I'm not sure what a run-time unit-test to check for Sendable even looks like exactly. We know from SE-0302:

A marker protocol cannot be named as the type in an is or as? check (e.g., x as? Sendable is an error).

The SendableGuard type is an interesting idea… but I'm also mindful of the performance impact from introducing a new type (and forwarding the getters and the setters through that new type). Does it even need to be a type? What if it was just a function?

fileprivate func sendableGuard<T>(_ value: T) -> T where T : Sendable { value }

final public class Item {
  var value: Int = 0
}

public struct MyCopyOnWriteStruct : Sendable {
  public init() {}

  public var value: Int {
    get { sendableGuard(storage.value) }
    set {
      copyStorageIfNecessary()
      storage.value = newValue
    }
  }
  
  public var item: Item {
    get { sendableGuard(storage.item) }
    //    `- error: type 'Item' does not conform to the 'Sendable' protocol
    set {
      copyStorageIfNecessary()
      storage.item = newValue
    }
  }

  private var storage = Storage()

  private mutating func copyStorageIfNecessary() {
    if !isKnownUniquelyReferenced(&storage) {
      storage = storage.copy()
    }
  }
}

fileprivate final class Storage : Sendable {
  init() {}
  nonisolated(unsafe) var value = 0
  nonisolated(unsafe) var item = Item()

  func copy() -> Storage {
    let copy = Storage()
    copy.value = value
    copy.item = item
    return copy
  }
}

That function should (I hope) also be inlined at compile-time so we are net-neutral for instructions (no additional work) and no additional binary size or memory for a new type.

vns · July 31, 2024, 8:25am

I meant ensuring thread-safety in general. Not an easy task, of course, but possible. Checking for Sendable conformance, even if that were possible, is probably a more esoteric version of that.

For me a type seems the better choice, since you effectively restrict what can be used at the storage level, while with a function more effort needs to be put into ensuring it is called within every getter. Of course, with the type one can also simply not use it, but I'm sure you can play with it a bit more and make the compiler force its use (macros could be useful here?), or at least we can actually write unit tests that check the types of all the Storage properties.
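
A possible middle ground, sketched under the assumption of Swift 5.10 or later (for `nonisolated(unsafe)` on stored properties): keep the function-based guard, but call it once per stored property in the storage initializer instead of in every getter. `assertSendable` and `CheckedStorage` are hypothetical names.

```swift
// Hypothetical helper: compiles only when `T` is Sendable and does
// nothing at run time.
fileprivate func assertSendable<T: Sendable>(_: T.Type) {}

fileprivate final class CheckedStorage: Sendable {
  nonisolated(unsafe) var value = 0
  nonisolated(unsafe) var name = ""

  init() {
    // One line per stored property, checked at compile time. Adding a
    // property of a non-Sendable type and listing it here is expected
    // to fail to compile in Swift 6 mode.
    assertSendable(type(of: value))
    assertSendable(type(of: name))
  }

  func copy() -> CheckedStorage {
    let copy = CheckedStorage()
    copy.value = value
    copy.name = name
    return copy
  }
}

let checked = CheckedStorage()
checked.value = 3
let duplicate = checked.copy()
```

This still relies on whoever adds a property remembering to list it, so it narrows the gap rather than closing it.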

As for the performance impact, it is a nuanced topic IMO. There are probably ways to eliminate the cost of wrapping in a struct (if there is any) should that be needed.