Exposing the Memory Locations of Class Instance Variables

lorentey · November 12, 2019, 4:30am

I think we need to introduce (at least rudimentary) support for concurrency primitives before Swift gains support for move-only types.

Indeed -- atomics and other synchronization constructs are the only use cases we care about.

No and no! However, any names we introduce now for the interim types won't be (easily) available to the eventual "proper" implementations of these concepts. This is a good argument for making these interim types not too fancy.

A set of explicitly unsafe UnsafeAtomicFoo types that are boring no-nonsense wrappers around unsafe pointers is still vastly preferable to not having atomics at all -- and it would leave the AtomicFoo names available for properly move-only atomics later.

Joe_Groff:

I suppose that means you could say something similar about object instance memory, that the instance storage's ground state is unmanaged memory, and that property accessors temporarily present a managed, exclusivity-governed interface to part of the instance memory. That at least brings the problem of specifying what happens when you interleave formal property accesses with other raw forms of access down to being the same problem with pointee (which is similarly ill-specified; I wouldn't recommend people interleave pointee accesses with os_unfair_lock_* API calls on a pointer either!)

[...]

However, when you're using C or C++ to operate on those primitives, you're only using C/ObjC/C++ calls to interact with the storage for their underlying concurrency primitives, so there should not be overlapping formal Swift accesses and C operations outside of Swift's current purview on the same object. That's what I'm more concerned about.

It seems we're in full agreement here, if from slightly different viewpoints. Mixing "regular" and "atomic" access is a big no-no, independent of what construct we use to implement storage. One of my base expectations is that any interim concurrency solution still needs to fully protect against such mixed access.

Custom destructors would definitely be nice to have for things like POSIX mutexes, but I don't mind waiting until move-only types to get them. The "address of ivar" approach acts as a compromise between raw memory and custom destructors. While it would not let us automatically call pthread_mutex_destroy on destruction, at least it would still give us the default nontrivial destructor behavior for things such as atomic reference types.

(It also allows debuggers and heap analysis tools to (easily) understand that these things are always initialized (like ivars), and to figure out through the regular reflection facilities if some of them hold strong references.)
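For contrast, here is a rough sketch of the raw memory route that is available today, where a class-level deinit stands in for the missing custom destructor. PthreadMutex and Counter are hypothetical names made up for this illustration, not proposed API.

```swift
import Foundation

// Hypothetical wrapper: the mutex lives in manually allocated raw memory,
// and the owning class is responsible for destroying it.
final class PthreadMutex {
    private let raw: UnsafeMutablePointer<pthread_mutex_t>

    init() {
        raw = UnsafeMutablePointer<pthread_mutex_t>.allocate(capacity: 1)
        raw.initialize(to: pthread_mutex_t())
        let status = pthread_mutex_init(raw, nil)
        precondition(status == 0, "pthread_mutex_init failed")
    }

    deinit {
        // This is the cleanup that an "address of ivar" approach
        // could not perform automatically.
        pthread_mutex_destroy(raw)
        raw.deinitialize(count: 1)
        raw.deallocate()
    }

    func withLock<R>(_ body: () throws -> R) rethrows -> R {
        pthread_mutex_lock(raw)
        defer { pthread_mutex_unlock(raw) }
        return try body()
    }
}

// Hypothetical client: every access to `value` goes through the mutex.
final class Counter {
    private let mutex = PthreadMutex()
    private var value = 0

    func increment() -> Int {
        return mutex.withLock {
            value += 1
            return value
        }
    }
}
```

The cost is an extra allocation per mutex and a deinit that every such class has to get right by hand. Nothing in the instance layout tells a debugger or heap tool what the pointer refers to.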

This is very much possible. My problem is that this doesn't satisfy my base expectation that any API we add needs to protect against mixed use of atomic and non-atomic operations -- mixing atomic operations along with things like pointee is very much the opposite of that:

let pointer: UnsafeMutablePointer<Int> = ...
// We should only be able to spell one of these, but not both:
let value1 = pointer.pointee
let value2 = pointer.atomicLoad(ordering: .relaxed)

This seems more appropriate to be an internal implementation detail than any actual public API.

A viable alternative would be to provide atomic operations on a trivial wrapper type around a pointer value; I had implemented this previously, and while it is unsafe, I still find it vastly preferable to the "method soup" approach. It also has the nice property that the AtomicInt etc. names would remain available for the eventual move-only approach.

let pointer: UnsafeMutablePointer<Int> = ...
let value1 = pointer.pointee

let atomicInt = UnsafeAtomicInt(pointer)
let value2 = atomicInt.load(ordering: .relaxed)
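To make the shape of such a wrapper a bit more concrete, here is a minimal sketch. It is deliberately not lock-free: a single NSLock stands in for real atomic instructions, so the ordering arguments are accepted and ignored. The cases of the ordering enums, the AtomicUpdateOrdering type, and the demoUnsafeAtomicInt function are assumptions made up for this illustration.

```swift
import Foundation

// Hypothetical ordering types; the lock-based stand-in below ignores them.
enum AtomicLoadOrdering { case relaxed, acquiring, sequentiallyConsistent }
enum AtomicStoreOrdering { case relaxed, releasing, sequentiallyConsistent }
enum AtomicUpdateOrdering { case relaxed, acquiringAndReleasing, sequentiallyConsistent }

// A trivial, explicitly unsafe wrapper around a pointer. The caller owns the
// memory and must keep it alive for as long as the wrapper is in use.
struct UnsafeAtomicInt {
    // Stand-in for hardware atomics: one lock shared by all instances.
    private static let emulationLock = NSLock()

    private let pointer: UnsafeMutablePointer<Int>

    init(_ pointer: UnsafeMutablePointer<Int>) {
        self.pointer = pointer
    }

    func load(ordering: AtomicLoadOrdering) -> Int {
        UnsafeAtomicInt.emulationLock.lock()
        defer { UnsafeAtomicInt.emulationLock.unlock() }
        return pointer.pointee
    }

    func store(_ value: Int, ordering: AtomicStoreOrdering) {
        UnsafeAtomicInt.emulationLock.lock()
        defer { UnsafeAtomicInt.emulationLock.unlock() }
        pointer.pointee = value
    }

    func wrappingIncrement(ordering: AtomicUpdateOrdering) {
        UnsafeAtomicInt.emulationLock.lock()
        defer { UnsafeAtomicInt.emulationLock.unlock() }
        pointer.pointee &+= 1
    }
}

// Hypothetical usage: allocate storage, use it only through the wrapper,
// then clean up manually.
func demoUnsafeAtomicInt() -> Int {
    let pointer = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    pointer.initialize(to: 41)
    defer {
        pointer.deinitialize(count: 1)
        pointer.deallocate()
    }
    let atomicInt = UnsafeAtomicInt(pointer)
    atomicInt.wrappingIncrement(ordering: .relaxed)
    return atomicInt.load(ordering: .relaxed)
}
```

Nothing here prevents someone from also reading pointer.pointee directly, which is why the wrapper stays explicitly unsafe. What it does achieve is keeping the atomic operations off UnsafeMutablePointer itself.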

If we're happy with this approach, we should find a way to let us use these types to declare properties in class types. Joe's @RawStorage and PointerToStorage constructs (names to be bikeshed) would let us do that:

struct UnsafeAtomicInt: PointerToStorage {
  typealias Storage = Int
  let _storage: UnsafeMutablePointer<Storage>
  init(storage: UnsafeMutablePointer<Storage>) {
    _storage = storage
  }
  func load(ordering: AtomicLoadOrdering) -> Storage { ... }
  func store(_ value: Storage, ordering: AtomicStoreOrdering) { ... }
  ...
}

class Foo {
  @RawStorage var counter: UnsafeAtomicInt
}

foo.counter.wrappingIncrement(ordering: .relaxed)
print(foo.counter.load(ordering: .relaxed))

Question: how would we set an initial value for counter above? The obvious answer would be to require that @RawStorage properties get initialized with values of their Storage types.

@RawStorage var a: UnsafeAtomicInt = 42
@RawStorage var b: UnsafeAtomicInt

init() {
  self.b = 23
}

Is this weird?

Double-word atomics introduce more complications -- they sometimes distinguish between the `Storage` type and the logical `Value` that is returned by `load()`. (For example, we expect a fully general strong `AtomicReference` would use a `(T?, Int)` tuple for storage, but it would preferably load/store `T?` values.) This is mostly irrelevant at this point, except it complicates initialization, too -- ideally we'd want to use the `Value` type for initializing these rather than `Storage`. But we can live without all this: the `Value`/`Storage` distinction can definitely wait until we have move-only types.