SE-0282: Low-Level Atomic Operations

lorentey · April 18, 2020, 3:56am

We can choose to have the @RawStorage discussion right here on this thread, too, of course. (It feels like inline storage is a prerequisite of usable atomics, but it isn't really part of the proposal text, and it seems largely orthogonal to atomics in general. I worry that all this talk about memory management has derailed the review, and that it is scaring away folks who would want to discuss the minutiae of atomic operations.)

The problem is that we want to carve out storage space within class instances to store atomic values. To do this, we need a way to reliably retrieve the address of their memory location.

In a nutshell, I see three potential approaches:

1. Fix the & conversion somehow to keep the syntax everyone is trying to use right now
2. Add keypath-based access to storage locations, such as the MemoryLayout.unsafeAddress(of:in:) method in the original addressable ivars pitch
3. Add a magical attribute-based solution like @RawStorage

Let's quickly go over these one by one. One way to try fixing & would be to introduce an attribute to reject cases where the inout-to-pointer conversion doesn't use a direct storage location, such as @stableStorageLocation below:

extension Int {
  struct AtomicStorage { ... }
  static func atomicLoad(
    at address: @stableStorageLocation UnsafeMutablePointer<AtomicStorage>,
    ordering: AtomicLoadOrdering
  ) -> Int
}

Int.atomicLoad(at: &someComputedProperty, ordering: .relaxed) 
// error: 'atomicLoad' needs directly addressable storage for 'address'

(This is in the same ballpark as the @_nonEphemeral attribute we have right now, but it produces errors, not warnings, and it allows the use of & when it happens to generate a pointer that is "safe" to escape.)
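To make the "direct storage location" distinction concrete, here is a minimal sketch of how today's inout-to-pointer conversion behaves with a stored versus a computed property. Box, bump, and setterCalls are hypothetical names made up for this illustration. With the computed property, the callee only ever sees the address of a temporary, and the result is copied back through the setter after the call returns.

```swift
struct Box {
    var stored = 0
    var setterCalls = 0   // hypothetical counter, only here to observe the writeback

    var computed: Int {
        get { stored }
        set {
            stored = newValue
            setterCalls += 1
        }
    }
}

// Stand-in for a pointer-taking API such as atomicLoad(at:ordering:) above.
func bump(_ p: UnsafeMutablePointer<Int>) {
    p.pointee += 1
}

var box = Box()

// Stored property: the pointer can refer directly to the storage of box.stored.
bump(&box.stored)

// Computed property: the getter fills a temporary, bump mutates that temporary,
// and the setter copies the result back once the call returns.
bump(&box.computed)

print(box.stored, box.setterCalls)
```

An atomic operation performed on such a temporary would be meaningless, which is the kind of call an attribute like @stableStorageLocation would have to reject.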

The problem, of course, is that the & syntax implies that the variable is being mutated, and write access conflicts with atomic access:

class Counter {
  var _value: Int.AtomicStorage
  
  func load() -> Int {
    // Blatant exclusivity violation: 
    // atomic access overlaps with a write access
    Int.atomicLoad(at: &_value, ordering: .relaxed) 
  }
}

I think this rules out &; saving it would require major surgery on the law of exclusivity. (We could try forcing it by saying that the inout-to-pointer conversion completes the write access before the function call begins, but that would be unlike how regular inout arguments work, and I think it would just lead to even more confusion.)

The MemoryLayout.unsafeAddress(of: \._value, in: self) idea in #2 above could get rid of the exclusivity violation, since we're free to define what sort of access (if any) it entails. However, the (very reasonable) feedback on the pitch thread was that this would be far too dangerous -- it would allow code to circumvent exclusivity checks on any ivar by simply switching to accessing it through unsafe pointer operations. So we should rather go with an opt-in approach, where ivars would be explicitly annotated with some attribute that exposes their storage location.
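For comparison, the standard library already has a struct-only relative of this idea in MemoryLayout.offset(of:). The sketch below uses a hypothetical Point type to show the keypath-to-address step. It is not the pitched API: it still goes through & (so the pointer is only valid inside the closure), and it does not cover class ivars, which is the gap the pitch was trying to fill.

```swift
struct Point {
    var x: Double
    var y: Double
    var sum: Double { x + y }   // computed, so it has no storage offset
}

var p = Point(x: 1, y: 2)

// offset(of:) returns nil when the key path does not name
// directly addressable storage within the value.
let yOffset = MemoryLayout<Point>.offset(of: \Point.y)
let sumOffset = MemoryLayout<Point>.offset(of: \Point.sum)

if let yOffset = yOffset {
    withUnsafeMutableBytes(of: &p) { bytes in
        // Compute the address of p.y from the base address and the offset.
        let raw = bytes.baseAddress! + yOffset
        raw.assumingMemoryBound(to: Double.self).pointee = 42
    }
}

print(p.y, sumOffset == nil)
```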

#3 is the obvious next step on that path: it hides the actual mechanics of extracting and passing around pointers behind an attribute that works a bit like property wrappers:

// This probably wouldn’t actually be a protocol; rather it would be a 
// compiler-enforced “shape” like @propertyWrapper.
protocol RawStorable {
  associatedtype Storage: RawStorage // comes with init(_:) and dispose()
  init(at: UnsafeMutablePointer<Storage>)
}

extension UnsafeAtomic: RawStorable {}

public class Counter {
  @raw var _value: UnsafeAtomic<Int>

  // _value is a computed property of type UnsafeAtomic<Int>
  // $_value is the underlying ivar of type UnsafeAtomic<Int>.Storage

  init(_ initialValue: Int) {
    $_value = .init(initialValue)
  }
  deinit {
    $_value.dispose()
  }
  func load() -> Int {
    _value.load(ordering: .relaxed)
  }
  // ...
}

A slightly (?) more elaborate version of this would hide $_value and let the compiler autogenerate the boilerplate-y initialization and disposal of the backing storage:

public class Counter {
  @raw var _value: UnsafeAtomic<Int>

  init(_ initialValue: Int) {
    _value = initialValue   // Note the weirdly mismatched types
  }
  // a call to dispose() is generated by the compiler at the end of deinit

  func load() -> Int {
    _value.load(ordering: .relaxed)
  }
  // ...
}

Either of these last two approaches gets us some of the practical benefits of move-only types without having to wait for their implementation. When Atomic<Int> becomes a thing, my hope is that code using UnsafeAtomic this way can upgrade to it with a simple, (more or less) mechanical migration step:

public moveonly struct Counter {
  var _value: Atomic<Int>

  init(_ initialValue: Int) {
    _value.init(initialValue)   // How exactly are we going to spell this?
  }

  func load() -> Int {
    _value.load(ordering: .relaxed)
  }
  // ...
}

I hope this explains why I’m against naked pointer-based methods like Int.atomicLoad(at:ordering:). We will need to tackle inline storage soon, and naked pointer APIs won’t fit into the most likely design for that. If we introduce them now, we will end up also introducing something like UnsafeAtomic later, and then we’d have two separate unsafe APIs for the exact same thing.

Of course, unsafe pointer-based atomics (either through UnsafeAtomic or direct pointer APIs) have some reason to exist in their own right, even after we introduce Atomic. They interoperate with manually malloced dynamic variables, ManagedBuffer, withUnsafeMutablePointer(to:), pointers coming from C, and any of the other weird & wonderful ways people may get hold of pointer values. UnsafeAtomic has an additional long-term benefit: I expect that retrieving the address of ivars within move-only types will be similarly difficult, so the eventual move-only Atomic type will most likely still use @raw and UnsafeAtomic in its internal implementation.
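As a rough illustration of the "manually malloced" case, here is a sketch of a class that keeps its value in out-of-line storage. HeapCell is a hypothetical name, and no atomic operation is performed here: the plain read and write below are not thread-safe and only mark the spot where a pointer-based atomic API would be handed the address.

```swift
final class HeapCell {
    // Out-of-line storage; its address stays fixed until deallocate().
    let address: UnsafeMutablePointer<Int>

    init(_ initialValue: Int) {
        address = UnsafeMutablePointer<Int>.allocate(capacity: 1)
        address.initialize(to: initialValue)
    }

    deinit {
        address.deinitialize(count: 1)
        address.deallocate()
    }
}

let cell = HeapCell(10)
let before = cell.address

// Plain, non-atomic access: a placeholder for where a pointer-based
// atomic operation would be given cell.address.
cell.address.pointee += 5

let after = cell.address
print(cell.address.pointee, before == after)
```

The address is stable and no & is involved, but it costs a separate allocation per value, which is the overhead that inline storage within the class instance is meant to avoid.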