Exposing the Memory Locations of Class Instance Variables

TellowKrinkle · November 10, 2019, 2:21am

Joe_Groff:

This is the sort of thing that it'd be nice to have explicit invocation of accessor coroutines for. If you were forced/strongly encouraged to access the property in a scoped access, using strawman syntax like this:
with lockAddr = self.anchoredProperty {
  use(lockAddr)
}

It's kind of disgusting, but we can technically already do it with this, right?

{ lockAddr in
	use(lockAddr)
}(&self.anchoredProperty)

Edit: Just realized that only works for _modify, not _read
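
For comparison, here is a minimal sketch of the same scoped-access idea using the standard library's withUnsafeMutablePointer(to:_:). The Holder class and its bump() method are hypothetical names used only for illustration. The pointer is valid only inside the closure, and Swift does not promise that it is the stored property's own address rather than a temporary that is written back afterwards.

```swift
final class Holder {
    // Hypothetical stored property standing in for `anchoredProperty` above.
    var anchoredProperty: Int = 0

    func bump() {
        // `&anchoredProperty` begins a formal modify access that lasts
        // for the duration of the call; `lockAddr` must not escape the closure.
        withUnsafeMutablePointer(to: &anchoredProperty) { lockAddr in
            lockAddr.pointee += 1
        }
    }
}
```

Because the access ends when the closure returns, this form cannot be used to hand out an address that stays stable across calls.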

Joe_Groff · November 11, 2019, 6:09pm

Like our other primitives, the transition from unmanaged to managed memory in these cases is temporary and well-scoped. pointee and Array.subscript provide managed mutable storage backed by unmanaged memory only for the duration of the access, essentially the inverse of withUnsafePointer. They go back to being unmanaged memory when you aren't accessing them.
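
A small sketch of what that looks like in practice, assuming a manually allocated Int (the function name scopedPointeeAccess is made up for this example). The memory is unmanaged before initialize and after deinitialize; each use of pointee is a short, scoped access in between.

```swift
func scopedPointeeAccess() -> Int {
    // Raw, unmanaged memory: nothing is tracked by Swift yet.
    let pointer = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    pointer.initialize(to: 41)
    defer {
        // Return the memory to its unmanaged state and free it.
        pointer.deinitialize(count: 1)
        pointer.deallocate()
    }

    // Each `pointee` use is a formal access that ends with the statement.
    pointer.pointee += 1
    return pointer.pointee
}
```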

I suppose that means you could say something similar about object instance memory, that the instance storage's ground state is unmanaged memory, and that property accessors temporarily present a managed, exclusivity-governed interface to part of the instance memory. That at least brings the problem of specifying what happens when you interleave formal property accesses with other raw forms of access down to being the same problem with pointee (which is similarly ill-specified; I wouldn't recommend people interleave pointee accesses with os_unfair_lock_* API calls on a pointer either!)

I think we're talking past each other here. I agree with everything you say, but I see what you're talking about as an orthogonal issue to what's being discussed here. I agree, you can use C and C++ constructs to establish semantic ordering constraints that aren't yet precisely defined in Swift, and we need to more precisely define what that means, as well as unlock the ability to express these constraints in Swift. However, when you're using C or C++ to operate on those primitives, you're only using C/ObjC/C++ calls to interact with the storage for their underlying concurrency primitives, so there should not be overlapping formal Swift accesses and C operations outside of Swift's current purview on the same object. That's what I'm more concerned about.

lorentey:

But I really really wouldn't want to treat synchronization constructs as unmanaged raw storage. These things aren't built out of special artisanal bits that are only meaningful to C++ code: they contain regular everyday types that we already model in the stdlib. They should be (and already are!) initialized/destroyed as regular Swift values. os_unfair_lock_s is imported as a boring struct holding an integer ivar; it's no more special than, say, NSRange is.

The only thing that makes these types special is the operations they expose for use between their init/deinit. Given that Swift is a "high-performance system programming language", my expectation is that I should be able to implement these operations directly in Swift, and if anything, this should be less difficult than it is in C++. Requiring people to mess around with raw pointers is definitely not the right direction for that.

I agree that non-copyable types would be the appropriate abstraction to represent these constructs. Unfortunately, we aren't currently modeling them in Swift. On the other hand, just as unfortunately, we're badly overdue for adding usable atomics, and I think it would be a mistake to delay it further.

I agree we should expose the ability to implement concurrency primitives directly in Swift, but without move-only types, I think we can only at best expose them as unsafe constructs over raw storage. The best "safe" API you could build over one would be a class wrapper, because objects are our only means for unique non-copyable data with destructors, and I think that's true whether we go with the "address of ivar" approach or the "raw storage" approach. It seems to me like anyone who wants to avoid that indirection is going to have to fall back to unsafe primitives either way, until move-only types give us a composable model for managing these things in inline storage.

Which is not to say this is a total dead end, though: it seems to me like either mechanism would still have a place in move-only structs as a primitive implementation mechanism, and they definitely make the situation better in the interim, since it at least becomes possible to inline raw memory into objects. I just don't think we can make it safe to do so yet.
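
To make the class-wrapper option concrete, here is a rough sketch of one way it is commonly written today, assuming a POSIX mutex. MutexBox and withLock are hypothetical names, not an API anyone in this thread proposed. The mutex lives in separately allocated memory, which is exactly the extra indirection being discussed.

```swift
import Foundation

final class MutexBox {
    // Separately allocated so the mutex has a stable address for its lifetime.
    private let raw: UnsafeMutablePointer<pthread_mutex_t>

    init() {
        raw = UnsafeMutablePointer<pthread_mutex_t>.allocate(capacity: 1)
        raw.initialize(to: pthread_mutex_t())
        let status = pthread_mutex_init(raw, nil)
        precondition(status == 0, "pthread_mutex_init failed")
    }

    deinit {
        // The object's deinit plays the role of the custom destructor.
        pthread_mutex_destroy(raw)
        raw.deinitialize(count: 1)
        raw.deallocate()
    }

    func withLock<R>(_ body: () throws -> R) rethrows -> R {
        pthread_mutex_lock(raw)
        defer { pthread_mutex_unlock(raw) }
        return try body()
    }
}
```

The class gives unique ownership and a place to call pthread_mutex_destroy, at the cost of one heap allocation for the object plus one for the mutex.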

Joe_Groff · November 11, 2019, 6:11pm

As I mentioned in my reply above, I think a mechanism like the one Karoy is proposing would be close to what we would want for move-only types, so it's not "only" a workaround. I would be concerned about taking up the "good names" for atomics, locks, and other standard library concurrency primitives before we get move-only types, though.

Joe_Groff · November 11, 2019, 6:12pm

TellowKrinkle:

It's kind of disgusting, but we can technically already do it with this, right?
{ lockAddr in
	use(lockAddr)
}(&self.anchoredProperty)
Edit: Just realized that only works for _modify, not _read

Sure. This should work for a _read as well, though it's hard to force the compiler not to copy and _read a temporary instead of the original memory today.

glessard · November 11, 2019, 9:47pm

I had an impression that this pitch was very much a workaround; thanks for clarifying.

If we need to wait three more years, it would suck not to have atomics.
If it lasts only one more year, the current workarounds aren't so bad.
In any case, I would vote to reserve the "good names" for the feature in its (intended) permanent form.

Joe_Groff · November 11, 2019, 9:50pm

Well, we should be able to expose atomics in one form or another as operations on UnsafePointer if nothing else. The biggest immediate issue that I see in Swift today is that there isn't a way to allocate pointable storage as part of class instances, which would be what Karoy's proposal addresses. Maybe we can still make an incrementally friendlier interface too.

lorentey · November 12, 2019, 4:30am

I think we need to introduce (at least rudimentary) support for concurrency primitives before Swift gains support for move-only types.

Indeed -- atomics and other synchronization constructs are the only use cases we care about.

No and no! However, any names we introduce now for the interim types won't be (easily) available to the eventual "proper" implementations of these concepts. This is a good argument for making these interim types not too fancy.

A set of explicitly unsafe UnsafeAtomicFoo types that are boring no-nonsense wrappers around unsafe pointers is still vastly preferable to not having atomics at all -- and it would leave the AtomicFoo names available for properly move-only atomics later.

Joe_Groff:

I suppose that means you could say something similar about object instance memory, that the instance storage's ground state is unmanaged memory, and that property accessors temporarily present a managed, exclusivity-governed interface to part of the instance memory. That at least brings the problem of specifying what happens when you interleave formal property accesses with other raw forms of access down to being the same problem with pointee (which is similarly ill-specified; I wouldn't recommend people interleave pointee accesses with os_unfair_lock_* API calls on a pointer either!)

[...]

However, when you're using C or C++ to operate on those primitives, you're only using C/ObjC/C++ calls to interact with the storage for their underlying concurrency primitives, so there should not be overlapping formal Swift accesses and C operations outside of Swift's current purview on the same object. That's what I'm more concerned about.

It seems we're in full agreement here, if from slightly different viewpoints. Mixing "regular" and "atomic" access is a big no-no, independent of what construct we use to implement storage. One of my base expectations is that any interim concurrency solution still needs to fully protect against such mixed access.

Custom destructors would definitely be nice to have for things like POSIX mutexes, but I don't mind waiting until move-only types to get them. The "address of ivar" approach acts as a compromise between raw memory and custom destructors. While it would not let us automatically call pthread_mutex_destroy on destruction, at least it would still give us the default nontrivial destructor behavior for things such as atomic reference types.

(It also allows debuggers and heap analysis tools to (easily) understand that these things are always initialized (like ivars), and to figure out through the regular reflection facilities if some of them hold strong references.)

This is very much possible. My problem is that this doesn't satisfy my base expectation that any API we add needs to protect against mixed use of atomic and non-atomic operations -- offering atomic operations alongside things like pointee is very much the opposite of that:

let pointer: UnsafeMutablePointer<Int> = ...
// We should only be able to spell one of these, but not both:
let value1 = pointer.pointee
let value2 = pointer.atomicLoad(ordering: .relaxed)

This seems more appropriate as an internal implementation detail than as an actual public API.

A viable alternative would be to provide atomic operations on a trivial wrapper type around a pointer value; I had implemented this previously, and while it is unsafe, I still find it vastly preferable to the "method soup" approach. It does also have the nice property that the nicer AtomicInt etc. names would remain available for the eventual move-only approach.

let pointer: UnsafeMutablePointer<Int> = ...
let value1 = pointer.pointee

let atomicInt = UnsafeAtomicInt(pointer)
let value2 = atomicInt.load(ordering: .relaxed)

If we're happy with this approach, we should find a way to let us use these types to declare properties in class types. Joe's @RawStorage and PointerToStorage constructs (names to be bikeshed) would let us do that:

struct UnsafeAtomicInt: PointerToStorage {
  typealias Storage = Int
  let _storage: UnsafeMutablePointer<Storage>
  init(storage: UnsafeMutablePointer<Storage>) {
    _storage = storage
  }
  func load(ordering: AtomicLoadOrdering) -> Storage { ... }
  func store(_ value: Storage, ordering: AtomicStoreOrdering) { ... }
  ...
}

class Foo {
  @RawStorage var counter: UnsafeAtomicInt
}

foo.counter.wrappingIncrement(ordering: .relaxed)
print(foo.counter.load(ordering: .relaxed))

Question: how would we set an initial value for counter above? The obvious answer would be to require that @RawStorage properties get initialized with values of their Storage types.

@RawStorage var a: UnsafeAtomicInt = 42
@RawStorage var b: UnsafeAtomicInt

init() {
  self.b = 23
}

Is this weird?

Double-word atomics introduce more complications -- they sometimes distinguish between the `Storage` type and the logical `Value` that is returned by `load()`. (For example, we expect a fully general strong `AtomicReference` would use a `(T?, Int)` tuple for storage, but it would preferably load/store `T?` values.) This is mostly irrelevant at this point, except that it complicates initialization, too -- ideally we'd want to initialize these with the `Value` type rather than `Storage`. But we can live without all this: the `Value`/`Storage` distinction can definitely wait until we have move-only types.
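
To illustrate the `Value`/`Storage` split only, here is a non-atomic sketch of how the two types might be related. Everything in it is hypothetical: the protocol AtomicStorageConvertible, the VersionedReference type, and the meaning of the Int as a version counter are assumptions made for this example, and no atomic operations are performed.

```swift
// Hypothetical protocol: maps a logical Value to the Storage kept in memory.
protocol AtomicStorageConvertible {
    associatedtype Storage
    associatedtype Value
    static func encode(_ value: Value) -> Storage
    static func decode(_ storage: Storage) -> Value
}

// Hypothetical double-word layout: a reference plus an integer,
// while callers only ever see the optional reference.
enum VersionedReference<T: AnyObject>: AtomicStorageConvertible {
    typealias Storage = (reference: T?, version: Int)
    typealias Value = T?

    static func encode(_ value: T?) -> Storage {
        // Assumption: a fresh value starts at version 0.
        return (reference: value, version: 0)
    }

    static func decode(_ storage: Storage) -> T? {
        return storage.reference
    }
}

final class Node {}
```

With a mapping like this, an initializer could accept a `Value` and run it through encode, which is the convenience the paragraph above says can wait.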