Low-Level Atomic Operations

Ben_Cohen · April 7, 2020, 12:53am

Maybe the conversation has moved on, but I'd find the wedging of the word Pointer into this a needless bit of verbosity (not to mention UnsafePointerTo is a really tortured term) and much prefer the original names. The word Unsafe is right there in your face as a clear indication you need to understand what you're doing and suffices to cover all the bases including the fact that these are reference types.

jrose · April 7, 2020, 1:03am

I don't see how "unsafe" implies "reference". There's no referencyness in, say, unsafelyUnwrapped.

Ben_Cohen · April 7, 2020, 1:08am

I didn't say it implied reference. I said it implied you need to know what you are doing, including whether it's a value or reference type. That said, I don't think comparing methods and types is much of a guide either way.

lorentey · April 7, 2020, 1:32am

(I'll just innocently whisper into this parenthetical that I find Unmanaged to be a terrific name for an unsafe reference construct, and it carries neither an Unsafe honorific nor an in-your-face indication that it is referency. While we are at it, most class names do not scream "I'm a reference type, tremble before me", either.

We should have more names like Unmanaged.)

Joe_Groff · April 7, 2020, 1:39am

lorentey:

Doing this would make it much more difficult to implement wrappers like the IUO thing above:
struct UnsafePointerToAtomicImplicitlyUnwrappedOptional<Wrapped: NullableAtomic> {
  func isNil(ordering: AtomicLoadOrdering) -> Bool
  func load(ordering: AtomicLoadOrdering) -> Wrapped // traps on nil
  ...
}
(These will be plenty difficult already, what with all the constant-constrained arguments...)

Without a standalone protocol, we'd need yet another language extension to support generalized conformance requirements:

IUO is intentionally not a first class type in the core language, so I'm not sure that abstractions that try to make it so for atomics are a compelling use case, but if someone thought so, it seems like it could be done like this:

struct AtomicIUO<O: AtomicProtocol> { ... }

extension<Wrapped> AtomicIUO where O == Optional<Wrapped> { /* optional-specific API */ }

I'll say again that anything beyond ints is really "nice to have", so I'm hesitant to double the API surface for convenience features.

lorentey · April 7, 2020, 2:01am

This pitch started out with an extremely limited set of dedicated unsafe atomic types covering the bare minimum: integers, optional mutable pointers and optional unmanaged references. It has gained a protocol-oriented design as a result of continued feedback, and I think it has become better for it.

Optional pointer/reference types have always been part of (what I consider to be) the core atomics feature set, and I think it would be a mistake to remove them.

It's definitely true that NullableAtomic increases the API surface for primitive atomics. However, given that this is a separate module, I don't feel this is particularly salient.

AtomicProtocol and NullableAtomic aren't expected to be implemented outside the standard library. What if we underscored their requirements? (Possibly keeping the storage setup/teardown methods, since user code may sometimes need to call those.)

Joe_Groff · April 7, 2020, 2:17am

Maybe we could also do what @jrose had alluded to and accept only the "core" part of the proposal into the core language as forever ABI, and keep some of the more convenience parts of it in packages until we can implement them in a way that doesn't commit us to a larger core API surface.

GalCohen · April 7, 2020, 3:13am

lorentey:

This is a feature that scratches at language limitations at practically every turn -- I think it's only barely possible to do it in the language we have, but that just makes it even more timely to get it done. For example, here is a partial list of potential language features that may make atomics work better, just from the last couple days:

Move-only types (or even non-movable types)

Addressable instance variables

Type system support for compile-time evaluable functions

Parameterized extensions

Multiple conditional conformances to the same protocol

Non-frozen but fixed-layout enum types

Context-aware property wrappers

Protocols that forbid unspecialized use

Functions that can't be called through function references

Back-deployable types and protocols

Protocol conformances with availability

Back-deployable protocol conformances

Don’t want perfect to be the enemy of good but... Are there any in here that are worth waiting longer on? Maybe some things need to be prioritized? I’m just making sure these questions are asked. We wouldn’t want to look back and regret not being able to change this feature because some other language feature wasn’t available at the time.

gribozavr · April 7, 2020, 7:46am

+1, it is a lot more trouble than the value it brings.

I really don't want to add more digressions if I can help it, but I'll try to clarify the paragraph. (These cannot be frozen enums because that would prevent us from adding more cases, but regular resilient enums can't freeze their representation, and the layout indirection interferes with guaranteed optimizations, especially in -Onone .)

Thanks, that would be a good explanation!

However, I wanted to say that "regular resilient enums can't freeze their representation" seems like another language limitation, but this time it is one that we have mostly already overcome. There is already a concept of a layout constraint (swift.git/include/swift/AST/Decl.h: RequirementRepr::getLayoutConstraint()) -- it currently only exists as a generic constraint, and is only usable in the @_specialize attribute (see swift.git/test/attr/attr_specialize.swift) but with a bit of plumbing through the compiler it should be possible to allow adding layout constraints to a type. The constraint we are interested in here is LayoutConstraintKind::TrivialOfExactSize.

That's nice, sorry for assuming incorrectly!

Right -- there's no guarantee that code gets specialized. In some cases (for example, a generic method in a protocol existential) we can't specialize using our current techniques at all, unless we devirtualize the existential.

With all due respect, I don't think that's true :) Let me explain the reasoning behind the Unmanaged name.

I don't think it is recorded anywhere, but I remember that when Unmanaged was introduced, its name (and the names of the methods in it, like passUnretained) were a point of contention within the core team. The justification, IIRC, was that "unmanaged" is a term of art that only applies to references, so it does scream "I'm a reference type" if you understand what the word means. There can't be an "unmanaged value". It is also a term of art that is understood to imply "unsafe" in Swift -- because there's no such thing as a "safe unmanaged" reference.

Therefore, "unmanaged" is an adjective that semantically only applies to reference types (Unmanaged<Int> is nonsense) and it implies "unsafe" within the context of Swift. Since you don't feel like the word "unmanaged" expresses those things even though it was very much intended to, maybe it is not such a great name in retrospect, which I think contradicts your point about "unmanaged" being a good example to follow.

I see -- makes sense. I agree, moving an atomic value is safe when it is not actually used by multiple threads. From a practical point of view though, it might not be a very useful operation, and like you said, maybe it is not safe enough in Swift. In Rust, unlike in Swift, the borrow checker ensures that the thread that performs the move has unique ownership.

gribozavr · April 7, 2020, 8:01am

We should not be using the word "unsafe" as a liability waiver. Our goal must be designing the best API, not trying to figure out a way to say "I told you so, should have been more careful".

I'm arguing that the word "Pointer" is needed because UnsafeAtomicPointer and UnsafeMutablePointer are actually similar types, therefore following the existing naming pattern helps understanding where the new types fit in.

Consider the time when we get the safe Atomic<T> type. We would have Atomic<T> and UnsafeAtomic<T> available at the same time, with completely different ownership and value/reference semantics. Adding the word "unsafe" to a type should not create such a dramatic difference.

Compare UnsafeMutablePointer and UnsafeAtomicPointer to C:

const int *x;
_Atomic int *y;

(yes, I replaced "mutable" with "const" to make it a fair comparison because the default constness polarity in C is different) Both declarations have exactly the same shape, but "const" is replaced with "_Atomic". I think most people would agree that removing the * in the atomic declaration would be breaking the pattern and misrepresenting the nature of the type?

not to mention UnsafePointerTo is a really tortured term

I don't understand why you consider it to be "tortured", it sounds like plain English to me. "ptr is an unsafe pointer to an atomic integer."

The reason to use UnsafePointerToAtomic<T> instead of UnsafeAtomicPointer<T> would be to emphasize that the pointee is atomic, not the pointer value itself. Both of them are useful types, but very different types. We could conceivably have both in the library, so maybe we should disambiguate.

Karl · April 7, 2020, 10:34am

This is what I was getting at with the idea of an atomic namespace/wrapper on UnsafeMutablePointer with only core operations. I wouldn’t expect many people to use them, or for it to be the end of our atomics API. But it provides a stable and useful base to build easier-to-understand abstractions (like atomic memory locations) on top of.

It’s clear that we don’t have the language features necessary for the ideal design, so I still think that approach has merits. I think UMP.atomic.load(...) is obscure enough that people won’t abuse it.

We can then build an atomics preview package on top of those stable primitives, and that can improve as language capabilities improve.

Ben_Cohen · April 7, 2020, 3:06pm

This is exactly what it is (at least, if rephrased in a way that avoids the negative spin).

With all Swift's unsafe-labelled constructs there are so many things you have to understand before you should mess with them, and you can't fix this with naming. Their reference semantics are just the tip of the iceberg. Adding words alluding to some of them will help no-one. Instead they burden people who already know the pitfalls without aiding those who don't. There is no escaping this*: you need to read the documentation.

I'm arguing that the word "Pointer" is needed because UnsafeAtomicPointer and UnsafeMutablePointer are actually similar types

This similarity is extremely superficial. Their behavior is different, their use is different, their goals are different, their API is different. If you're saying they should have a name to emphasize this similarity, I think this is a point against, because they are not similar. They might be similar in C, but Swift is not C.

I don't understand why you consider it to be "tortured", it sounds like plain English to me.

Type names that read as plain English don't seem like a good goal to me. To invent a personal rule of thumb, if you see a preposition in a type name, something has gone wrong.

It's also not consistent with other parts of the language. We don't have UnsafePointerToBuffer. Maybe that would be better, of course, if we had our time again. But if we did, I'd wish for UnsafeBuffer. My experience having written a lot of unsafe code is having to type that word Pointer over and over again is quite a chore, with zero benefit to beginners and experienced Swifters alike. It might even encourage me to use pointers instead of buffers because I'm sick of the long name.

* unlike base addresses of buffers... there's plenty of escaping those alas

lorentey · April 7, 2020, 9:25pm

[begin off topic rant]

I couldn't agree more. Labeling the new unsafe atomic types as pointers is misleading; these aren't pointers, they just happen to contain one.

My problem with the "term of art" backdoor in our naming scheme is that it only allows reusing other people's ideas -- it doesn't allow us to come up with good labels on our own. Terms of art are born when someone is brave/artful enough to introduce a radical new name for something, and it is good enough to catch on. This is forbidden in Swift. (Bad example, but imagine if we didn't already have the word "byte", and someone dared to pitch it on the forums. The nerve! The obviously correct name is IntegerTypeCorrespondingToTheSmallestAddressableUnitOfMemory.)

Similarly, if someone (was it C#?) wasn't brave enough to bring the managed/unmanaged terminology into the mainstream, Unmanaged would now be called UnsafeUnownedReference (or something equally unimaginative), and the stdlib would be a little bit worse.

Swift has a bunch of novel and subtle concepts that cry out for good names; instead, we tend to use polysyllabic phrases that are laser-focused on precisely explaining a single aspect -- which isn't even necessarily the most important one in actual practice. (I don't want to pick on anyone's names in particular, but the abstractions behind @usableFromInline and @_alwaysEmitIntoClient stand out in my daily work as something other languages could probably borrow. I doubt these names will ever catch on, though. Same goes for the clinical UnicodeScalar; bolder library designers have come to call this Rune, and that choice seems to be catching.)

I'm not saying UnsafeAtomic is on par with "byte" or "string"; it's not even close. But this is the itch I was trying to scratch with my "Handle" suggestion.

[end off topic rant]

Phew. Having said all that, I don't feel this is the right topic for this discussion. I'm not really interested in litigating our naming conventions here.

UnsafePointerToAtomic<Value> doesn't feel like a great name to me, but given that At Some Point it will be largely replaced by the move-only Atomic<Value>, I don't feel it's particularly important to press this point. (In fact, I rather like how it's a prefix that can be mechanically added to any new type that wants to model a similar construct.)

I pushed a commit that adds the UnsafePointerTo prefix.

lorentey · April 8, 2020, 1:26am

There is no real risk of confusion here, so I think this distinction is best explained in documentation and the list of requirements. I see nothing wrong with saying that Int is an "atomic type"; on the contrary, I'd like to encourage that.

AtomicPrimitive/PrimitiveAtomic would imply that conforming types are expected to implement requirements directly, which is not true in the case of custom RawRepresentable atomic types (and optionals).

I'm reserving the name Atomic for the eventual move-only generic struct, which is why I'm calling the protocol AtomicProtocol. If we allowed a bit of whimsy, we could also call it Atomicable.

Note though that there would be value in adding a separate PrimitiveAtomic protocol:

protocol PrimitiveAtomic: AtomicProtocol where AtomicStorage == Self {}

Types conforming to this protocol must be directly atomicable, i.e., converting them to their atomic representation must consist of a regular copy (or move) operation. (This is a slightly stronger constraint than what's expressed by AtomicStorage == Self.)

I don't think I quite get what failure case you want to protect against here. The word "nullable" has no established meaning in the Swift Standard Library, and this protocol doesn't change how pointer types work in Swift. ("Null" in the context of Swift generally refers to the Unicode U+0000 NULL character, as in "null-terminated UTF-8 string".)

UnsafePointer is most definitely an atomic type; it allows us to perform atomic operations on its values.

The NullableAtomic protocol captures the notion of a type whose atomic storage representation supports at least one bit pattern that doesn't correspond to any of its values. This is a useful axis for categorizing atomic types, and I'd like to model it with a named protocol. (I really dislike the idea of replacing this concept with conformance acrobatics.) "Nullable" seemed like a good enough word for this to me.

The compiler likes to call a generalized version of this concept "types with extra inhabitants". Compiler jargon rarely translates well to API names, though.
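
As a rough illustration of the "extra inhabitants" idea (this is not part of the pitched API), the existing standard library type MemoryLayout can show which optionals fit into the same storage as their wrapped type. The sketch below assumes nothing beyond the standard library; the checks are written as comparisons rather than literal byte counts, since exact sizes depend on the platform.

```swift
// Pointers have a spare bit pattern (null), so wrapping one in Optional
// does not grow the storage.
let pointerSize = MemoryLayout<UnsafeMutablePointer<Int>>.size
let optionalPointerSize = MemoryLayout<UnsafeMutablePointer<Int>?>.size

// Every bit pattern of Int is a valid value, so Optional<Int> needs
// additional storage to represent nil.
let intSize = MemoryLayout<Int>.size
let optionalIntSize = MemoryLayout<Int?>.size

print(pointerSize == optionalPointerSize) // true
print(optionalIntSize > intSize)          // true
```

Types in the first category are the ones the NullableAtomic discussion is about: their optional form still fits in a single atomically accessible word.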

If forced, I could live with OptionalAtomic, although this doesn't reflect that "optionalability" is a feature on top of regular atomicity. OptionableAtomic? NilRepresentableAtomic?

This is a great point! atomicStorage(for:) is not initializing a pointer, because that wouldn't fit into how ManagedBuffer (its primary user-level use case) works. So deinitializeAtomicStorage ought to work on the same abstraction level.

How about this?

protocol AtomicProtocol {
  /// Convert `value` to its atomic storage representation. Note that the act of
  /// conversion may have side effects (such as retaining a strong reference),
  /// so the returned value must be used to initialize exactly one atomic
  /// storage location.
  ///
  /// Each call to `setupAtomicStorage(for:)` must be paired with a call to
  /// `disposeAtomicStorage(_:)` to undo these potential side effects before
  /// deinitializing storage.
  ///
  /// Between setup and disposal, the memory location must only be accessed
  /// through the atomic operations exposed by this type.
  static func setupAtomicStorage(for value: __owned Self) -> AtomicStorage

  /// Dispose of an atomic storage value, extracting and returning the
  /// value it represents.
  ///
  /// The provided value must contain a valid atomic storage representation for
  /// this type, generated by `setupAtomicStorage(for:)` and optionally
  /// transformed by a series of atomic operations provided by this type.
  static func disposeAtomicStorage(_ storage: __owned AtomicStorage) -> Self
  ...
}

To illustrate how these are supposed to be used, consider this toy example:

extension AtomicProtocol {
   mutating func withTemporaryAtomicValue<R>(
      _ body: (UnsafePointerToAtomic<Self>) throws -> R
   ) rethrows -> R {
      var storage = Self.setupAtomicStorage(for: self)
      defer {
         self = Self.disposeAtomicStorage(storage)
      }
      return try withUnsafeMutablePointer(to: &storage) { ptr in
        try body(UnsafePointerToAtomic<Self>(at: ptr))
      }
   }
}

var count = 0
count.withTemporaryAtomicValue { atomic in
  DispatchQueue.concurrentPerform(iterations: 100) { _ in
    for _ in 0 ..< 1_000_000 {
      atomic.wrappingIncrement(ordering: .relaxed)
    }
  }
}
print(count) // Prints 100000000

A more likely production use case would use ManagedBuffer:

struct AtomicCounter {
  class Buffer: ManagedBuffer<Int.AtomicStorage, Void> {
    deinit {
      withUnsafeMutablePointerToHeader { header in
        _ = Int.disposeAtomicStorage(header.pointee)
      }
    }
  }
  let buffer: Buffer

  init() {
    buffer = Buffer.create(minimumCapacity: 0) { _ in
      Int.setupAtomicStorage(for: 0)
    } as! Buffer
  }

  private func _withAtomicPointer<R>(
    _ body: (UnsafePointerToAtomic<Int>) throws -> R
  ) rethrows -> R {
    try buffer.withUnsafeMutablePointerToHeader { header in
      try body(UnsafePointerToAtomic<Int>(at: header))
    }
  }

  func increment() {
    _withAtomicPointer { $0.wrappingIncrement(ordering: .relaxed) }
  }
  func load() -> Int {
    _withAtomicPointer { $0.load(ordering: .relaxed) }
  }
}

Note that Int is a primitive atomic type, so the setup/dispose calls above can be safely replaced by regular copy operations. This can and probably should be made explicit by the PrimitiveAtomic protocol above.

(setupAtomicStorage(for:)/disposeAtomicStorage(_:) are only meaningful for custom atomic types and optionals right now, so adding them may look like an unnecessary complication. However, these conversion methods will be critical when we get to implementing atomic strong references, whose conversion will likely involve new kinds of reference counting operations.)
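
To make the setup/dispose pairing concrete for a custom RawRepresentable type, here is a minimal sketch. It uses a hypothetical stand-in protocol (AtomicProtocolSketch) with only the two conversion requirements and no actual atomic operations, so it compiles without the pitched module; the State enum is likewise invented for illustration.

```swift
// Hypothetical stand-in for the pitched protocol. The method names mirror
// the discussion, but this is not the real module and has no atomic ops.
protocol AtomicProtocolSketch {
  associatedtype AtomicStorage
  static func setupAtomicStorage(for value: Self) -> AtomicStorage
  static func disposeAtomicStorage(_ storage: AtomicStorage) -> Self
}

// A primitive atomic type: its storage is the value itself, so both
// conversions are plain copies.
extension Int: AtomicProtocolSketch {
  static func setupAtomicStorage(for value: Int) -> Int { value }
  static func disposeAtomicStorage(_ storage: Int) -> Int { storage }
}

// A custom RawRepresentable type that forwards to its raw value's storage
// instead of implementing the representation directly.
enum State: Int, AtomicProtocolSketch {
  case idle, running, finished

  static func setupAtomicStorage(for value: State) -> Int {
    Int.setupAtomicStorage(for: value.rawValue)
  }
  static func disposeAtomicStorage(_ storage: Int) -> State {
    // Traps if the storage holds a raw value with no matching case.
    State(rawValue: Int.disposeAtomicStorage(storage))!
  }
}

let storage = State.setupAtomicStorage(for: .running)
let roundTripped = State.disposeAtomicStorage(storage)
print(roundTripped) // running
```

For these types the conversions have no side effects; the pairing requirement only starts to matter once a conversion does real work, such as retaining a strong reference.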

xwu · April 8, 2020, 1:31am

Nit: it'd be setUp (verb) as opposed to setup (noun)--but that sounds fine. To me at least, the opposite action of "set up" is "tear down."

lorentey · April 8, 2020, 1:35am

Oh, that's unfortunate. How about prepareAtomicStorage(for:) / disposeAtomicStorage()?

xwu · April 8, 2020, 1:42am

Why unfortunate?

lorentey · April 8, 2020, 1:52am

xUnit's setUp/tearDown methods have kind of crystallized these names to a very particular meaning.

Joe_Groff · April 8, 2020, 4:56am

If you don’t like the “conformance acrobatics”, there’s also the fact that this is a fundamental layout property of types that the compiler and runtime already know about every type. It’s wasteful at best to reencode it as a protocol, so maybe what we want is to expose a layout constraint akin to AnyObject for types with optimized null representations.

lorentey · April 8, 2020, 8:42am

That sounds interesting! However:

[Layering] I don't think Optional is the best place to issue atomic operations. We'd need to also add a way to retrieve the right inhabitant as a compile-time constant. And the implementation of the actual atomic operations seems a lot messier -- even if we artificially restrict this to types that can use a builtin Word as their raw representation.

extension Optional: AtomicProtocol 
where Wrapped: AtomicProtocol & 
               _HasOptimizedNullRepresentation & 
               _HasSingleWordCompatibleLayout {
  func atomicLoad(at ptr: UMP<Self>, ordering: AtomicLoadOrdering) -> Self {
    let value = Builtin.atomic_load_Word(ptr)
    if value == _optimizedNullRepresentation(for: Wrapped.self) { // ehh
      return nil
    }
    return Wrapped(_fromBuiltinWordValue: Int(value))! // ugh
  }
}

[Flexibility] How will this support Atomic<Optional<T>> where T: AnyObject? I expect atomic strong references will come with a custom double-wide atomic storage type. Will we be forced to model them as a standalone type, like lazy references?

I believe the current design handles this just fine*; we can simply add default implementations on NullableAtomic where Value: AnyObject.
```
class ListNode<Payload>: NullableAtomic { 
  var next: Atomic<ListNode?> = nil // note: overly optimistic syntax
  let payload: Payload
}
```
(*: This is a forward-looking statement. I'm particularly worried about implementing orderings -- it may well be that strong references won't implement the full AtomicProtocol, and will need to come with their own atomic wrapper.)
[Discoverability] I'm worried we'd lose the documentation benefits of having a protocol with searchable conformances. How will people figure out what types are optional-atomicable? In the current design, I expect NullableAtomic will naturally call attention to itself when people look at the AtomicProtocol hierarchy. The doc page for NullableAtomic will include a nice autogenerated list of all standard conforming types.
[Over-generalization] The set of standard "nullable" atomic [generic] types is finite and well known.
[Customizability] There is some value in allowing people to manually implement NullableAtomic for their custom types, by forwarding* operations to some other atomic type. (And, as seen above, we want to allow people to easily enable atomics support for their own atomic strong reference types.)

(*: Assuming we eventually get around to implementing constexpr-constrained args as a public language feature.)
[KISS] This problem doesn't require a fancy solution. Why complicate things?
[Timeliness] I would obviously be unhappy about postponing this feature until we get around to enriching the type system.

We have ~90 protocols in the core stdlib; this one wouldn't be the best, but I don't see how it would be a particularly bad one, either. Especially in a separate module...

If this is going to block progress, I'll of course remove NullableAtomic along with support for optional pointers/references. Users will still be able to limp along by manually modeling optionals with guard values. This wouldn't be great, but it's still vastly preferable to not having anything at all.
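
For reference, "manually modeling optionals with guard values" could look roughly like the sketch below. The GuardedIndex type and the choice of -1 as the guard are hypothetical; the encoded Int is what would be stored in an integer atomic, with the conversion done by hand at every load and store.

```swift
// Hypothetical sketch: represent an optional non-negative index in a plain
// Int by reserving -1 as the "no value" guard.
struct GuardedIndex {
  static let none = -1

  static func encode(_ index: Int?) -> Int {
    guard let index = index else { return none }
    precondition(index >= 0, "negative indices collide with the guard value")
    return index
  }

  static func decode(_ raw: Int) -> Int? {
    raw == none ? nil : raw
  }
}

let raw = GuardedIndex.encode(nil)
print(GuardedIndex.decode(raw) == nil)               // true
print(GuardedIndex.decode(GuardedIndex.encode(7))!)  // 7
```

The cost of this approach is that the guard value is a convention the type system knows nothing about, which is the gap NullableAtomic is meant to close.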