SE-0282: Low-Level Atomic Operations

Chris_Lattner3 · April 15, 2020, 6:13pm

Right, but the proposal doesn't support that. This is why the static members are important, as mentioned in the pitch phase. Once those are in place, the actual UnsafeAtomic interface as proposed is just unconventional sugar on top of them.

-Chris

Joe_Groff · April 15, 2020, 6:18pm

It isn't clear to me from that linked post what static methods would enable that isn't in the proposal already. Could you give an example?

Chris_Lattner3 · April 15, 2020, 6:27pm

The proposal is a wrapper around an UnsafePointer which provides instances methods that pass the pointer into static methods on the AtomicProtocol type. This has several issues, not least of which that it disables the & sugar we have for pointer promotions.

I agree that it is possible to use the proposal to achieve the equivalent of static members, but it is not better in any way to use this if you don't want to have a materialized UnsafePointer sitting around.

For example, try comparing the AtomicCounter class as shown in the proposal with the AtomicCounter in the post I linked above. The one I linked to is more efficient by avoiding a needless indirection, allocation, etc. Please try writing the equivalently efficient AtomicCounter class using the proposal - it is strictly worse.

I think it is important to underscore why this matters so much to me: atomics are all about efficiency (otherwise you'd use simpler and safer things like mutexes). However, atomics being fast completely depends on cache locality and avoiding false sharing. The design of UnsafeAtomic pushed aggressively towards separately malloced atomic values, which undermines a lot of the value of them in the first place.

Joe_Groff · April 15, 2020, 6:40pm

I see. Here's the code in question:

// warning: invalid code! do not c&p
class AtomicCounter {
  private var _value = 0

  init() {
    UnsafeAtomic<Int>.initialize(address: &_value, to: 0)
  }

  deinit {
    _value.destroy(address: &_value)
  }

  func increment() {
    UnsafeAtomic<Int>.wrappingIncrement(address: &_value, by: 1, ordering: .relaxed)
  }

  func get() -> Int {
    UnsafeAtomic<Int>.load(address: &_value, ordering: .relaxed)
  }
}

The fundamental language problem, independent of this proposal, is that it isn't well-defined to take the address of an ivar as a free-floating pointer, because ivars are expected to be governed by the exclusivity model and only be read or written by well-behaved formal accesses. &_value is an inout access, even when it undergoes a pointer conversion, and so exerts exclusivity on the storage, so it can't be used for atomic storage where you intentionally want to accommodate simultaneous accesses. The fact that the UnsafeAtomic types don't work with the & sugar is a feature, not a bug. If you want to avoid extra indirection in a class instance in the language as it exists today, you would have to use ManagedBuffer's tail allocation to store your atomic variables; you could then provide an UnsafeAtomic property that converts the pointer to the ManagedBuffer's tail allocation. Static vs. instance methods for UnsafeAtomic's interface therefore do not fundamentally improve the potential efficiency of code using atomics today, because atomics must live in non-property raw memory.

Providing freely-addressable raw storage inside instances is a separate problem we need to solve. Once we have that, then you would be able to form an UnsafeAtomic from a pointer to a raw storage ivar and avoid indirection that way:

class AtomicCounter {
  // Strawman syntax for a "raw" variable with stable pointers
  @raw private var _counterStorage: Int = 0

  private var counter: UnsafeAtomic<Int> {
    // Strawman raw properties would not be subject to our usual constraints
    // about taking their address
    return UnsafeAtomic(&_counterStorage)
  }

  func increment() { counter.wrappingIncrement(by: 1, ordering: .relaxed) }

  func get() -> Int { return counter.load(ordering: .relaxed) }
}

In previous discussions about this, it seemed promising that a property wrapper could be used to abstract away the underlying raw storage so you only have to declare the surface property of UnsafeAtomic type.

Chris_Lattner3 · April 15, 2020, 7:05pm

Right, I'm aware of the issue with stored properties, it was just a very simple example to illustrate the point and I expect the issue to be solved at some point in the future. There are lots of ways in Swift-today to get nailed down addresses in memory, including mallocing slabs of memory and overlaying structure types on them in various ways.

Do you have a specific concern with this reimagining this thing as an AtomicReference class? That would be safer and provide the same functionality for the "malloc a pointer" case which the proposal seems to be prioritizing.

-Chris

Michael_Gottesman · April 15, 2020, 7:05pm

I would recommend using an initializer instead of a create method, because Swift consistently uses initializers for these things. Instead of: let atomic = UnsafeAtomic<Value>.create(initialValue: 0) , it would be better to have: let atomic = UnsafeAtomic<Value>(createInitialValue: 0) or whatever.

I don't want to give a full review here, but I just want to say that I agree with Chris here. Using static methods here feels very un-Swifty to me. If we can use inits instead of static factory methods we 100% should! That being said, not going to comment further.

Joe_Groff · April 15, 2020, 7:24pm

I don't see this proposal as primarily being about prioritizing "malloc a pointer"; maybe the convenience APIs that do the combined malloc-and-initialize and destroy-and-free are a distraction. As I see it, the core value of the proposed API is in giving you atomic operations once you have a pointer to some appropriately-sized and -aligned raw memory to operate on. Malloc happens to be one of the few reliable ways to do that today, but we should fix that! Having a safe AtomicReference class would be great, but having it be the lowest level of functionality the language provides would be baking in a lot more object infrastructure than is strictly necessary. A class wouldn't be as easy to apply to raw properties in classes, moveonly structs, explicit stack allocations, or other mechanisms we may provide for allocating raw memory in the future.

glessard · April 15, 2020, 8:38pm

I think the pattern as proposed works well in analogy to Unsafe*Pointer.allocate().

That does look like a better name to me.

salutis · April 16, 2020, 12:10am

When a type conforms to Equatable, it is ... well, equatable. When it conforms to AtomicProtocol, it is ... what? I think we should find a better name. (The name does not need to be short which makes the task easier.)

lorentey · April 16, 2020, 3:54am

There is no technical reason against an atomic Bool -- I think we should definitely provide it. Like double-wide atomics and Float/Double, I'd like to defer this for a future proposal.

toggle() roughly corresponds to the atomic XOR operation on integers, and I expect it will see a similar amount of use in practice. I don't really see any reason against adding such methods -- in the case of Bool, they won't even implement any protocol requirements, so they are really rather cheap.

I think we only want Atomic<Bool> for now. There will be plenty of room to add AtomicFlag if and when the need arises.

Classes would be viable replacements for the create/destroy methods. I wouldn't really consider this to be a significant design shift -- it's merely adding an additional set of precursor types on top of the existing UnsafeAtomic and UnsafeAtomicLazyReference.

These classes will get functionally obsoleted with the advent of Atomic<Value>, but that's largely* true for UnsafeAtomic, too. I don't think this is a problem; library additions are cheap, and unlike language features, we have a clear (and proven) process to deprecate things we want to replace/supersede later. (As long as we reserve the good names for the eventual proper constructs, of course.)

(*: I think it's likely that move-only atomics will continue to use the unsafe variants within their implementation, to gain access to the address of their storage ivar.)

If we feel we must have a sigil for referenciness, then I'd much rather go with something that leads to less awkward names than AtomicLazyReferenceReference.

How about this?

class ManagedAtomic<Value: AtomicValue> { ... }
class ManagedAtomicLazyReference<Instance: AnyObject> { ... }

The lowest-level static methods aren't provided as public API, and I see no reason to change this. Builders of high-performance libraries will use the UnsafeAtomic constructs that are tailored specifically for that use case.

Exposing the atomic storage types and associated pointer-based operations would encourage mixed atomic/nonatomic use, encourage potentially non-portable assumptions about the storage representation, and encourage the use of invalid inout-to-pointer conversions. It also seems far more complicated to operate on this level than UnsafeAtomic, especially when it comes to integrating such APIs with an eventual design for addressable ivars.

I don't think there is any reason to add low-level atomics the default namespace of every Swift program.

AtomicValue would be a great name for the protocol! Let's switch to that.

(This is another example where losing a ceremonial suffix leads to a (to me) obviously better name.)

UnsafeAtomicLazyReference is the single Swift-level atomic abstraction we already have years of experience using within the Standard Library. (Not in this nice form, of course.) It also seems clearly useful outside of the stdlib, for the same sort of use cases. We do deeply understand how it works, as well as the limitations/gotchas it comes with. It is not trivial to reimplement it from scratch, and its use case (memoizing costly calculations inside high-performance value types) feels frequent enough to deserve a standard solution.

It's true that UnsafeAtomicLazyReference isn't using dedicated llvm intrinsics, but neither do atomic pointers and atomic unmanaged references. I don't find this a convincing reason to get rid of any of these constructs.

We could of course subset out lazy references, but I find that the extra type is an important reminder that Atomic<Value> won't be able to cover all atomic constructs. Having more than one concrete example for these is particularly helpful when weighing alternative naming ideas for these precursor types. (I would hate it if we would paint ourselves into a regrettable naming corner in case we decided to e.g. model some double-wide atomic constructs (such as versioned pointers) with standalone types.)

So, even if we want to move atomic lazy references to a followup proposal, I suggest to keep them around until we get to a naming scheme that works for them.

UnsafeAtomic was specifically designed to allow building efficient atomic constructs in the language we have today. Its operations are not in any way less efficient than "direct" pointer-based API -- both compile to the same code.

Unfortunately, atomicLoadWord(at: &ivar, ordering: .relaxed) simply wouldn't work -- at least not without fundamentally changing the meaning of the inout-to-pointer conversion. Like Joe says, the fact that UnsafeAtomic discourages such use is an important feature of this proposal.

With UnsafeAtomic and UnsafeAtomicLazyReference, I am deliberately establishing an API pattern that is rigid enough that we'll be able to directly plug it into a new language feature for addressable ivars.

If we do decide to add atomic classes, then they will obviate the need for create and destroy, so we can simply remove them from the proposal.

(create/destroy is modeled after allocate/deallocate, but those wouldn't be good names for them, since create/destroy also cover initialization and deinitialization. Getting rid of them is a nice way to eliminate this problem.)

lorentey · April 16, 2020, 4:15am

Note that like the eventual move-only atomics, these class types would simply forward operations to the corresponding unsafe struct.

public class ManagedAtomic<Value: AtomicValue> {
  internal typealias _Storage = UnsafeAtomic<Value>.Storage
  internal var _storage: _Storage

  public init(_ initialValue: Value) {
    _storage = _Storage(initialValue)
  }
  deinit {
    _storage.dispose()
  }
  internal func _withUnsafeAtomic<R>(
    _ body: (UnsafeAtomic<Value>) throws -> R
  ) rethrows -> R {
    // Note: this is *not* done through an inout-to-pointer conversion.
    let p = _getUnsafePointerToStoredProperties(self)
      .assumingMemoryBound(to: _Storage.self)
    try body(UnsafeAtomic<Value>(at: p))
  }
  public func load(ordering: AtomicLoadOrdering) -> Value {
    _withUnsafeAtomic { $0.load(ordering: ordering) }
  }
  // ...
}

I hope this will (soon) be generalized to a language feature to access the address of specially annotated ivars. One idea is to make this look like (or compose with) property wrappers:

public class ManagedAtomic<Value: AtomicValue> {
  @pinned var _value: UnsafeAtomic<Value>

  init(_ initialValue: Value) {
    $_value = .init(initialValue)
  }
  deinit {
    $_value.dispose()
  }
  func load(ordering: AtomicLoadOrdering) -> Value {
    _value.load(ordering: ordering)
  }
  // ...
}

Then we'd eventually replace ManagedAtomic with a move-only type with the exact same implementation:

moveonly struct Atomic<Value: AtomicValue> {
  @pinned var _value: UnsafeAtomic<Value>

  init(_ initialValue: Value) {
    $_value = .init(initialValue)
  }
  deinit {
    $_value.dispose()
  }
  func load(ordering: AtomicLoadOrdering) -> Value {
    _value.load(ordering: ordering)
  }
  // ...
}

kiel · April 16, 2020, 4:44am

I'm never needing such low level constructs and defer to the experienced or the experts.

I appreciate the drive to make the APIs as consistent in style with other Swift APIs because if I ever need to dive down to this level of detail, the API will not be a stumbling block to the learning curve I'd have to climb to learn the concepts these APIs implement. This also improves its teachability. I appreciate the lengths taken to design types that guide people to do things properly. I enjoyed the details in the proposal. Well done, @lorentey!

I'd like to say that the pitch and review was done really well -- this feature was an excellent example of the evolution process working.

gribozavr · April 16, 2020, 4:03pm

I would very much like to have atomics and other features related to concurrency and parallelism available in Swift. However, I can't fully support the proposal in the current form -- see feedback below.

Without a doubt, it is. We are building Swift as a language that can fit many domains including low-level programming, OS kernels, high-performance software.

See my feedback below where I point out how this proposal goes against the current Swift API design patterns.

I have used atomics in C, C++, and Java. The proposal provides low-level, unsafe functionality, so it feels much closer to C atomics than C++ or Java atomics. That is not a critique -- we need both unsafe, low-level atomic operations for library builders, as well as high-level safe functionality. We have to start somewhere, and this proposal would be the first step in developing this area.

I heavily contributed to the discussion, and re-read the final version proposal again before writing this review.

Design Feedback

(Some of the feedback that I wanted to provide is already posted to this review thread by @Chris_Lattner3, so I'm only posting new information and not repeating the points that Chris already brought up.)

The name of `UnsafeAtomic` and its intended role

Throughout the review, I have been advocating for the name UnsafePointerToAtomic instead of UnsafeAtomic, emphasizing that we have a pointer to a value that is accessed atomically. The response has been:

These new generic types emphatically aren't pointers -- they merely happen to contain a pointer value in their internal representation. It's far more instructive to think of these types as unsafe precursors to corresponding non-copiable constructs, allowing us to fully define and start using the functionality they will eventually provide even before non-copiable types become available in the language.
(from the " Alternative Names for UnsafeAtomic Types" section in the proposal)

I think that the disagreement over the name is caused by the disagreement over the intended role of the UnsafeAtomic type. We did not discuss the intended role of the type during the discussion phase of the proposal -- we discussed naming and API without agreeing first on what the type should represent. I think it is critical to get a firm agreement on that first. Let me try to explore some alternatives here.

One option is that UnsafeAtomic<T> is an unsafe variant of the safe Atomic<T> type that we can't implement today because move-only types are not available. The unsafety comes from the need to allocate/deallocate memory manually, otherwise UnsafeAtomic<T> is identical to Atomic<T>. As far as I understand, this option is supported by the proposal.

Another option is that UnsafeAtomic<T> is merely a API performs atomic operations on a memory location. (This is my preference.)

Why do I think that it is better to present UnsafeAtomic<T> as only a way to perform atomic operations and nothing else?

As the proposal says, it focuses on providing low-level atomic primitives. Therefore, we should focus on providing the primitives instead of trying to imitate the future safe API. So what is the right primitive? Modern hardware models atomics as a property of memory access, not as a property of a memory location. We have no indication that future hardware will use a different model (for example, a model where atomicity is based on memory locations). Therefore, the primitive is a memory access, and in our Swift atomics API we should provide a way to perform arbitrary atomic operations on arbitrary memory locations.

UnsafeAtomic<T> can't promise easy migration to safe atomics like the proposal promises. [1] Why not? UnsafeAtomic<T> has maybe-owned semantics: the pointee can be either allocated with .create and destroyed through .destroy, or UnsafeAtomic<T> can be constructed to point to any existing memory location. In my opinion, such maybe-owned semantics can't be easily migrated to strong ownership without sophisticated static analysis.

[1] "We expect code using these unsafe precursors will be easily upgradeable to their eventual non-copiable variants when it becomes possible to implement those."

In future (but not now), we should allow mixing atomic and non-atomic accesses, therefore we need an API for that. UnsafePointerToAtomic<T> could easily play that role; for UnsafeAtomic<T> it would be weird because it tries to own the pointed-to memory.

Furthermore, we already have an established word in the standard library API design that means "a maybe-owned reference" -- the word is "Pointer", as in UnsafeMutablePointer.

I'm not inherently opposed to providing an unsafe emulation of a safe atomic type that we will have in future, however, I think it is outside of the scope of the low-level atomics proposal.

Deferring double-width atomics to a future proposal

I am concerned that the proposal is not tackling double-width atomics, at least with a sketch. My concern is that some aspects of the API we are looking at right now will not work for double-width values, or would be awkward to use. I would like this point to be explored in more depth to have more confidence in the low-level API being the right one. It could be also argued that double-width atomics are primitive low-level operations and therefore should be included in this proposal.

For example, the proposal could say that there is an expectation that the standard library adds an Int128 type that would be the storage type for double-width atomics; or that there is an expectation that tuples of types that conform to AtomicProtocol also conform to AtomicProtocol -- or something else. Maybe we should implement those enhancements before this proposal is accepted, if they are as easy as adding an Int128 type.

Insufficient community discussion of `UnsafeAtomicLazyReference<Instance>`

I feel like there was not a sufficient amount of discussion about the role, use cases, and the API of UnsafeAtomicLazyReference. I suggest to hold a separate focused discussion -- maybe as a separate forum thread. I think UnsafeAtomicLazyReference could still stay in the same proposal, because it justifies the need to have a separate storage type in atomics.

Inconsistent naming of `UnsafeAtomicLazyReference.storeIfNil`

UnsafeAtomicLazyReference.storeIfNil is sometimes called storeIfNilThenLoad in the proposal text. Since others have not noticed this inconsistency so far, I think it is another signal that reviewers are not paying enough attention to this type.

Atomic memory ordering types are structs instead of enums

The proposal says that atomic memory ordering types (for example, AtomicLoadOrdering) are structs instead of enums for performance reasons. Structs can be declared frozen to fix the layout and triviality, but there is no equivalent for enums that does not also freeze the set of enumerators.

Using structs here is not a huge loss, but it makes API express the intent less clearly than it could if it used an enum. During the discussion I suggested that we could extend the Swift language, possibly with a stdlib-only attribute for now, that would add the necessary capability, but I think my suggestion was lost:

More guidance about usage of `AtomicProtocol` in user code

I think we need to provide more guidance/warnings/linting about using AtomicProtocol is user code. The proposal already says that the requirements are private implementation details. It also says that it is OK to implement AtomicProtocol for RawRepresentable types. Other usages are in a grey area. For example, is it OK to use AtomicProtocol in user-written generic constraints? I think the answer should be no, since that would allow using UnsafeAtomic<T> from a generic context with a dependent type, preventing the switch statements over memory ordering from being folded.

Composability of `UnsafeAtomic<T>` with generics

I agree that a large number of users' needs would be satisfied by allowing to use UnsafeAtomic<T> with a statically-specified type. But it seems to me that some users would like to use UnsafeAtomic<T> inside generics with a dependent type. That is currently prohibited due to the need to fold the switch statements over memory ordering at compile time.

For example, I would like to implement something like this -- an atomic counter with a user-configurable underlying integer type that allows the user to choose the right integer size for their problem:

struct AtomicCounter<IntegerTy: AtomicInteger> {
  private var _value: UnsafeAtomic<IntegerTy>
  func increment()
  func decrement()
  var currentValue: IntegerTy { get }
}

One could argue that it is possible to implement this counter with a few wrapper types, so I personally would be OK if we decided that we won't support using UnsafeAtomic in a generic context. However, I didn't see composability with generics being discussed sufficiently while the proposal was drafted.

`UnsafeAtomic<Bool>`

Should UnsafeAtomic<Bool> work? Sounds like an unintentional omission; but it could be also considered non-primitive and proposed later, just like floating-point atomics are deferred.

saeta · April 16, 2020, 4:12pm

One key use-case for atomics as a work-stealing queue (aka thread pool). A good design would use a per-thread, fixed-sized deque, which I would probably write in Swift as a ManagedBuffer with tail-allocated elements. Within each element, you need an atomic state variable to determine whether there's a work item in that slot or not, and if it is being executed by another thread and/or has completed.

In the proposed design (without static methods), I now need to store 2 types, one is the storage for the state itself, and secondly a pointer to the adjacent state:

// Do not use; just for illustrative purposes!
private struct MyObject {
  var stateStorage: UnsafeAtomic<MyWorkItemState>.Storage
  var atomicState: UnsafeAtomic<MyWorkItemState> // Points to `stateStorage`
  var closureOfWorkItem: () -> Void
}

enum MyWorkItemState { ... }

If static methods were exposed, then I can avoid allocating any memory for the atomicState and could instead directly operate on stateStorage. (Given that I know I've pinned this object in memory (through use of ManagedBuffer), this should be allowed, supported, and safe.)

Additionally, to get a really good implementation, I do want the ability to do a non-atomic (tear-able) read of stateStorage before doing a CAS operation.

I hope that this provides clearer motivation for these advanced features (i.e. static methods, and tear-able reads). (In particular, this is an advanced, unsafe abstraction; please do make it powerful enough to actually fulfill the advanced, performance-sensitive needs!*)

*Atomics are all about low-level performance. If the API is designed such that it does not admit a zero-cost abstraction, then users will likely continue to use C's atomics instead (which undermines the value of the language feature itself).

Thank you very much for the hard work on this; I really appreciate it!

Joe_Groff · April 16, 2020, 4:15pm

saeta:

In the proposed design (without static methods), I now need to store 2 types, one is the storage for the state itself, and secondly a pointer to the adjacent state:
// Do not use; just for illustrative purposes!
private struct MyObject {
  var stateStorage: UnsafeAtomic<MyWorkItemState>.Storage
  var atomicState: UnsafeAtomic<MyWorkItemState> // Points to `stateStorage`
  var closureOfWorkItem: () -> Void
}

enum MyWorkItemState { ... }
If static methods were exposed, then I can avoid allocating any memory for the atomicState and could instead directly operate on stateStorage . (Given that I know I've pinned this object in memory (through use of ManagedBuffer ), this should be allowed, supported, and safe.)

You don't need to store the UnsafeAtomic pointer separately (and in fact, it wouldn't make sense for a struct to try to hold a pointer to itself; structs don't have addresses!) If you're using a ManagedBuffer, you can provide a computed property that constructs the UnsafeAtomic from the pointer to the tail allocation and operate directly on stateStorage that way.

saeta · April 16, 2020, 4:17pm

Will that be guaranteed to be zero cost? If so, that should be included in the proposal (and sounds great to me). If not, then we need an alternate solution, of which static methods are one possibility.

Joe_Groff · April 16, 2020, 4:18pm

Yes, it would be the same as any other bit-casting operation.

saeta · April 16, 2020, 4:55pm

@gribozavr and I had a brief side conversation and he said something that I thought was pretty insightful (reproduced with permission):

Joe_Groff · April 16, 2020, 5:00pm

I agree, it's definitely a pointer in nature.

lorentey · April 16, 2020, 5:27pm

I'm kind of hurt I have to spell this out, but to be clear, there was nothing unintentional about the omission of UnsafeAtomic<Bool>. This proposal is already by far the longest ever, and extra sections on bool/float operations would make it even worse. These are sideshows we can discuss at our leisure; we know they won't introduce any complications. (Unlike Optional, which we had to tackle during the pitch phase.)

SE-0282: Low-Level Atomic Operations

Design Feedback

The name of UnsafeAtomic and its intended role

Deferring double-width atomics to a future proposal

Insufficient community discussion of UnsafeAtomicLazyReference<Instance>

Inconsistent naming of UnsafeAtomicLazyReference.storeIfNil