I'd like to finally pitch the third revision of @lorentey's original Low-Level Atomic Operations proposal after a few years. Since it was first proposed, reviewed, and revised (and another form was later accepted), the swift-atomics package has been making strides in designing and developing what an atomic API for the standard library could look like. Until very recently, we've had to resort to UnsafeAtomic and ManagedAtomic, neither of which is the final API we really wanted to propose. With the introduction of noncopyable structs and enums (and a little compiler magic), we can finally propose the memory-safe Atomic<Value> type we've wanted for years. We think this is a great time to propose these features for the Swift standard library.
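For concreteness, here is a minimal sketch of the pitched type in use. The method spellings below follow what eventually shipped in the Synchronization module and may differ slightly from the exact names in the pitch text:

```swift
import Synchronization

// The atomic is declared with `let`; its value still mutates through
// the atomic operations themselves.
let counter = Atomic<Int>(0)
counter.wrappingAdd(1, ordering: .relaxed)
let value = counter.load(ordering: .relaxed)
print(value) // 1
```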
Can you clarify whether Atomic<Value> stores its value inline? Storing it out-of-line would introduce cache coherency problems which don’t exist in C/C++. But I’m hoping that move-only types were the missing piece enabling inline storage.
Also, atomicMemoryFence seems like a misnomer to me, because unless the relaxed ordering is used, a fence affects all of a thread’s reads and writes. This goes beyond guaranteeing atomicity of a single value. Shortening the name to memoryFence would avoid over-specificity.
While atomicMemoryFence itself isn't an atomic operation, it does synchronize memory accesses, and it takes additional atomic operations to be useful. I do feel that naming this function something familiar (std::atomic_thread_fence) would benefit those adopting atomics in Swift. Although, to be fair, Rust calls this just fence (fence in std::sync::atomic - Rust), given that not a lot of other terminology uses the word "fence". We would probably use memoryFence, as you mentioned, rather than fence if others think the atomic prefix isn't necessary for this operation.
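To illustrate how a fence pairs with other atomic operations, here is a sketch of the classic publish/receive pattern, written against the Synchronization-module spellings (the helper names `publish`/`tryReceive` are mine, not from the pitch):

```swift
import Synchronization

let flag = Atomic<Bool>(false)
let data = Atomic<Int>(0)

func publish() {
    data.store(42, ordering: .relaxed)
    atomicMemoryFence(ordering: .releasing) // orders the store above...
    flag.store(true, ordering: .relaxed)
}

func tryReceive() -> Int? {
    guard flag.load(ordering: .relaxed) else { return nil }
    atomicMemoryFence(ordering: .acquiring) // ...with the load below
    return data.load(ordering: .relaxed)
}
```

The acquiring fence after the relaxed load synchronizes with the releasing fence before the relaxed store, which is why the fence has an "atomic flavor" even though it isn't itself an operation on a single value.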
Is the current PR in swift-atomics something that could be used to try the Atomic<Value> type? I'd love to start replacing my uses of ManagedAtomic with that type if that is what is going to move to the stdlib.
Yes, that PR can be used to try the new type out. Keep in mind that it doesn't include the protocol hierarchy changes, so using it generically might be a bit more awkward than what's proposed, but simple Atomic<Int> operations and such should all be the same.
Would it make sense to create custom diagnostics when a value of Atomic is stored in a variable binding? Because using a variable binding looks like a very minor change that may needlessly result in undefined behavior.
It would make sense to provide a diagnostic IMO, though in many cases, the existing "foo was declared var but was never mutated; consider making it a let" diagnostic would kick in already (albeit with language that is maybe not entirely accurate when talking about self-synchronizing types like atomics). To be clear, it isn't undefined behavior to put an atomic into a var, only unnecessarily hazardous at runtime when dynamic exclusivity enforcement is afoot. In cases where exclusivity can be dynamically enforced, it may be useful for Atomic to provide APIs that take advantage of it, such as safely allowing nonatomic accesses when we know we have exclusivity. It still wouldn't ever make sense to use those APIs in conjunction with dynamic exclusivity enforcement if we had them, though.
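A sketch of the pattern discussed above, using the Synchronization-module spellings (the `Stats` type is a made-up example): the atomic is bound with `let`, and the stored value still changes through its atomic operations, which is exactly why the existing diagnostic's wording is imprecise for self-synchronizing types:

```swift
import Synchronization

final class Stats {
    // `let`, even though the stored value mutates via atomic operations.
    let hits = Atomic<Int>(0)

    func record() {
        hits.wrappingAdd(1, ordering: .relaxed)
    }
}
```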
lorentey
(Karoy "Vector is not an Array type" Lorentey)
We originally considered providing this as an extension on AtomicUpdateOrdering:
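(The snippet itself didn't survive extraction here; the shape was presumably something like the following reconstruction, where the body is a guess at the obvious implementation.)

```swift
import Synchronization

extension AtomicUpdateOrdering {
    // Hypothetical reconstruction of the rejected alternative:
    // a fence as a member on the ordering value itself.
    public func memoryFence() {
        atomicMemoryFence(ordering: self)
    }
}
```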
However, this would've forced use sites to spell out the type name:
```swift
AtomicUpdateOrdering.acquiring.memoryFence()
```
This looks rather bad -- hence the top-level function.
The proposed name matches the one used in the swift-atomics package; changing it needs to be weighed against the (small, but not nonexistent) client-side pain of migrating existing usages.
The prefix "atomic" does carry weight here: atomic fences synchronize with specific atomic operations and other atomic fences, so they have an "atomic flavor". (The prefix helps distinguish them from strictly compiler-level fences (like signal fences) and non-portable, low-level memory barriers, neither of which are modeled here.)
The various *Int* types could still have typealias AtomicRepresentation = AtomicStorageMM members to portably represent the proper atomic storage to use for a value of that size on the current host platform. If we had a more fine-grained #if canImport(struct Synchronization.AtomicStorage128) compile-time conditional, that could also provide a way for code that needs to reproduce specific in-memory representations out of atomics to ensure that the appropriate atomic sizes are available.
That said, it seems highly undesirable to add low-level atomics to the default namespace of every Swift program, so we propose to place the atomic constructs in a new Standard Library module called Synchronization. Code that needs to use low-level atomics will need to explicitly import the new module:
```swift
import Synchronization
```
What does this mean for the implementation of the "core" standard library (i.e., the stuff in stdlib/public/core)?
If we wanted to ship a feature that used an atomic value, would we have to duplicate the library type in core or even fall back to C?
lorentey
We generally prefer to avoid polluting clients' namespaces with top-level typenames for auxiliary types. We have struct Dictionary.Index, not struct DictionaryIndex; we have struct String.Iterator, not struct StringIterator.
On the other hand, Int8, UInt8 and Bool ought to all share the same atomic representation (i.e., Int8.AtomicRepresentation == UInt8.AtomicRepresentation should hold true), and introducing a top-level AtomicStorage8 would let us have a nice dedicated name for it, instead of having to define it under one of these types.
An interesting analog is Duration: it is a top-level name for an associated type that is shared across multiple Clock types. Superficially, this seems like it would make a good argument for defining top-level storage types.
One important difference is that unlike Duration, the atomic storage types will not be truly standalone things: each storage will be tightly coupled to one particular stdlib type. In the current pitch the storage types provide no public members, but I think this is overly restrictive: to allow clients to create custom AtomicValue conformances, we'll want to provide public ways to convert values to/from atomic storage values:
```swift
@_alignment(1)
@frozen public struct AtomicStorage8: AtomicStorage {
  public init(_ value: UInt8) { ... }
  public consuming func dispose() -> UInt8 { ... }
  ...
}
```
(The API details aren't interesting. Perhaps we'll prefer to have public static func encode/decode methods to align closer to AtomicValue.)
Note how we'll need to select a specific representative type to convert from/to -- it's highly unlikely we'd want these to be generic member functions (and it's unclear what constraints they would use anyway).
Given that we'll have such tight coupling between, say, the 8-bit atomic representation and UInt8, I think it makes sense to nest the storage under the UInt8 type rather than defining it as a standalone top-level thing.
Int8 and Bool would semantically reuse UInt8's atomic representation rather than defining their own, so it makes sense that their AtomicRepresentations would be typealiases pointing to UInt8.AtomicRepresentation.
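The nesting described above could be sketched like this; `AtomicRepresentation`'s contents here are a stand-in stub, not the proposed storage layout:

```swift
extension UInt8 {
    // Stub standing in for the real 8-bit atomic storage type.
    public struct AtomicRepresentation { var bits: UInt8 }
}

// Int8 and Bool reuse UInt8's representation via typealiases.
extension Int8 { public typealias AtomicRepresentation = UInt8.AtomicRepresentation }
extension Bool { public typealias AtomicRepresentation = UInt8.AtomicRepresentation }
```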
Is there a particular reason such a hypothetical #if canImport(struct Synchronization.AtomicStorage128) syntax couldn't also support testing for nested types, as in #if canImport(struct DoubleWord.AtomicRepresentation)?
Since Atomic stores its value inline, I wonder what implications that has for the semantics of borrowing parameters. Passing an Atomic directly into a borrowing parameter would require us to commit to borrowing parameters semantically having a stable location in memory. I might be completely wrong, but I don't think even inout parameters currently have this guarantee.
As far as I know, Swift currently lacks "internal mutability", so it has the interesting property that any borrowed value can be represented by the value's bytes without any indirection, as long as the function doesn't "act like" it owns the value. For example, passing an object value as borrowed would (as far as I know) pass a pointer to the object storage, not a pointer to a pointer to the object storage.
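The situation in question can be sketched as follows (assuming the Synchronization-module spellings; the `increment` helper is mine): the borrowed atomic must refer to the caller's storage rather than a temporary copy for the atomic operation to be meaningful.

```swift
import Synchronization

// Atomic is noncopyable, so the parameter must declare ownership;
// `borrowing` leaves the value in place in the caller.
func increment(_ counter: borrowing Atomic<Int>) {
    counter.wrappingAdd(1, ordering: .relaxed)
}

let counter = Atomic<Int>(0)
increment(counter)
```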
The association with the integer types doesn't seem quite as strong to me here as in your other examples, since the atomic representation for UInt8 is usually going to be the atomic representation for all eight-bit trivial types, not only UInt8. It also seems to me that the proposal introduces the DoubleWord type primarily as a place to hang DoubleWord.AtomicRepresentation; it wouldn't need to exist at all if you could refer to AtomicStorage64/128 independent of a type (though I could see wanting a portable AtomicStorageDoubleWord type to refer to the current target's double-word atomic storage type).
I could imagine us eventually having type layout constraints that might allow this to be written more generically, saving the need to bitcast to an integer type as an intermediary:
```swift
extension AtomicStorage8 {
  public init<T>(_ value: T) where T: BitwiseCopyable, MemoryLayout<T>.size == 1 { ... }
  public consuming func dispose<T>() -> T where T: BitwiseCopyable, MemoryLayout<T>.size == 1 { ... }
}
```
Even though using a representative trivial type like IntNN is undoubtedly the easiest way to deliver this functionality in the near term, it doesn't strike me as being a fundamental tie to that representative type.
Internally, Swift has always had a distinction between types that don't ever need stable addresses and can always be passed around by registers, and types that need to have fixed addresses. This was necessary from the beginning for Objective-C interop because anything containing a weak reference needs to have a fixed memory location in order for the Objective-C runtime's weak reference table to point to it, and it's also necessary for C++ interop because C++ types can generally escape pointers to themselves whenever they want. Atomics internally use an attribute that also forces them into a "fixed address while being borrowed" state.
lorentey
This would be nice. For compare-exchange operations to work correctly, we need fully defined bit patterns with no ambiguity, though. It isn't safe to assume that padding bits, etc., inside trivial types will always be consistently cleared, is it?
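To make the concern concrete, here is a sketch using the Synchronization-module spellings: compare-exchange succeeds only if the stored bits equal the expected bits exactly, which is unambiguous for Int but not for a type with nondeterministic padding bits.

```swift
import Synchronization

let state = Atomic<Int>(1)

// Succeeds because Int has no padding: the bit pattern of `expected`
// fully determines the comparison.
let (exchanged, original) = state.compareExchange(
    expected: 1, desired: 2, ordering: .relaxed)
```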
A boring way to reduce the coupling would be to just add multiple overloads for the signed/unsigned integer variants.
```swift
extension AtomicStorage8 {
  public init(_ value: UInt8) { ... }
  public init(_ value: Int8) { ... }

  // Shame about the disambiguating argument...
  public consuming func dispose(as type: UInt8.Type) -> UInt8 { ... }
  public consuming func dispose(as type: Int8.Type) -> Int8 { ... }
}
```
Tangential, perhaps, but looking at the examples in the proposal I keep thinking "I wish the syntax were better" -- e.g., instead of:

```swift
counter.wrappingIncrement(ordering: .relaxed)
```

…just:

```swift
counter &+= 1 (ordering: .relaxed)
```
It's much closer to the natural syntax for the non-atomic version (i.e. it only adds the ordering specifier, and even that could potentially be optional if a default ordering is declared in the Atomic API). It also reduces the [human] parser load by reusing the well-defined existing operator for this operation, rather than introducing a bespoke method.
Not a big deal, obviously, but I think worth mentioning.
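For what it's worth, something close to this is expressible in today's Swift as a user-side convenience (this is a hypothetical overload, not part of the proposal, and it necessarily bakes in the ordering rather than taking it as a trailing specifier):

```swift
import Synchronization

// Hypothetical sugar: an overload of &+= on Atomic<Int> that
// hard-codes a relaxed ordering.
func &+= (counter: borrowing Atomic<Int>, operand: Int) {
    counter.wrappingAdd(operand, ordering: .relaxed)
}

let counter = Atomic<Int>(0)
counter &+= 1
```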
What's the point of atomicMemoryFence(ordering: .relaxed)? According to the proposal that will have no effect. Can it be made impossible to express that command syntactically, if it serves no purpose?