It would be worth looking into allowing atomics in those kinds of contexts, yeah! The only thing I'm unsure of is Atomic<Value>, because the compiler could complain that it requires metadata initialization at runtime. We force-inline everything anyway (even in debug), so in theory there should never be any runtime metadata needed to use atomics (unless you're using a generic atomic, in which case: no, very bad, don't do this).
Finally getting caught up on this thread and the latest proposal. In my current use of atomics I find myself frequently implementing throwing wrappers around the base calls. Looking at the proposal I'm wondering if I wouldn't prefer a throwing version of Atomic's implementation of:
public borrowing func storeIfNilThenLoad(
  _ desired: consuming Instance
) -> Instance
As constructed, if the caller cares whether the store succeeded or failed, it must capture the object identity of the desired Instance and then compare it against the identity of the returned Instance (which I suspect I'm going to forget to do pretty much every time).
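For illustration, a throwing wrapper along the lines described might look like the sketch below. This is not proposed API: the error type and the `storeIfNilOrThrow` name are hypothetical, and the extension assumes the `AtomicLazyReference` type that declares `storeIfNilThenLoad(_:)`.

```swift
// Hypothetical sketch: a throwing convenience over the proposed
// storeIfNilThenLoad(_:). The identity comparison below is exactly the
// step that's easy to forget when calling the base API directly.
enum LazyInitError: Error {
  case alreadyInitialized
}

extension AtomicLazyReference {
  borrowing func storeIfNilOrThrow(_ desired: consuming Instance) throws {
    let desiredRef = desired          // keep a reference for the identity check
    let winner = storeIfNilThenLoad(desired)
    if winner !== desiredRef {        // another thread won the race
      throw LazyInitError.alreadyInitialized
    }
  }
}
```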
That's great. Right now I'm relying on looking at the disassembly and making sure there are no Swift runtime calls. I'm no assembly expert, so this would be a breath of fresh air.
Oh, interesting. TBH I had only been expecting the load/store operations to be annotated. That would be great.
Yes, makes sense. Remarkably though (to me at least), the performance annotations seem to be able to pick up when a generic type has been or will be specialized (or, more importantly, when it hasn't) and warn you if that's the case, too. Another reason I'm keen to get access to them.
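For context, the annotations being discussed are the experimental performance annotations (underscored attributes, subject to change). A rough sketch of how one might be applied, with the `Atomic` API spelled per the proposal:

```swift
// Experimental performance annotations. @_noLocks (and @_noAllocation)
// make the compiler diagnose any path that could lock, allocate, or
// require runtime generic metadata -- including an unspecialized generic
// call, which is how the "has this been specialized?" warning arises.
@_noLocks
func spinUntilPositive(_ counter: borrowing Atomic<Int>) -> Int {
  var value = counter.load(ordering: .relaxed)
  while value <= 0 {
    value = counter.load(ordering: .acquiring)
  }
  return value
}
```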
Hi all, after some further consideration I've decided to back out of the overload approach that I recently discussed here: Atomics - #35 by Alejandro. This approach has some really nice benefits, but ultimately the number of overloads was just too vast an API surface, which is one of the same arguments against ordering views. There are other considerations we looked into when deciding this, which I've written up as a new alternative considered. The switch pitfall can still be encountered if someone writes a custom atomic operation that takes a memory ordering and the body of that function isn't visible to outside modules, even when the ordering argument is a constant expression; however, we feel this is a niche case, and in all other cases the switch gets eliminated. Another big factor for us is back deployability of new orderings, which the constant expression approach handles far better than ordering overloads (because with overloads the orderings are types themselves). I'll post the new section here, which describes some of our thinking:
Memory Orderings as Overloads
Another promising alternative was the idea to model each ordering as a separate type and have overloads for the various atomic operations.
struct AtomicMemoryOrdering {
  struct Relaxed {
    static var relaxed: Self { get }
  }

  struct Acquiring {
    static var acquiring: Self { get }
  }

  ...
}
extension Atomic where Value.AtomicRepresentation == AtomicIntNNStorage {
  func load(ordering: AtomicMemoryOrdering.Relaxed) -> Value {...}
  func load(ordering: AtomicMemoryOrdering.Acquiring) -> Value {...}
  ...
}
This approach shares a lot of the same benefits as views, but the biggest reason to consider it is that the switch statement problem we described earlier just doesn't exist anymore. There is no switch statement! The overload always resolves to a single atomic operation + ordering + storage, meaning there's no question about what to compile the operation down to. However, this is just another flavor of views, in that the API surface explodes, especially with double-ordering operations.
There are 5 storage types, and we define the primitive atomic operations in extensions on all of them. For the constant expression approach, a single-ordering operation needs 5 (storage) * 1 (ordering parameter) = 5 overloads, and a double-ordering operation needs 5 (storage) * 1 (update ordering) * 1 (load ordering) = 5. The overload solution instead depends on the number of orderings supported for each specific operation: for single-ordering loads that's 5 (storage) * 3 (orderings) = 15 load overloads, and for the double-ordering compare and exchange it's 5 (storage) * 5 (update orderings) * 3 (load orderings) = 75 overloads.
| | Overloads | Constant Expressions |
|---|---|---|
| Overload Resolution | Very bad | Not so bad |
| API Documentation | Very bad (but can be fixed!) | Not so bad (but can be fixed!) |
| Custom Atomic Operations | Requires users to define multiple overloads for their operations. | Allows users to define a single entrypoint that takes a constant ordering and passes that to the primitive atomic operations. |
| Back Deployable New Orderings | Almost impossible unless we defined the ordering types in C because types in Swift must come with availability. | Can easily be done because the orderings are static property getters that we can back deploy. |
The same argument that views create a very vast API surface applies to the overloads as well, which helped us determine that the constant expression approach is still superior.
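To make the "single entrypoint" row in the table concrete: under the constant expression approach, a user-defined operation takes one ordering parameter and forwards it to the primitives. The sketch below is illustrative; `incrementIfPositive` is a made-up operation, and `weakCompareExchange` follows the general shape of the proposal's compare-exchange API.

```swift
// Sketch: one entrypoint per operation instead of one overload per
// ordering. With ordering overloads, this function would need a separate
// overload for every supported ordering type.
extension Atomic where Value == Int {
  borrowing func incrementIfPositive(
    ordering: AtomicUpdateOrdering
  ) -> Bool {
    var current = load(ordering: .relaxed)
    while current > 0 {
      let (exchanged, original) = weakCompareExchange(
        expected: current,
        desired: current + 1,
        ordering: ordering
      )
      if exchanged { return true }
      current = original    // lost the race; retry with the fresh value
    }
    return false
  }
}
```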
How will the const-ness be enforced? Will it use the experimental _const keyword? It seems like what we want, but last I checked, that keyword is nowhere close to usable, even failing simple cases like
_const let foo = 1 // okay...
_const let bar = foo // nope!
so it's definitely not adequate for this situation. On the other hand, I'm wary of it using some new thing as a special case in the compiler.
Constness would be enforced by a special compiler semantic attribute:
@_semantics("atomics.requires_constant_orderings")
It is not a new thing; in fact, this support was added to the compiler a couple of years ago, and it is what the swift-atomics package uses as well. Semantically the _const parameter is what we want for this sort of thing, but we're not going to use it for these right now.
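For illustration, here is roughly the shape this takes on an atomic operation (a simplified sketch; the body is elided and the exact signature may differ from swift-atomics):

```swift
// The semantic attribute asks the compiler to diagnose call sites where
// the ordering argument is not a compile-time constant, so the internal
// switch over orderings can always be folded to one atomic instruction.
@_semantics("atomics.requires_constant_orderings")
@_transparent
public borrowing func load(
  ordering: AtomicLoadOrdering
) -> Value {
  // switch over `ordering`, elided
  ...
}
```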
Would it be possible to switch over to using const instead of these custom semantics in an ABI-stable way, when we finally have const in the language?
It should be very possible to switch over!
Couple minor nits:

- Can we nest the atomic storage types in a namespace enum?
- `DoubleWord` might be ambiguous. On x86 and in the Windows API, dwords are 32-bit values.
- `loadThenMin`, `loadThenMax`, `minThenLoad`, and `maxThenLoad` read a little bit strangely to me because I don't think of "min" and "max" as mutating operations. What do people think of the slightly more verbose "SetTo{Min/Max}"? In other words, the names would be `loadThenSetToMin`, `setToMinThenLoad`.

Again, super tiny nits.
Apologies for a relatively long and radical suggestion from an outsider... but in case it helps...
TL;DR: I suggest configuring operations in design order and wrapping them for use by function shape, handling the modify complications with enums alone.
I find myself a little blinded when all the possible variants are flattened to method names.
I was intrigued that the proposal ordering types map to operation shapes:
- Load: () -> T
- Store: (T)
- Update:
  - CAS: (T, T) -> (T, Bool)
  - RMW: (T) -> (T)
Except for RMW, each shape has only one operation.
With atomics, I make mistakes thinking in the usual order data -> operation -> modifier (here, ordering), so I prefer thinking ordering -> shape -> operation -> data (hence the intrigue).
Also, once I configure a team of atomics collaborators for different thread contexts, I mostly want them to only do what the team decided they can do.
So my inclination is to design the API to guide configuration and to wrap up operations. That isolates the complexity to configuration phase (and that mostly to RMW).
Here's some (working but obviously mock) sample code:
let loadAcquire = AtomicLoadOrdering.acquiring
let address = Address<UInt64>()
// Basic: configure and use, to load value
let loadAddress = loadAcquire.load(address)
let value = loadAddress.load()
// Modify, the most complicated call
// Configure producer/consumer team
let release = AtomicUpdateOrdering.releasing
let addProduceGet = release.modify(address, .add, .getAfter)
let getConsume = loadAcquire.load(address)
// Use (imagining these calls in different contexts)
var updatedAndPublishedValue = addProduceGet.modify(23)
var readValue = getConsume.load()
Typically a library would wrap the complicated calls, but even the full sequence for the most complex call is fairly clear:
AtomicUpdateOrdering.releasing.modify(address, .add, .getAfter).modify(23)
To see how an API like this would feel, you can play with the mock API (~200 lines):
Here's some elaboration on why this might be a good thing (with apologies for repetition)....
One mistake I make is to think of atomics as values. They are (fixed) addresses whose value can change at any time, and their key function is to share the address/reference across team members in different contexts. To me it is clearer to call them not AtomicValue<intType> but Address<intType>.
Another mistake is to take that address and try different things at each use-site.
Atomics code has a team-configuration phase where the decisions are made about how collaborators in different threads/contexts should use the atomic, and the use-phase where the atomic semantics don't change (often library vs client code).
In the configuration phase, the team agrees first on the type of ordering and the encoding of the value (and its shared address), then each usage is tailored for the collaborator role.
The key issue with atomics is not their runtime speed but getting the semantics correct, particularly since they can seem to work until they don't in ways that are hard to detect or understand.
Given all that, my API goals would be:
- Ensure a correct configuration phase
- Avoid inviting variations at the use phase
- Otherwise permit any legal semantics
- Avoid serious performance penalties
For configuration it should help to guide users to think of order first, and then what they want to put/get (the operation shape) for each member. So that's why it's interesting that the proposal ordering types map to operation shapes:
- Load: () -> T
- Store: (T)
- Update:
  - CAS: (T, T) -> (T, Bool)
  - RMW: (T) -> (T)
Except for RMW, each shape has only one operation. These non-RMW operations may be considered "basic".
A configured operation is this shape plus the atomic address, e.g.,
struct Load<T> {
  let address: Address<T>
  let order: AtomicLoadOrdering

  func load() -> T {
    ...
  }
}
The RMW shape has 10 operations (add, min, ...) and peeking (before, after). All the complications are in the RMW shape (perhaps following a principle of progressive disclosure).
@frozen struct Modify<T> {
  let address: Address<T>
  let order: AtomicUpdateOrdering
  let op: ModifyOp
  let peek: BeforeOrAfter
  public let switchValue: Int

  func modify(_ using: T) -> T {
    // switchValue is a careful init-time combination of the parameters
    switch switchValue {
    ...
    }
  }
}
You can offer (or only permit?) these as a function of the order type:
extension AtomicStoreOrdering {
  func store<T>(_ value: Address<T>) -> Store<T> {
    Store<T>(address: value, order: self)
  }
}
That's it for the API to get the decision pattern order -> shape -> operation, and to get the sample code above.
I find it nice that any complexity is directly modeled as a choice at the time it's relevant, instead of all possible choices being flattened into one catalog of methods. I also believe that the value itself is relatively independent of the ordering-operation decision, so it helps to preserve that distinction; in the code above, Address can (mostly) change T with no other changes.
At the use-site, I can imagine wrapping up the collaboration-design phase such that users can just ask for common sharing patterns even relatively independently of the type or data structures. (Go got a lot of mileage out of just channels and select.)
The goal is an API where you're led through choices in the right order, and you can't make mistakes.
Limitations...
Obviously a difficulty is that the modify function involves a switch over 3 factors (order, op, and peek), not to mention the size and sign implicit in the other shape-operations. A common switch value can be calculated in the initializer and made public alongside the function so the function can be @inlinable @inline(__always). For this common value I prototyped encoding the current factors (plus consume ordering, and even size and sign), and it seems tractable (waving hand).
Also broken or not modeled in the mock API:

- Confusion from the factory names matching the operation names: `..load(address).load()`
- Generics vs gyb implementation for `Address<intType>` or `...<storeType>`
- Overflow: wrapping vs balking variants
- Some RMW operations are only relevant for unsigned types
  - This can be put in the ordering factories with constraints on the address types, if the operation enumerations and modify wrapper are similarly forked. In this case, changing from an unsigned Address to a signed one would (correctly) balk that the operations are no longer permitted.
Sure, I'm curious to hear what others think about this. A natural name for me would be `AtomicStorage.Int8` etc., but I don't feel too strongly about this distinction if others do feel strongly. `BinaryInteger` is already sort of setting a precedent here with `Words`:

/// A type that represents the words of a binary integer.
///
/// The `Words` type must conform to the `RandomAccessCollection` protocol
/// with an `Element` type of `UInt` and `Index` type of `Int`.
So I actually got some similar feedback for this, but the suggestion was to essentially merge the `loadThenX` and `xThenLoad` functions to look something like this:
@discardableResult
public func min(
  with operand: Int,
  ordering: AtomicUpdateOrdering
) -> (oldValue: Int, newValue: Int)
... for all specialized integer and boolean operations
which would let someone choose specifically what value they wanted:
let (oldValue, _) = atomic.min(with: 10, ordering: .relaxed)
// or
let old = atomic.min(with: 10, ordering: .relaxed).oldValue
This has some advantages: the surface area is much smaller, the names read more like the operation, and the compiler can fully optimize out the newValue calculation if you never need it. wrappingIncrement and wrappingDecrement no longer need to be special-cased and can just lean on the @discardableResult of the actual operation returning the old value and new value. I'm curious to hear what others think about this approach!
This is an interesting approach to atomics! I think this approach would feel too different to folks who are already used to using atomics in other languages however. C, C++, Rust, and others model atomics all sort of similarly and I think we should do the same (in a Swifty way!). A lot of folks are also used to the precedents swift-atomics has been setting for years now and to do something drastically different than that I think would be unfortunate. The APIs proposed here should feel very familiar to folks who have used atomics either in other languages or with the swift-atomics package.
Thanks for your quick response. I agree that factories-wrapping is too big of a change for those already used to swift-atomics and the low-level C++-derived API (and it's also far too late for this proposal). It's mainly targeted at new adopters. (I would appreciate access to the Builtins to validate the approach for a different audience.)
As a severable suggestion, is anyone convinced that AtomicValue should be renamed Address? It has nothing like Swift's much-discussed value semantics, and some aspects of reference semantics.
Calling it just an Address would also be somewhat misleading, though, since although it may not behave like a value type, the storage for the atomic is established inline within the type or function context that the atomic is declared. And atomics are still movable, so when a noncopyable value that owns an atomic inside of it is moved, the atomic moves with it.
I've gone ahead and made the change with merging loadThenX and xThenLoad with the specialized integer and boolean operations to the proposal.
On platforms with double word atomics, should Unsafe*BufferPointer and Optional<Unsafe*BufferPointer> conform to AtomicValue as well?
Aren't the optional types for those 9/17 bytes large (exceeding double-word capacity)? Making Unsafe*BufferPointer conform, though, might make sense.
Pointers have at least 4,096 extra inhabitants so there should still be room to add a layer of optionality in there, but if we don't take advantage of it, then allowing the non-optional variants to be atomic still seems useful.
I'm curious, how does Swift cram base address, length, and pointee type into a doubleword?
Buffer pointers don't store pointee type, it's just a pair of base address and length. The "stored" pointee type is something we can directly grab from the generic parameter in the typed buffer pointers.
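Concretely, on a 64-bit platform a typed buffer pointer occupies two words. A small sketch using the standard library's MemoryLayout (sizes assume current 64-bit layout):

```swift
// UnsafeBufferPointer<T> stores only (base address, count); T exists
// purely at the type level, so it contributes nothing to the layout.
print(MemoryLayout<UnsafeBufferPointer<UInt64>>.size)  // 16 on 64-bit
print(MemoryLayout<Int>.size)                          // 8: the count
print(MemoryLayout<UnsafeRawPointer?>.size)            // 8: the optional base
```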