[Second Review] SE-0410: Atomics

za_creature · December 5, 2023, 7:21pm

I wholeheartedly agree.

Does it though? (fit within the language model)

It already breaks let immutability for structs. This, as Kyle already pointed out, is already unsafe. I understand the underlying nuance, and agree that this part of his argument still holds for atomics:

I also believe that adding mutating func to a ~Copyable type would make the compiler enforce the same guarantees that atomicity is supposed to provide, making a var of said atomic type functionally useless.

That is true, but it doesn't come for free. It requires an additional notion of "this struct has a fixed memory address" (aka volatile) which is inherited by its owner.

While the current library is guilty of using "LLVM magic" to produce the expected effect, it did not (for various reasons) expose its version of Atomic as a struct, nor did it expose the concept of volatility. Per your initial argument, it is in fact a sound abstraction, albeit one that suffers from unnecessary pointer dereferences.

Ultimately, my $0.02 is that Swift can and should do better in this regard. Not just regarding atomics, but also in supporting fixed-size arrays as structs. To my knowledge there is not yet a way to implement a cache-friendly, dynamically-sized linked list (7 items + 1 pointer to next per node).

Joe_Groff · December 5, 2023, 7:34pm

"Can be safely shared" is the fundamental guarantee a let provides, and atomics don't violate that, because atomic mutations are atomic and can occur even during shared access. It doesn't break the language model, it generalizes it for concurrency primitives: "immutable" and "mutable" really mean "shared" and "exclusive" in the general case. If you accept that, then sound APIs not only for atomics, but for locks and single-threaded shared mutable constructs can also be understood.

"volatile" is something else completely, a property of a load or store to mark it as not being eliminated by the compiler, which is independent of whether the operation is atomic or the storage is pinned to any particular address.

Fixed-sized inline storage will also come in hopefully the nearish future. The same underlying compiler mechanism could be used to implement it.

ksluder · December 5, 2023, 8:02pm

I think this gets to the heart of the issue. I would suspect most people think of let as a feature that allows the API designer to control mutability. @wadetregaskis made the point above that getting such control back requires API authors to write yet more wrappers, which Swift is already notorious for.

Joe_Groff · December 5, 2023, 8:13pm

But would it ever make sense to expose an atomic field directly as part of your type's API, without any sort of higher-level intermediate interface in front of it? If you give a client an Atomic<T>, then even before considering mutability vs immutability, you're expecting them to use the atomic API correctly. That is already a tall order, since they have to make sure the ordering of operations is correct, RMW operations and/or CAS loops are used appropriately, and generally be experts in lock-free programming to do so. Even if you were going to present a mutable interface to something backed by atomic storage, it seems likely that you would want to encapsulate the high-level operations that make sense as part of the public API.
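To illustrate the kind of encapsulation described above, here is a minimal sketch of a CAS loop hidden behind a higher-level operation. It assumes a toolchain and platform where the Synchronization module from this proposal is available; `SlotPool`, `tryAcquire`, `release`, and `inUse` are hypothetical names, and the chosen orderings are one plausible option rather than a recommendation.

```swift
import Synchronization

/// Hypothetical example type: hands out at most `capacity` slots.
/// Clients never see the atomic or have to pick a memory ordering.
final class SlotPool: Sendable {
  private let used = Atomic<Int>(0)
  let capacity: Int

  init(capacity: Int) { self.capacity = capacity }

  /// Tries to claim one slot; returns false if the pool is full.
  func tryAcquire() -> Bool {
    var current = used.load(ordering: .relaxed)
    while current < capacity {
      let (exchanged, original) = used.compareExchange(
        expected: current,
        desired: current + 1,
        ordering: .acquiringAndReleasing)
      if exchanged { return true }
      // Another thread changed the count first; retry against its value.
      current = original
    }
    return false
  }

  /// Gives one slot back. The caller is trusted to have acquired it.
  func release() {
    used.wrappingSubtract(1, ordering: .releasing)
  }

  /// Number of slots currently claimed.
  var inUse: Int { used.load(ordering: .acquiring) }
}

let pool = SlotPool(capacity: 2)
let firstClaim = pool.tryAcquire()
let secondClaim = pool.tryAcquire()
let thirdClaim = pool.tryAcquire()  // the pool is already full here
pool.release()
print(firstClaim, secondClaim, thirdClaim, pool.inUse)
```

The atomic stays private, so callers only see operations that are meaningful for the pool.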

wes1 · December 5, 2023, 8:21pm

All problems in computer science can be solved by another level of indirection, except for the problem of too many layers of indirection. [David Wheeler?]

As I see it, Atomics might end up actually clarifying this confusion, for lay and expert programmers, by being such a clear example.

Atomics by definition are values that can change at any time. There's no such thing as a constant atomic value. (The key feature is that they change entirely and for all shared accesses, according to the ordering.)

Atomics are only used by code teams; they're pointless except to share with other code. Changing the atomic they're using would be tantamount to leaving the team.

These are the two essential aspects of using atomics, and it's nice to have language support for both. The compiler will not optimize away repeated loads, and it should not permit team members to change their agreement (where discoverable).

Ordering is the variable aspect of atomics. The most common problem with atomics is that it's very easy with current APIs to set up mutual ordering semantics on the team that don't behave as intended. That presumably will be solved by library APIs wrapping a team per-use-case, preventing library clients from changing team semantics. Wrapping up teams like this would be impossible unless you could replicate the atomic while preserving team membership.

(Here in this post I've deliberately punted on the abstractions - reference, address, etc.; whatever they end up being, the most important thing is to explain atomics as they are used, since that explains the abstraction.)

za_creature · December 5, 2023, 8:24pm

Can't have it both ways.

There's a let vs var argument where you want consumers to see the status of some process (e.g. pending vs in progress vs complete) without allowing them to influence it in any way.

Joe_Groff · December 5, 2023, 8:28pm

In this case, you should be able to, since we're talking about two different layers: the implementer who needs to use atomics ought to have a safe-as-possible foundation for doing so, but they also should present an outward interface for their own clients that is as safe as possible, and it's unlikely that that latter interface involves directly exposing atomics in most circumstances, so some amount of "wrapping" seems likely to be par for the course. In this case I don't think that's anything specific to Swift, since even C and C++ APIs should usually never directly expose their atomics as API to their clients.

za_creature · December 5, 2023, 8:33pm

We are in violent agreement. This is exactly the case of public private(set) var status: Atomic<T>, which cannot be implemented in the current proposal because all Atomic functions are borrowing.

Joe_Groff · December 5, 2023, 8:46pm

Part of what I'm saying is, even that wouldn't really encapsulate the Atomic well enough, since even if that restricted the use of status to calling load(), the client would have to know what ordering of load to perform. It seems to me that you would generally always make the atomic itself private, and expose a computed property to load it with the appropriate ordering. That pattern could be encapsulated by a property wrapper.
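As a rough sketch of that pattern (a private atomic plus a computed property that picks the ordering), assuming the Synchronization module is available: `Job`, `JobStatus`, `start`, and `finish` are hypothetical names chosen to mirror the pending / in progress / complete example earlier in the thread.

```swift
import Synchronization

/// Hypothetical status enum; a raw-value enum can opt in to atomic storage.
enum JobStatus: UInt8, AtomicRepresentable {
  case pending, inProgress, complete
}

/// Hypothetical type: clients can observe the status but cannot store to it,
/// and they never have to choose a memory ordering.
final class Job: Sendable {
  private let storage = Atomic<JobStatus>(.pending)

  /// Read-only view for clients; the ordering is decided here, once.
  var status: JobStatus { storage.load(ordering: .acquiring) }

  // Transitions are driven by the owner of the job.
  func start() { storage.store(.inProgress, ordering: .releasing) }
  func finish() { storage.store(.complete, ordering: .releasing) }
}

let job = Job()
let before = job.status
job.start()
let during = job.status
job.finish()
let after = job.status
print(before, during, after)
```

In a real module, `start` and `finish` would presumably be non-public, so only `status` would be visible to clients.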

jrose · December 5, 2023, 8:50pm

Continuing the violence: there’s a reason computed get-only properties are still declared as var, and it’s because they change.

I really think the only way out of this is to accept that an Atomic’s value is its identity, and the thing you load or store from it is a contained, indirect value that happens to use inline memory.

za_creature · December 5, 2023, 8:53pm

I agree, in line with the case against default memory orderings. Which is why I feel that this still should be an inherently Unsafe feature, and that Swift is not yet ready for it overall.

Joe_Groff · December 5, 2023, 8:58pm

We use "unsafe" to mean "you get undefined behavior if you use this wrong". If you use atomics wrong, you can definitely get wildly difficult to predict behavior, but it's well-defined within the parameters of the memory model.

Rust has used pretty much the exact same model for atomics proposed here for almost a decade now. It would be interesting to hear from people's experience in that language community whether there is lasting confusion or problems caused by the model there.

ksluder · December 5, 2023, 9:04pm

Since structs don’t have identity, I think accepting this argument implies that Atomic should be a class. That would be perfectly in keeping with how lets of class type behave, and I think it would be great if non-copyable Swift classes could have inline struct-like storage.

Joe_Groff · December 5, 2023, 9:24pm

Noncopyable values do have some notion of identity by virtue of being unique, albeit not one that's permanently tied to a particular memory address the way classes currently are. Noncopyable-reference classes may be interesting but there's an open question of how many properties of copyable-reference classes they would carry: do they have a permanent stable memory address? An isa pointer for inheritance? An object header for binary compatibility with copyable classes? Atomics don't really need any of those other class abilities, so they still seem most at home as structs. If a noncopyable class is expected to have a stable address like a copyable class does, that would prevent it from being stored inline in its owner, since the owner can change over a value's lifetime.

jrose · December 5, 2023, 9:25pm

This is just not true. UnsafePointer’s value is the identity of the memory it points to; as is the value of the example struct I showed above. The thing that ultimately determines whether a type has value semantics or not is the type.

wadetregaskis · December 5, 2023, 9:57pm

It's not, though, is it? Because:

- We have mutable reference types under let, and mutable values are often unsafe to share, in the most fundamental sense of "from this code to this other code" if not also from one concurrency domain to another.
- Your type can be tied to the current task / thread / whatever in some way (e.g. rely on thread-local storage).

Isn't it Sendable that's supposed to represent "can be safely shared" in the limited definition of "transmitted across concurrency domains"? Last I checked I can't just have a let foo: SomeClass and pass that across concurrency domains - SomeClass has to be Sendable, and as such it's really orthogonal whether it's let or var.

It's not just supposition, it's literally how let is defined in Swift at the outset, as I noted earlier. It is of course a lie, or at least very misleading, but:

- Lots of beginner and weekend programmers have likely internalised that definition anyway.
- Even experienced programmers have possibly rationalised it as meaning something other than what this proposal is assuming, such as:
  - It works as stated for value types, and for reference types it refers only to the reference (pointer) or "variable itself", not the value it points to. i.e. it's int* const foo, not const int* foo.
  - It applies only to the part of the variable stored inline, which is (nominally) the whole of a value type but only the pointer to a reference type.
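The "int* const" reading can be seen in a small sketch using only existing language behavior; `Counter` and `Point` are hypothetical names used for illustration.

```swift
/// Hypothetical reference type.
final class Counter {
  var count = 0
}

/// Hypothetical value type.
struct Point {
  var x = 0
}

let counter = Counter()
counter.count += 1   // allowed: `let` fixes the reference, not the object it points to

let point = Point()
// point.x = 1       // rejected by the compiler: `point` is a `let` constant

print(counter.count, point.x)
```

Under `let`, the class instance remains mutable through its reference, while the struct is frozen in its entirety.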

To be clear, I don't know what the best approach forward here is. Maybe the fundamental definition of var and let should in fact change - maybe even retroactively. But I'm concerned that the magnitude of such a change is seemingly being largely dismissed, or at least overlooked.

Re-educating the Swift community at large on such fundamental concepts & keywords is not a trivial undertaking. It also has ramifications for people transitioning to Swift from other languages where historically it's been the case that 'constant' actually means constant.

Also, if you redefine away let from meaning constant, how then will we define actual constants?

Probably…? Although perhaps not in the sense of 'API' you're using, I think. Think instead about between functions more generally, not just those that are formally an API boundary. e.g. you have some helper function that works with an atomic value (as a parameter). Nominally you'd control mutability in the parameter definition - read-only if unadorned, mutable if inout. i.e. the function parameter equivalents to let and var, respectively. Well, for value types.
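A short sketch of the helper-function case, assuming the Synchronization module is available: the parameter is not inout, yet the helper still changes the stored value, which is the situation being discussed. `recordHit` and `countTwoHits` are hypothetical names.

```swift
import Synchronization

/// Hypothetical helper. The parameter is borrowed (shared), not inout,
/// but the atomic update is still permitted.
func recordHit(_ hits: borrowing Atomic<Int>) {
  hits.wrappingAdd(1, ordering: .relaxed)
}

/// Hypothetical driver that owns the atomic as a `let`.
func countTwoHits() -> Int {
  let hits = Atomic<Int>(0)
  recordHit(hits)
  recordHit(hits)
  return hits.load(ordering: .relaxed)
}

let total = countTwoHits()
print(total)
```

The signature alone therefore does not tell a reader whether the helper only loads or also stores.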

I don't think they're a great ambassador for such a dramatic change in core concepts, because use of atomics will be relatively rare. I think they'd add to the confusion, not clarify it, because people would encounter them rarely enough to think of them as weird exceptions to expectations, not redefiners or clarifiers thereof.

Indeed. And that harks back to a discussion from the first review about whether there could / should be built-in wrappers for that, so you can vend, e.g., an AtomicValue.LoadAcquiring view. The rationale for not doing that made sense to me at the time - that it is a lot more boilerplate and general work for the atomics library - but then as you keep pointing out, it does seem like that kind of wrapper is going to be needed in a lot of cases, so punting it from the stdlib to myriad 3rd party codebases is perhaps not a net win.

True in the last part at least, which leads to…

That makes a lot of sense to me. It definitely feels more true to their nature to make Atomics classes. And having the general ability to inline classes - e.g. some @StorageInlinable attribute - would be useful in other contexts too.

But, I know that's come up before and I don't recall seeing any tangible technical roadmap for doing that, given the issues with heap storage vs reference types…?

Although classes can't have value semantics, can they?

I think Kyle's point was that Atomic seems to fit reference semantics much better than value semantics, and therefore is more natural as a class than a struct. Even if it's possible to make a struct behave a lot like a class in certain respects.

I don't want this to be overlooked. Alas I personally don't have sufficient Rust experience, or know anyone that does. Do we have an actual way to obtain that information from the Rust community?

ksluder · December 5, 2023, 10:07pm

I’m not sure what is untrue about what I said. Surely you don’t dispute that structs lack identity? That’s kind of their defining characteristic. Two UnsafePointers with the same value are indistinguishable, meaning they cannot be told apart, whereas two objects that are equal can still be distinguished by their ObjectIdentifier.

za_creature · December 5, 2023, 10:23pm

Eh, it gets a bit murkier in the context of ~Copyable, see SE-0390: noncopyable type `deinit`s, mutation, and accidental recursion (with apologies, it was my first forum post)

jrose · December 5, 2023, 10:23pm

It’s true that Swift says all class instances have identity, but it is not the case that all non-classes lack it. The simplest example is closures, of course, which can be demonstrated to not have value semantics if their captures can be mutated. But if a struct is non-Copyable and non-Equatable, what makes it “indistinguishable” from another instance of the same struct? If it owns a file handle, you’ll see different seek positions; if it owns a memory buffer, you’ll see different contents; if it owns a class instance, you’ll get reference semantics unless the implementor of the struct specifically avoids it. Which is how String and Array work, as we know.
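The closure case can be demonstrated with a minimal sketch; `makeCounter` is a hypothetical helper, and nothing beyond the standard language is assumed.

```swift
/// Hypothetical factory: each call creates a fresh captured `count`.
func makeCounter() -> () -> Int {
  var count = 0
  return {
    count += 1
    return count
  }
}

let original = makeCounter()
let alias = original          // copies the closure value, but shares the captured variable
let separate = makeCounter()  // a distinct capture, and so a distinct "identity"

let first = original()   // 1
let second = alias()     // 2: the alias observes the mutation made through `original`
let third = separate()   // 1: unaffected by the other two calls

print(first, second, third)
```

Copying the closure did not copy its state, so the two names behave like references to one thing.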

Once structs can have deinits, they have some notion of identity, because the deinit will be called at most once per instance.
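A small sketch of that deinit behavior, assuming a compiler with noncopyable type support (SE-0390); `EventLog`, `Token`, and `useAndDrop` are hypothetical names.

```swift
/// Hypothetical recorder used to observe when the deinit runs.
final class EventLog {
  var events: [String] = []
}

/// Hypothetical noncopyable struct. There is only ever one owner of each
/// instance, so its deinit cannot run more than once.
struct Token: ~Copyable {
  let name: String
  let log: EventLog

  deinit {
    log.events.append("deinit \(name)")
  }
}

func useAndDrop(_ log: EventLog) {
  let token = Token(name: "a", log: log)
  log.events.append("using \(token.name)")
  // `token` is destroyed once its lifetime ends, which triggers the deinit.
}

let eventLog = EventLog()
useAndDrop(eventLog)
print(eventLog.events)
```

Because the struct cannot be copied, each construction is paired with a single destruction, which is what gives an instance a notion of identity.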

You are not the implementor of Atomic; you are a client. From the outside, Atomic presents the interface of a container. That container does not change as you manipulate the value inside it.

EDIT: we could call it InlineAtomicBox or something to make this clearer, I suppose. I don’t personally think that’s worth it but it’s a possibility.

lorentey · December 6, 2023, 3:57am

I don't agree that avoiding atomic operations on let variables would be a worthy goal. I think allowing (or requiring) atomic operations to be declared mutating would be particularly bad -- mutating operations are deeply tied to exclusive access in Swift, and having them sometimes not be subject to the Law of Exclusivity would be deeply confusing.

SE-0282 introduced the idea of "atomic access", so that the swift-atomics package could exist without flagrantly violating Swift's memory model:

Two accesses to the same variable aren't allowed to overlap unless both accesses are reads or both accesses are atomic.

That proposal went on to define atomic access as a call to one of the standard C atomic functions. This was arguably unnecessary, as these functions take pointer arguments, and Swift does not attempt to enforce the Law of Exclusivity over pointees of unsafe pointers -- but I thought it was very important for the memory model to admit that under certain narrow conditions, concurrent read-write access is sometimes okay. I pushed for landing this superficial change instead of rolling back the entire proposal, for two reasons:

- I felt that the name "read access" is a terrible label for atomic operations, and (more importantly),
- I wanted to avoid allowing regular reads to overlap with atomic updates.

Strictly speaking, the atomic operations in this proposal are not "atomic accesses" in the sense of SE-0282 -- they are categorized as "read accesses". Tearing reads are still avoided, but this is implemented and enforced on the library level -- struct Atomic simply never exposes any operation that engages in conventional nonatomic reads. I think this is a workable library-level solution, but I suspect we'll eventually want to re-adjust the Law of Exclusivity to allow these operations to continue to be categorized as "atomic access", distinct from "read access".

Arguably, the idea to disallow var variables of type Atomic is scratching at the same issue -- i.e., trying to make sure that the Law of Exclusivity will not interfere with the use of atomics.

We will need to expose the mechanism to opt out of inout accesses as public, and we will need to do it soon -- I expect people will want to create custom lock-free data structures, and they will want them to have the same goodies as struct Atomic, including the ability to forbid inout use.

We can certainly continue to model the operations of all these concurrent types as borrowing, leaving it up to the type author to avoid race conditions. (As it is going to be their responsibility anyway. struct Atomic works hard to avoid unnecessary pain, but building a correct concurrent type is still going to be extraordinarily tricky.)

The question is: are we happy to keep using this approach in the long term?

In this scheme, the syntax to declare a concurrent type seems quite unwieldy/haphazard:

```swift
/// A dynamically sized, concurrent, insert-only, lock-free dictionary of
/// cached values.
@_staticExclusiveOnly // FIXME: Expose as public
struct MyConcurrentCache<Key: Hashable & Sendable, Value: Sendable>:
  ~Copyable,
  @unchecked Sendable
{
  init(initialCapacity: Int)

  deinit

  /// Atomically set a value stored in the cache, if its key isn't already
  /// present. Returns the value associated with the key after the
  /// operation. If the cache is at capacity, inserting a new value will
  /// allocate more storage.
  borrowing func setValueIfNotPresent(_ value: Value, forKey key: Key) -> Value

  /// Atomically query the cache.
  borrowing func lookup(_ key: Key) -> Value?
}
```

The struct keyword is no longer providing readers useful guidance on what these types actually are -- their true nature instead arises from a combination of weird-looking protocol conformances (~Copyable/@unchecked Sendable) and load-bearing attributes. Is this really the best syntax to declare such types? How many more knobs are we going to end up adding to this pile?

(Then again, having to define such types will hopefully remain rare, so perhaps not having more direct syntax is going to be just fine.)

The way I see it, non-copyable structs are not value types, and cannot ever behave like value types. Non-copyability is a rejection of the idea of value semantics, including all of its numerous benefits (as well as its drawbacks).

The direct precursor of struct Atomic<T> in this proposal is class ManagedAtomic<T> in the swift-atomics package, which is a reference type. (And the direct precursor to that is std::atomic in the C++ standard library.)

These types all try to implement the same core abstraction; the only difference is that struct Atomic uses recently introduced Swift features to replace ManagedAtomic's refcounted heap allocation with inline-allocated storage -- moving it closer in spirit to std::atomic, and finally arriving at the place we wanted to reach all along.

Take this piece of code, using swift-atomics. It has been working since Swift 5.1:

```swift
import Atomics
let number = ManagedAtomic(42)
print(number.load(ordering: .relaxed)) // ⟹ 42
number.store(23, ordering: .relaxed)
print(number.load(ordering: .relaxed)) // ⟹ 23
```

This code is modifying an atomic value stored in a let variable. The store operation is not declared mutating (indeed it cannot be, as class types do not allow mutating members), but it is still able to change the "value" of the atomic. This is not a problem; it's just how Swift works.

Changing ManagedAtomic to Atomic makes no difference -- the code still works, and it still does the same thing:

```swift
import Synchronization
let number = Atomic(42)
print(number.load(ordering: .relaxed)) // ⟹ 42
number.store(23, ordering: .relaxed)
print(number.load(ordering: .relaxed)) // ⟹ 23
```

Changing to Atomic gets us a very different (much better!) low-level implementation, but fundamentally, the code still means the same thing -- it is just expressed in a new way that fits better with how we want/expect these things to work. Most use cases of ManagedAtomic do not depend on its copyability (indeed, copying ManagedAtomic references is generally undesired, and usually only happens by accident), so for most users, struct Atomic will be a drop-in replacement for class ManagedAtomic. Semantically, both types are containers, holding a single value at a time.

To me, non-copyable structs therefore feel much closer in spirit to reference types than values.

But an instance of struct Atomic is neither a value, nor a reference -- it just is. (It certainly has a value, but its value is slippery, and it is very distinct from the thing itself.)

The way I see it, non-copyable types give us a more direct way to model "objects". It feels like they attempt to directly model the things that reference types are referencing, elevating them to the status of first-class citizens. They cut through the intricate conventions of Swift's class types (reference counting, isa pointers, subclassing, virtual method tables, yada yada yada): they throw them all out, replacing them with new constraints that can be (mostly) enforced at compile time. (Of course, there were excellent reasons behind all that runtime complexity; getting rid of it reintroduces ancient problems that have been long resolved -- but now we can solve them in a fashionable new way, hopefully this time avoiding the problems that classes did not solve!)

I don't see how this would work. Atomicity is a property of functions. I don't really understand what an "atomic variable" would be. On the other hand, to me it may perhaps make sense to declare specific types as atomic, making an en bloc guarantee about their operations.