[Second Review] SE-0410: Atomics

Migrations are fun, but standard libraries are forever.

The names should be the best they can be, not hindered by happenstance of history.

27 Likes

Regarding AtomicValue, we're already having to introduce the concept of an "atomic representation" in this proposal, why not just mirror the naming of some existing types and call it AtomicRepresentable?

4 Likes

How about ManagedAtomic as a beginner-friendly class broadly identical with the current package and UncheckedAtomic as the inline structure?

This would not conflict with the existing UnsafeAtomic (though some confusion will likely still be caused) while emphasizing on documentation writers and potential users what the checks are and why it's important to maintain some invariants when using the Unchecked flavor (e.g. at least mark them as private and provide wrappers). It also keeps the door open for future revisions, once the let vs var issue is resolved.

The Checked vs Unsafe confusion was unfortunately already opened up in the continuations API, but I fear that too is now a happenstance of history. As an engineering effort, Swift is at most the sum of its parts, and one can only hope that this conversation belongs to the said set.

If the plan is to include higher level data structures such as queues in the future, the package name Synchronization feels unsuitable to me. Regrettably, I don’t have a better suggestion, just wanted to highlight this.

I believe the future additions to Synchronisation mentioned throughout the thread were things like mutexes and locks (akin to rust’s std::sync), queues would probably belong somewhere else if added (I agree with you that they’re not really synchronisation, more of just a useful data structure).

2 Likes

The language steering group discussed this review today, and we'd like to extend the review to January 2, 2024 with some follow-up questions regarding topics raised during this review discussion:

Sendability of Atomic<Unsafe*Pointer<T>>>

In the discussion about whether Atomic should be Sendable and for what types, the point was raised that the fundamental point of atomics is to share information across threads, and that making atomics non-sendable is more likely to only impose additional busywork on any use of them than to provide developers with any useful feedback about unsafety they don't already have by virtue of using atomics in the first place. This concern potentially extends even to types that can be used as AtomicValues but which aren't normally Sendable, particularly the Unsafe*Pointer family of types. Storing pointers in atomics is an important use case for lock free data structures, since by their nature it is unlikely that a safer alternative to plain pointers would readily fit the complex lifetime requirements of these structures. The fact that the types are already Unsafe is already a strong indicator that code is outside the normal bounds of Swift's safety guarantees, and we generally try to avoid introducing additional unnecessary roadblocks when a developer has clearly already opted into unsafe constructs, so the Language Steering Group sees merit in the argument that Atomic<Unsafe*Pointer<>> ought to be Sendable in order to enable its use without imposing additional ceremony. However, we have some related concerns:

  • We could say that Atomic is unconditionally Sendable regardless of the contained type. However, this would be a commitment that Atomic will be Sendable for all AtomicValue-conforming types present and future that either the standard library or Swift developers may want to implement. Are there types beyond the Unsafe*Pointer family that could be legitimately non-Sendable, legitimately usable as an AtomicValue, for which Atomic<> of that type would create an otherwise-avoidable safety hole if Atomic was unconditionally Sendable? If we want to prevent that, while still allowing pointers to be used as atomics, would we introduce another marker protocol to include all Sendable types and the Unsafe*Pointer types, and require that as a condition for AtomicValue?

    @_marker protocol SendableWhenAtomic {}
    @_marker protocol Sendable: SendableWhenAtomic {}
    extension UnsafePointer: SendableWhenAtomic {}
    extension UnsafeMutablePointer: SendableWhenAtomic {}
    /*etc.*/
    
    protocol AtomicValue: SendableWhenAtomic {...}
    

    Or, alternatively, would we follow Rust's lead in having AtomicPointer be a different type altogether?

  • A developer using Atomic<UnsafePointer<>> has already accepted the risks of working with atomics and passing pointers across threads. Beyond the pointer itself, though, is the data being pointed at, and the type of that data may not be safely Sendable. Should pointers to non-Sendable types be usable as AtomicValues, and if so, should Atomic<UnsafePointer<NonSendable>> type be sendable? Does the "you're already using unsafe constructs" argument extend from the pointer itself to the data being pointed to?

Naming things

The Language Steering Group does not foresee name changes as being an insurmountable issue for developers migrating their code from the swift-atomics package, so although the ease of that migration is a valid concern, we would like to prioritize establishing the best name possible for these new APIs. The Steering Group does agree on the following points:

  • The Atomic<T> type should be named Atomic with no additional qualifiers.
  • WordPair belongs in the same explicitly-imported module as the atomic APIs since the Steering Group believes its use should be tied to supporting double-wide atomics. Even though making it explicitly imported somewhat mitigates concerns about code completion pollution when developers reach for Double, we believe that WordPair is still a better name than DoubleWord on its own merits. WordPair emphasizes that the value is a pair of not-necessarily-related values, and the phrase "double word" has other common meanings, particularly its use on Windows to mean 32-bit values specifically due to the platform's history.

However, there are some naming issues we would like to continue receiving feedback on:

The AtomicValue protocol

Although the Language Steering Group had suggested the AtomicValue protocol be renamed AtomicWrappable as a result of the first review, the proposal authors pushed back on this recommendation, and the AtomicWrappable name also received a lot of negative feedback from reviewers so far, so we are looking for continued feedback on the name of this protocol.

The Synchronization module

The Steering Group also does not yet have consensus on the naming of the module that should hold these new APIs. We don't believe that this proposal needs to establish a consistent naming convention for modules that are part of the standard library but require explicit imports, since other proposals have already introduced standard library modules like RegexBuilder that don't follow a convention. Nonetheless, reviewers expressed dissatisfaction with the proposed name Synchronization. The name Atomics is already used by the swift-atomics module, and although language tools such as typealiases exist to cope with renaming declarations within a module, the language's tools for dealing with module name collisions are not as robust, so introducing a collision would be problematic for migrating existing code. We would like to get more feedback on the module name.

6 Likes

Huh. Wow!

I have a really difficult time understanding how the team arrived at this statement, as it goes directly against my lived experience using these constructs.

I did try to express this as clearly as I could, even within this review thread:

(Highlight added.)

To reiterate, I believe it is not at all practical to implement concurrent data structures by passing around naked borrows of Atomic instances, and Swift should not encourage such misuse.

Custom concurrent data structures must be built as named types that conform to @unchecked Sendable.

Having to spell out @unchecked Sendable for such types is in no way "additional busywork", "unnecessary roadblock", or "ceremony". To be frank, I find the repeated use of these dismissive qualifiers to be quite infuriating. In the LSG's studied opinion, what exactly is the purpose of @unchecked Sendable, then?

To me, the @unchecked attribute warns readers that the type implements manual synchronization, alerting them about the (immense) complexity of such code, and the highly elevated risk of bugs. Importantly, it does this without imposing any actual hurdles on the part of the author -- by simply adding @unchecked, the compiler gets out of the author's way.

To me, the only reason to even consider adding a Sendable conformance on struct Atomic is to allow the simplest possible use cases, where passing around naked atomics is in fact not entirely unreasonable -- with atomic integer counters being the flagship example.

The sendability of atomic pointers is utterly irrelevant to authors of concurrent data structures.

Artificially requiring Atomic<UnsafeMutablePointer<T>> to conform to Sendable would not just be pointless, but I strongly believe it would be actively harmful, by encouraging developers to treat these primitives as if they were actually safe to pass around like candy. These aren't candies -- they are poison!

Specifically, atomic pointers are subject to a litany of insanely difficult concurrency issues -- including the ABA problem, and the general issue of memory reclamation. In my opinion, waving these away as if they were only rarely relevant would be a regrettable mistake.

Note that an immediate (and quite urgent) followup to this proposal will be to ship atomic strong references, ported from the swift-atomics package, which provide a reliable solution to these issues.

Over-concentrating on atomic pointers would be a mistake. While it is crucial that we provide direct support for atomic pointers (as they are important primitives), in most cases they will emphatically not be the right building blocks for building concurrent data structures -- atomic strong references are a far more solid foundation to build on.

Atomic strong references are a generalization of AtomicLazyReference that allows arbitrary reassignments. Atomic strong references solve the ABA problem and they are able to safely release the instance they're holding after its last user goes away. They will provide a universal baseline solution to the problem memory reclamation, bringing Swift to (roughly) equal footing with garbage-collected languages. Swift's story on atomics is not really complete without them, because they are the constructs that put building concurrent data structures within reach to a sizable audience of Swift developers who don't plan on spending years to specialize on memory reclamation strategies.

Therefore, if we decide it's appropriate to do so, we can safely declare an atomic strong reference Sendable if its associated class type is Sendable -- in fact, the package has been doing exactly that. To be frank, applying the same conformance to raw atomic pointers seems like an absurd idea to me.

As I see it, the realistic choices are that we can either

  • simply not conform Atomic to Sendable at all (leaving the door open to adding it later), or
  • we can adopt the "usual" rules about container types, and have struct Atomic conform to Sendable when its value is Sendable. (After all, Atomic is semantically a container of a single item.)

The swift-atomics package has had two years of experience with the second choice. I don't remember receiving any feedback indicating that it was the wrong one. Strict concurrency checking has been a thing for a while now -- has anyone actually run into an issue with the Sendable conformances of swift-atomics?

Please let's not follow Rust in this regard.

FWIW, I'm happy with the names struct Atomic and WordPair.

On AtomicValue, I think it does makes sense to rename the protocol, as its requirements do not match the protocol that ships in the package. Although having to migrate code is never fun, I think some developers will want to maintain compatibility with both the old package and the new stdlib APIs, and it will be easier to conform to both protocols if they have distinct names. I think AtomicWrappable is a clumsy name, but it's workable. So is AtomicRepresentable. For fun, I'll throw Atomicable and Atomizable into the mix -- I don't think dictionary definitions make good API names, and frankly the prevalence of overlong compound names makes it difficult to read/understand Swift code sometimes.

On the module name: Synchronization made the most sense so far, as I expect there is a good chance we'll want to ship locks in the same module, too. (Like atomics, I don't think we should push locks into the default namespace of every Swift program -- although this is less about sheer complexity of these APIs, and more about the desire to discourage blocking APIs in general.)

Reusing the name Atomics from the package would be incredibly disruptive due to language-level design defects around module name conflicts. Whis is not necessarily a good argument (after all, it potentially applies to every new module that ever gets added to SDKs) but I think there is real difference between theoretical issues and a actual known problem that has no good workaround. So if we want to go this way, we'll need to tackle the resulting module name conflict somehow. (It's not just a technical issue either -- it'd be quite confusing that import Atomics would mean different things depending on context.)

6 Likes

Speaking only for myself, one practical consequence of (some) atomics being sendable would be that they can be readily used as nonisolated properties of actors:

actor Foo {
  nonisolated let counter: Atomic<Int>
}

I could see that being incredibly convenient for simple cases like passive counters, but it's probably sketchy to allow for anything more complex than that, especially anything involving pointers for the reasons you noted. And of course, even in the counter case, you wouldn't want to expose the atomic directly in your public API since you want to make sure that reads and increments are properly atomic and well-ordered even in a simple case.

4 Likes

I do fully* agree!

* With the caveat that most(?) of `Atomic`'s functionality is not actually appropriate (or usable) in such contexts. In particular, it seems unlikely that acquiring/releasing memory orderings can ever be validly used in these situations; and IIRC, allowing operations that use conflicting orderings (such as mixing relaxed and sequentially consistent operations) can result in ill-defined behavior that's bordering on UB. So there is _some_ level of inherent unsafety around exposing direct access to raw atomic values, even for "trivial" values like integers -- but the risk doesn't seem particularly urgent, and it feels like it can probably be waved away as "you have to know what you're doing".

On the other hand, the prospect of allowing this feels truly terrifying:

actor Foo {
  nonisolated let pointer: Atomic<UnsafePointer<Bar>>
}

(There is no way this code can be good. Looking at this induces a level of terror in me that is far deeper than even nonisolated let p: UnsafePointer<Bar> -- and if I remember correctly, Swift did decide to disallow that.)

2 Likes

not true, @unchecked Sendable applies to the whole type, so adding a single atomic stored property to an otherwise Sendable class means the whole class needs to opt out of Sendable checking.

:kissing_heart:

That is in fact exactly what I'm arguing. In general, types that contain atomic stored properties will opt out of Sendable checking, by definition.

They need to opt out of Sendable checking not because they use atomics, but because they want to do things that are, by definition, outside of the scope of the compiler's built-in sendability checking. Atomics is merely the vehicle by which they can achieve that. I very, very strongly believe that such types must be forced to admit this by spelling out @unchecked.

Therefore, I'm vehemently opposing the idea of adding an unconditional(!) Sendable conformance on struct Atomic.

The primary use case for atomics is to implement Sendable for types that cannot live within the regular sendability checks.

7 Likes

As far as I can tell, regardless of whether the pointer itself is atomic, it doesn’t change the fact that the memory it points to, the underlying memory,is not. And therefore all the same arguments apply that were given to revoke UnsafePointer’s Sendable status as seen at UnsafePointer Sendable should be revoked

I definitely appreciate the reminder that having to spell out @unchecked brings here, it clearly puts the responsibility on me, the programmer, to get things right.

Conditionally Sendable seems the obvious choice here.

2 Likes

i think we are fixating a lot on unsafe pointers, which are a clear cut example of where this logic works well, but the rule being proposed would actually apply to all types that merely fail to conform to Sendable (think ClientBootstrap).

since there are a lot of types that fall into this category, there are really only a few things you can do: you can add @unchecked Sendable to everything, even if it is actually unsound, to work around compiler warnings, or you can poke holes in concurrency domains to share things that wouldn’t otherwise be Sendable. the region-based isolation feature suggests to me that the language is moving towards the latter. as proposed, the conditional conformance rule for atomics is going in the opposite direction.

1 Like

It’s the other way around: region-based isolation identifies places where Sendable checks are applied but no sharing is happening. With Atomic, you can assume there will essentially always be sharing, or else you’d just use the value.

4 Likes

but sharing isn’t the problem, sharing of mutable state is the problem. and while Sendable is supposed to guarantee a "lack" of mutable state, there are plenty of non-Sendable types (like ClientBootstrap) that have mutable caterpillar and immutable butterfly phases and decline to encode the butterfly phase in the type system and essentially put the responsibility on the programmer not to share caterpillars.

my (highly reductive) perception as an :apple: outsider is that when Sendable was first conceived, it was thought that all sendable-with-a-lowercase-s things ought to encode that in the type system, and that making it easy to use sendable but not Sendable things was a non-goal. since a lot of library types fall into this category, that plan came with the implicit assumption that libraries were just going to have to change to adapt to the statically-typed Sendable world. that isn’t what ended up happening over the past two years (libraries, like all things that are Somebody Else’s Responsibility, have a lot of inertia!), so i tend to view things like region-based isolation as an attempt to climb down from that maximalist position.

1 Like

This characterization doesn't really match my understanding. Sendable was never about guaranteeing "no mutable state", it was about guaranteeing "no data races if shared". Actors conform to Sendable but are of course free to admit mutable state since they ensure all access is appropriately synchronized. And region-based isolation, though it may indeed have the benefit of making it easy to work with sendable-but-not-Sendable values, also solves the problem that there are not-sendable-and-not-Sendable values which may nevertheless be transferred across isolation domains in specific situations based on local data flow analysis. That's a worthy problem to solve even if every library author was perfectly diligent in applying Sendable as appropriate.

4 Likes

We’re talking about atomics. They’re going to be shared and they’re going to be mutated; if you don’t need to mutate them, you don’t need an atomic.

1 Like

the atomic reference might be mutated, but not the thing that the atomic references. a common use case would be propogating changes to a server topology. you might have an UnsafeAtomic<ClientBootstrap> that swaps out the bootstrap as the topology changes, even though you would never mutate the settings in the bootstrap itself.

1 Like

Right, but there’s no way for the compiler to know that. The region-based isolation proposal is still conservative—it’s a local escape analysis, and there will still be cases where @unsafe Sendable (or technically superfluous synchronization) is needed. The same should be true of Atomic: the places where it should be Sendable are the ones where the compiler can ensure it will not cause data races and type/memory unsafety.

sorry if this ends up being a mischaracterization, but based on what you’ve written so far in this thread, it sounds like your ideas could be summarized by the following:

  1. ClientBootstrap and similar pre-concurrency library types are (in retrospect) ill-adapted to modern swift concurrency.

  2. the standard library is forever, and formulating the rules of the standard library to fit around badly-designed library types would be totally nonsensical.

  3. the standard library should forget about ClientBootstrap and similar types, forge ahead with the new concurrency model, and if the language designers just hold firm, types like ClientBootstrap will die out on their own.

i don’t necessarily disagree with this logic. it made sense in the early days of swift concurrency when we had no idea how libraries would respond to the new model. but two years of real world usage have demonstrated just how stubborn some of these legacy types can be.

my question is, have we learned anything during this time that might justify a change in course, or are we still going to insist that all sharing should go through Sendable?

1 Like