SE-0282: Low-Level Atomic Operations

I agree with you that, if that is true, we should call it something else. I'm curious though: why wouldn't we be able to define this as a frozen move-only type in the future if it were left fully resilient and had no public instance members?

The inability to do that seems like a huge hole in the resilience model if true.

I don't think this is true. You're right that there is an issue with & on an ivar according to Swift's abstract machine model, but there are lots of other places to obtain pointers, including malloc() as the proposal observes. Moreover, both the proposal as written and the "things are static methods" approach have the same problem with the machine model.

I agree with you that those two options seem like the only reasonable ones, and I also really dislike the class-based model now that I've had some time to think about it.

I'm curious though, why not provide access to the static method APIs? This is effectively a C-like interface. Adding them would be incredibly valuable to Swift today, and is a bridge to the future. If you're worried about taking the Atomic name, then we can squirrel them away somewhere else - we could even just provide the AtomicValue protocol and the conformances to it, and have people access it directly. For a low level feature, the sugar is just a "nice to have".

One issue I can see is that any type you declare today is going to be assumed copyable, but maybe that's not a practical concern if there are never any values of the type accessible to programs. More practically, having a type already named Atomic compiled into binaries today might also impede our ability to back-deploy a different "real" implementation of the type with the same name using a static library shim or other tricks.

One problem is that this feels like a bridge that we would probably want to burn once we have crossed it. Once we have working move-only atomics, why would we need to continue exposing the standalone pointer operations? This cuts against UnsafeAtomic too -- it provides essentially the same model as the static pointer-based methods, just rearranged slightly in preparation for an interim solution with @RawStorage. I think UnsafeAtomic is the better approach, because it allows us to build a (wobbly) bridge into a scaled-back version of the move-only future sooner. However, we should have a standalone discussion on this plan!

The static methods are right there in the _PrimitiveAtomic protocol -- we could expose them directly.

let value = Int._AtomicStorage._atomicLoad(at: ptr, ordering: .relaxed)

However, this highlights the problem of separating the logical atomic values from their storage representation. Allowing these two things to diverge and (more or less) transparently converting between the two is one of the useful things UnsafeAtomic does behind the scenes.

(This is less important for integers, but it's rather essential for (optional) pointers, and it's especially tricky to get right for atomic strong references such as AtomicLazyReference (and the eventual double-wide atomic reference implementation).)

We can choose to have the @RawStorage discussion right here on this thread, too, of course. (It feels like inline storage is a prerequisite of usable atomics, but it isn't really part of the proposal text, and it seems largely orthogonal to atomics in general. I worry that all this talk on memory management has derailed the review, and it is scaring away folks who would want to discuss the minutiae of atomic operations.)

The problem is that we want to carve out storage space within class instances to store atomic values. To do this, we need a way to reliably retrieve the address of their memory location.

In a nutshell, I see three potential approaches:

  1. Fix the & conversion somehow to keep the syntax everyone is trying to use right now
  2. Add keypath-based access to storage locations, such as the MemoryLayout.unsafeAddress(of:in:) method in the original addressable ivars pitch
  3. Add a magical attribute-based solution like @RawStorage

Let's quickly go over these one by one. One way to try fixing & would be to introduce an attribute to reject cases where the inout-to-pointer conversion doesn't use a direct storage location, such as @stableStorageLocation below:

extension Int {
  struct AtomicStorage { ... }
  static func atomicLoad(
    at address: @stableStorageLocation UnsafeMutablePointer<AtomicStorage>,
    ordering: AtomicLoadOrdering
  ) -> Int
}

Int.atomicLoad(at: &someComputedProperty, ordering: .relaxed) 
// error: 'atomicLoad' needs directly addressable storage for 'address'

(This is in the same ballpark as the @_nonEphemeral attribute we have right now, but it produces errors, not warnings, and it allows the use of & when it happens to generate a pointer that is "safe" to escape.)
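To make the motivation for such an attribute concrete, here is a small sketch that uses only today's language. The `Wrapper` type and the `bump` function are made up for this illustration. Passing a computed property with & hands the callee a pointer to a temporary, and the result is written back through the setter afterwards, so the pointer never refers to a stable storage location:

```swift
// Hypothetical type for illustration: `computed` has no storage of its own.
struct Wrapper {
  var storage = 0
  var computed: Int {
    get { storage }
    set { storage = newValue + 100 }  // visible side effect in the setter
  }
}

// Takes a pointer, as a pointer-based atomic operation would.
func bump(_ pointer: UnsafeMutablePointer<Int>) {
  pointer.pointee += 1
}

var wrapper = Wrapper()
// The compiler calls the getter, stores the result in a temporary,
// passes the temporary's address, then calls the setter with the result.
bump(&wrapper.computed)
print(wrapper.storage)  // prints 101: the setter ran on write-back
```

An atomic operation performed through such a pointer would act on the temporary, not on anything other threads can see, which is the case an attribute like this would presumably need to reject.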

The problem, of course, is that the & syntax implies that the variable is being mutated, and write access conflicts with atomic access:

class Counter {
  var _value: Int.AtomicStorage
  
  func load() -> Int {
    // Blatant exclusivity violation: 
    // atomic access overlaps with a write access
    Int.atomicLoad(at: &_value, ordering: .relaxed) 
  }
}

I think this rules out &; saving it would require major surgery on the law of exclusivity. (We could try forcing it by saying that the inout-to-pointer conversion completes the write access before the function call begins, but that would be unlike how regular inout arguments work, and I think it would just lead to even more confusion.)

The MemoryLayout.unsafeAddress(of: \._value, in: self) idea in #2 above could get rid of the exclusivity violation, since we're free to define what sort of access (if any) it entails. However, the (very reasonable) feedback on the pitch thread was that this would be far too dangerous -- it would allow code to circumvent exclusivity checks on any ivar by simply switching to accessing it through unsafe pointer operations. So we should rather go with an opt-in approach, where ivars would be explicitly annotated with some attribute that exposes their storage location.
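For context, the MemoryLayout.offset(of:) method that exists today already maps key paths to byte offsets, but only for inline storage. The sketch below (with hypothetical `Pair` and `Box` types) shows it working for a struct and returning nil for a class ivar, which is roughly the gap the pitched unsafeAddress(of:in:) would have had to fill:

```swift
// Hypothetical types for illustration.
struct Pair {
  var first: Int
  var second: Int
}

final class Box {
  var value = 0
}

// Stored properties of a struct are inline, so their offset is known.
let offset = MemoryLayout<Pair>.offset(of: \Pair.second)!

var pair = Pair(first: 1, second: 2)
withUnsafeMutableBytes(of: &pair) { bytes in
  // Write to `second` through its byte offset within `pair`.
  bytes.storeBytes(of: 7, toByteOffset: offset, as: Int.self)
}
print(pair.second)  // prints 7

// Class instance properties are stored out of line, so there is no offset.
let classOffset = MemoryLayout<Box>.offset(of: \Box.value)
print(classOffset == nil)  // prints true
```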

#3 is the obvious next step on that path: it hides the actual mechanics of extracting and passing around pointers behind an attribute that works a bit like property wrappers:

// This probably wouldn’t actually be a protocol; rather it would be a 
// compiler-enforced “shape” like @propertyWrapper.
protocol RawStorable {
  associatedtype Storage: RawStorage // comes with init(_:) and dispose()
  init(at: UnsafeMutablePointer<Storage>)
}

extension UnsafeAtomic: RawStorable {}

public class Counter {
  @raw var _value: UnsafeAtomic<Int>

  // _value is a computed property of type UnsafeAtomic<Int>
  // $_value is the underlying ivar of type UnsafeAtomic<Int>.Storage

  init(_ initialValue: Int) {
    $_value = .init(initialValue)
  }
  deinit {
    $_value.dispose()
  }
  func load() -> Int {
    _value.load(ordering: .relaxed)
  }
  // ...
}

A slightly (?) more elaborate version of this would hide $_value and let the compiler autogenerate the boilerplate-y initialization and disposal of the backing storage:

public class Counter {
  @raw var _value: UnsafeAtomic<Int>

  init(_ initialValue: Int) {
    _value = initialValue   // Note the weirdly mismatched types
  }
  // a call to dispose() is generated by the compiler at the end of deinit

  func load() -> Int {
    _value.load(ordering: .relaxed)
  }
  // ...
}

Either of these last two approaches gets us some of the practical benefits of move-only types without having to wait for their implementation. When Atomic<Int> becomes a thing, my hope is that code using UnsafeAtomic this way can simply upgrade to that with a simple, (more or less) mechanical migration step:

public moveonly struct Counter {
  var _value: Atomic<Int>

  init(_ initialValue: Int) {
    _value.init(initialValue)   // How exactly are we going to spell this?
  }

  func load() -> Int {
    _value.load(ordering: .relaxed)
  }
  // ...
}

I hope this explains why I’m against naked pointer-based methods like Int.atomicLoad(at:ordering:). We will need to tackle inline storage soon, and naked pointer APIs won’t fit into the most likely design for that. If we introduce them now, we will end up also introducing something like UnsafeAtomic later, and then we’d have two separate unsafe APIs for the exact same thing.

Of course, unsafe pointer-based atomics (either through UnsafeAtomic or direct pointer APIs) have some reason to exist in their own right, even after we introduce Atomic. They also interoperate with manually malloc'ed dynamic variables, ManagedBuffer, withUnsafeMutablePointer(to:), pointers coming from C, and any of the other weird & wonderful ways people may get hold of pointer values. UnsafeAtomic has an additional long-term benefit: I expect that retrieving the address of ivars within move-only types will be similarly difficult, so the eventual move-only Atomic type will most likely still use @raw and UnsafeAtomic in its internal implementation.

I share @lorentey’s concern here. A surprising amount of my time is spent policing the pointer management code of programmers who do not understand the way pointers work in Swift. Some of this is simple (UnsafePointer(&x)), some is trickier (pointer lifetime management).

This proposal would add another rule: any pointer vended by a Swift CoW struct is ineligible for being used as an atomic. This is for multiple reasons: it likely violates the rule of exclusivity, and even if it didn’t the memory location is not stable across the required multiple-ownership state needed here.
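The stability half of that rule can be sketched with a plain Array, assuming nothing beyond the standard library. Once the buffer is shared, the next mutation moves the elements to a new allocation, so an address captured earlier no longer refers to the live value:

```swift
var numbers = [1, 2, 3]

// Address of the element storage while `numbers` is the only owner.
let before = numbers.withUnsafeMutableBufferPointer {
  UInt(bitPattern: $0.baseAddress)
}

let snapshot = numbers  // the buffer is now shared between two values
numbers[0] = 99         // copy-on-write: `numbers` gets a fresh buffer

let after = numbers.withUnsafeMutableBufferPointer {
  UInt(bitPattern: $0.baseAddress)
}

print(before != after)  // prints true: the storage moved
print(snapshot[0])      // prints 1: the old buffer still backs `snapshot`
```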

So far as I know, there are only two safe places today from which to get a pointer that can back an atomic: malloc and ManagedBuffer, as well as their spiritual cousins and indirections to them (i.e. memory allocated by C libraries, maybe). Have I missed some other source? If not, why not try to discourage using the many other ways to obtain a pointer that will lead to either subtle or not-subtle breakage?
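As a rough sketch of the ManagedBuffer case: the tail-allocated element lives inside the buffer object itself, so its address stays the same however many references to the object exist. The `slotAddress` helper is invented for this example, and the accesses here are ordinary non-atomic ones; the point is only the stable location.

```swift
// The header is unused here; the single tail-allocated element is the slot.
let buffer = ManagedBuffer<Void, Int>.create(minimumCapacity: 1) { _ in () }
buffer.withUnsafeMutablePointerToElements { $0.initialize(to: 0) }

// Hypothetical helper: reports where the slot lives, as an integer.
func slotAddress(_ buffer: ManagedBuffer<Void, Int>) -> UInt {
  buffer.withUnsafeMutablePointerToElements { UInt(bitPattern: $0) }
}

let first = slotAddress(buffer)
let alias = buffer  // copying a class reference never moves the storage
alias.withUnsafeMutablePointerToElements { $0.pointee += 1 }
let second = slotAddress(buffer)

print(first == second)  // prints true
print(buffer.withUnsafeMutablePointerToElements { $0.pointee })  // prints 1
```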

Hi Karoy,

I think the best way to handle this is to expose the static members on the protocol (and provide conformances of standard library types to it). This is all that is required in this step to achieve your goal laid out in the motivation of the proposal:

These new primitives are intended for people who wish to implement synchronization constructs or concurrent data structures in pure Swift code. Note that this is a hazardous area that is full of pitfalls. While a well-designed atomics facility can help simplify building such tools, the goal here is merely to make it possible to build them, not necessarily to make it easy to do so. We expect that the higher-level synchronization tools that can be built on top of these atomic primitives will provide a nicer abstraction layer.

This avoids all of the questions about how best to expose the user-facing functionality, while providing the core abstraction required for people to start experimenting with it. We can standardize one or more of the user-facing APIs once we have implementation and usage experience with them. This becomes possible when the core mechanics are available to general Swift programmers, which the protocol does.

WDYT?

-Chris

These interfaces do not make good public API, even tucked away as they would be in an obscure module. The time is definitely right to carefully expose the inherent complexities of atomics, but pointer-based atomics introduce an unconscionable amount of extra complications on their own. To use these correctly, one has to be an expert at both atomics and the (underdocumented) Swift execution model — and as this thread has clearly demonstrated, mistakes will slip in even then.

I’m happy to do a cleanup pass on the pointer-based atomic methods to make sure that they are usable for the handful of people who may be able to responsibly use them; but I strongly believe these need to remain underscored.

There is but one way to expose atomics that actually fits well in the language we have today, and that’s class ManagedAtomic<Value>. Going with that as the single public atomic construct will considerably simplify and focus the proposal, letting it concentrate on atomics.

Class-based atomics will work as an excellent stand-in for Atomic<Value>. The heap allocations will limit their usefulness, but as always, the responsible choice is to prefer correctness to performance. And, as Joe aptly observed, managed atomics generally won’t incur ARC traffic during actual use; the overhead is mostly limited to init/deinit.

Sounds good?

I think you're conflating two very important and very different things:

  1. Atomics are fundamentally UnsafeMutablePointer based. There is nothing involving extra complications - this is the inherent complexity of how these operations work, e.g. at the LLVM IR and C levels of abstraction.

  2. Swift has completely separate parts of the language that convert some things to UnsafePointers in places that may have dangerous or unexpected lifetime implications, e.g. the & on an ivar example discussed upthread.

You seem very focused on #2, but there is nothing you can do in an atomics API that "solves" #2 completely; there are just different ways of attempting to sweep the issue under the rug (and I don't think the original proposal was particularly successful at this). The right way to solve #2 is through new language mechanics that are completely orthogonal to atomics.

I think that #1 is completely solvable, extremely valuable, and very important. That is why I'm recommending that this proposal focus on it.

-Chris

Hey Chris,

APIs don't exist in a vacuum. It must be possible to productively use them to solve real-life problems.

Can you please show me a piece of code solving some toy problem that uses pointer-based atomics without separate heap allocations for every atomic value? I've been looking at this for (on and off) half a year now, but I have so far failed to make an example that isn't broken, ridiculously overcomplicated or both. (There is a reason why the proposal doesn't show how to implement inline storage through ManagedBuffer or MemoryLayout.offset(of:).) Something like the proposal's silly lock-free single-consumer stack should be enough to illustrate how these APIs will work in practice.

I believe ManagedAtomic is the sweet solution that lets us move ahead until the language matures. Again, for the two or three full-time Swift engineers who think they may be able to directly use pointer-based APIs, they will be available as public-but-underscored methods. Trying to document how they work is a fool's errand at this point.

There is plenty of room to expose pointer-based APIs in followup proposals, as soon as it becomes possible to responsibly do so.

Thanks,
Karoy

My concern with adding class-based atomic types is that it puts us on a track to committing to three different APIs for the same thing—the "unsafe" low-level API, a stopgap class-based API, and a future safe move-only API. The unsafe API has reason to exist in the future, if nothing else as a mechanism for implementing the move-only types, though I think it will remain interesting even when move-only atomics exist, for more specialized cases that don't fit cleanly in the confines of the safe model. The use cases for a class-based API are at best questionable today, because of the performance concerns with any doubly-indirected design that Chris raised, and I think they would evaporate pretty much completely once move-only atomics exist. A ManagedAtomic class might be a great package to ship on top of the low-level atomics API, but I don't think it belongs in the long-term API of the standard library.

Here is the hero shot of pointer-based atomics:

import Atomics
import Dispatch

let counter = UnsafeMutablePointer<Int.AtomicStorage>.allocate(capacity: 1)
counter.initialize(to: Int.atomicStorage(for: 0))

DispatchQueue.concurrentPerform(iterations: 10) { _ in
  for _ in 0 ..< 1_000_000 {
    Int.atomicWrappingIncrement(at: counter, ordering: .relaxed)
  }
}
print(Int.atomicLoad(at: counter, ordering: .relaxed))
Int.disposeAtomicStorage(counter.pointee)
counter.deinitialize(count: 1)
counter.deallocate()

This is objectively terrible. I'm really not looking forward to trying to rationalize this mess. If we must do this, please let's at least do it through UnsafeAtomic.

Now of course, integers don't need a custom dispose implementation, they have trivial storage, and on current platforms they can work with AtomicStorage = Self. Do we only care about integers?

For the low-level API I think the answer is yes: we can probably get away with only integers, using them as a substitute for "opaque bits stored in memory".

However, inviting users to perform atomic operations on non-integer values (even trivial ones, such as pointers) muddies our memory model with regard to aliasing. The issue seems solvable to me, though: we already have operations that allow rebinding the memory.
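A minimal sketch of what "integers as opaque bits" could look like with operations that exist today. The loads and stores below are plain, non-atomic accesses standing in for the atomic ones, and the names `target` and `slot` are invented for the example:

```swift
let target = UnsafeMutablePointer<Int>.allocate(capacity: 1)
target.initialize(to: 42)

// The shared slot holds a plain unsigned integer: the pointer's bit pattern.
let slot = UnsafeMutablePointer<UInt>.allocate(capacity: 1)
slot.initialize(to: UInt(bitPattern: target))

// Option 1: convert the loaded bits back to a typed pointer.
let viaBits = UnsafeMutablePointer<Int>(bitPattern: slot.pointee)

// Option 2: temporarily rebind the slot's memory to the pointer type.
let viaRebind = slot.withMemoryRebound(
  to: UnsafeMutablePointer<Int>?.self, capacity: 1
) { $0.pointee }

print(viaBits == target, viaRebind == target)  // prints true true
print(viaBits!.pointee)                        // prints 42

slot.deinitialize(count: 1)
slot.deallocate()
target.deinitialize(count: 1)
target.deallocate()
```

Whether an atomic access to the slot should be typed as an integer or as a pointer is exactly the aliasing question raised above; the sketch only shows that both views can be expressed.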

Hi Karoy,

I'm not interested in toy problems. Atomics are an advanced feature that only makes sense to use if you're managing locality and other things at the same time. The only way to express this sort of thing is with unsafe pointers (or clang-imported types) in Swift today anyway.

My most recent intersection with significant atomics use was with the TFRT project which is not yet open source (though it is "coming soon" I'm told). The only details are a high level talk about it here.

That project is written in C++, but would have been much nicer if written in Swift and would use pointer based atomics like crazy. It includes things like:

  1. A futures implementation with user-customizable memory management that would have to be backed by unsafe pointers. Atomics are used for the reference count and other things. See the "Key Abstraction: AsyncValue" slide.

  2. Several helpers that depend on isolated atomic counters, e.g. for closure-based fork-join parallelism helpers.

  3. An "executor" for executing dataflow graphs that use counters per node, which use an "array of atomics" like pattern.

None of these use-cases would benefit from the higher level types or wrappers around them, because there is enough inherent complexity in the concurrency problems they were addressing.

I'll remind you that concurrent mutations in Swift are inherently unsafe. A future move-only Atomic type will not make concurrent mutations safe in Swift. UnsafePointer is the tip of the iceberg in terms of the issues involved.

-Chris

In implementing the Michael-Scott queue algorithm (a toy example in more than one way), the queue object's declaration I arrived at looks like this (excuse my atomics layer):

final public class LockFreeQueue<T>
{
  let storage = UnsafeMutablePointer<AtomicTaggedMutableRawPointer>.allocate(capacity: 4)
  private var head: UnsafeMutablePointer<AtomicTaggedMutableRawPointer> { return storage+0 }
  private var tail: UnsafeMutablePointer<AtomicTaggedMutableRawPointer> { return storage+1 }
  private var poolhead: UnsafeMutablePointer<AtomicTaggedMutableRawPointer> { return storage+2 }
  private var pooltail: UnsafeMutablePointer<AtomicTaggedMutableRawPointer> { return storage+3 }

  public init()
  {
    let node = Node<T>.dummy
    let tmrp = TaggedMutableRawPointer(node.storage, tag: 1)
    CAtomicsInitialize(head, tmrp)
    CAtomicsInitialize(tail, tmrp)
    CAtomicsInitialize(poolhead, tmrp)
    CAtomicsInitialize(pooltail, tmrp)
  }
  // actual algorithm skipped for brevity
}

The node is a wrapper around an UnsafeMutableRawPointer, with accessors returning the fields, an atomic tagged pointer and an atomic pointer; a single allocation for 2 more atomic values.
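For readers who want to picture that node layout, here is a rough sketch of one raw allocation carved into two fields through accessor properties. It is not the actual Node type from my code: `RawNode` and its members are invented for this example, and plain UInt fields stand in for the atomic tagged pointer and the atomic pointer.

```swift
// Hypothetical stand-in for the node: one allocation, two pointer-sized fields.
struct RawNode {
  let storage: UnsafeMutableRawPointer

  // Byte offsets of the two fields within the single allocation.
  static let nextOffset = 0
  static let valueOffset = MemoryLayout<UInt>.stride
  static let byteCount = 2 * MemoryLayout<UInt>.stride

  init() {
    storage = UnsafeMutableRawPointer.allocate(
      byteCount: RawNode.byteCount,
      alignment: MemoryLayout<UInt>.alignment)
    // Initialize both fields to zero and bind the memory to UInt.
    storage.initializeMemory(as: UInt.self, repeating: 0, count: 2)
  }

  // Accessors return typed pointers into the shared allocation.
  var next: UnsafeMutablePointer<UInt> {
    (storage + RawNode.nextOffset).assumingMemoryBound(to: UInt.self)
  }
  var value: UnsafeMutablePointer<UInt> {
    (storage + RawNode.valueOffset).assumingMemoryBound(to: UInt.self)
  }

  func destroy() {
    storage.deallocate()
  }
}

let node = RawNode()
node.value.pointee = 5
print(node.next.pointee, node.value.pointee)  // prints 0 5
```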

It might qualify as overcomplicated, though I don't see a simpler way at the moment.

This makes me start to think that a selected "good enough"/"for now" API should live in a non-stdlib package. Given the number of asterisks we're putting next to all of the proposed variations, "a separate import so that people don't find it by accident" still feels too close-at-hand.

Sure, I know a capital-P package won't be able to access the compiler intrinsics; but nobody's died from the C bridging status quo. What they have been suffering from is a lack of input from domain experts w.r.t. Swift's memory model (+ encoding that expertise into the language or static analysis, like the advent of @_nonEphemeral), address pinning for members, etc. What I would see as a win from this effort is interested parties being able to have a definitive place to even look.

As much as I want better atomics, and as much as I like the proposal as pitched, I don't think it's a failure to say that there isn't currently an answer about which the stdlib is comfortable making a compatibility guarantee.

What is your evaluation of the proposal?

-1

I was neutral on this proposal, because I'm more likely to use higher-level APIs. But the negative feedback from library authors is worrying.

I'd like to see a revised proposal, which moves most of the APIs into a swift-atomics package. This would be a similar concept to the swift-numerics package — an umbrella module, and separate modules for ManagedAtomic, UnsafeAtomic, etc.

Only the APIs which wrap Builtin atomics (for integers only) would be added to the standard library (in the Swift module):

  • the static methods of AtomicIntegerConformances.swift.gyb
  • the structures and top-level function of AtomicMemoryOrderings.swift

All of the protocols (including _PrimitiveAtomicInteger) would be added to the package.

Is the problem being addressed significant enough to warrant a change to Swift?

Yes, but the solutions could be explored in a package.

Does this proposal fit well with the feel and direction of Swift?

No, adding a new module to the standard library doesn't seem necessary.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I studied the proposal and implementation.

This is certainly on the roadmap. (Note: doing it as a new load ordering would allow easy back deployment but it may not be the right approach.)

AIUI, adding tearable atomics right now could potentially diverge the Swift memory model from that of C++, which doesn't seem like a good idea yet (as of this proposal, we don't provide any formal semantics beyond "see the C++ standard, and beware of the Law of Exclusivity"). The implementation will need to come with tooling/sanitizer etc. support for such operations.

The review for SE-0282 Atomics ran from April 14 to April 24, 2020. The core team has decided to return this proposal for revision. During the review, support was nearly unanimous for the memory model the proposal establishes, bringing Swift in line with the model standardized by C. The core team concurs with the review discussion on this subject, and would like to see a revised proposal that focuses on specifying the memory model. Guaranteeing a C-compatible memory model allows developers that currently wrap atomic primitives written in C and import them into Swift to rely on this continuing to work. This would also provide stable ground for building atomics packages outside of the standard library for experimentation and use by early adopters. The Swift project itself plans to develop one of these packages.

The most intense review discussion on SE-0282 centered on the best paradigm in which to expose these atomic operations in Swift, both now and in the future. The core team would like to see the community develop more implementation experience before committing to a standard library API for exposing atomics. There is general agreement that a move-only Atomic type should be the ultimate goal, similar to what has successfully been implemented in Rust; however, the best we can hope to implement in Swift as it exists today without fundamental overhead is something that performs atomic operations on unmanaged memory through pointers. The proposal manifests this as an UnsafeAtomic<T> type, which wraps a pointer, along with AtomicProtocol and AtomicInteger protocols with implementation-defined requirements to which eligible types conform. The review discussion raised concerns about this approach:

  • The naming of UnsafeAtomic does not make the pointer-like nature of the type apparent.
  • The UnsafeAtomic type is proposed to have create and destroy methods, which combine dynamic allocation of storage and initialization for an atomic value. The convenience of these APIs suggests they may be the primary intended API for working with the type, further adding to confusion about the nature of the type, and adding indirection that is likely to be an unacceptable overhead for the sorts of performance-bound applications that require atomics.
  • Since the requirements of the AtomicProtocol are hidden, it isn't clear what guarantees it provides for generic code, or how generic code should work with types constrained by the protocol. Without further implementation experience it also isn't clear that the implementation-detail protocol is adequate for future expansion to double-wide types or other extensions in the future.

Other design paradigms that were explored in the review and pitch thread include:

  • Exposing the atomic operations as free functions, static methods, or instance methods taking values of the existing Unsafe*Pointer types. This makes the pointer-based nature of the operations clear, but also invites mixing and matching atomic and nonatomic operations on the same pointer.
  • Creating a safe class AtomicReference<T> that manages the lifetime of the atomic storage. This would present a mostly safe interface to atomics, since classes allow shared references to a common resource and provide the necessary lifetime management facilities via deinit to manage atomic storage. However, classes would have the problem of added indirection, along with potential added overhead and interference from ARC reference counting operations, making this an inefficient approach.

All of these approaches have serious shortcomings, and they're likely to be quickly superseded by move-only atomics in the future. How far that future is from the present is an open question, and having some sort of low-level API for atomic primitives could still have a small niche for advanced users even with move-only atomics as the primary interface. The core team will keep an eye on the package ecosystem to see what's needed and what works well in practice in this space.

Thanks to everyone who participated in the review!

I would like to discuss this point further (I had a draft post for this review but forgot to finish it :sweat_smile: it’s been a hectic couple of weeks, ok?).

Will there be another discussion thread soon, or is this a longer delay until more data is gathered/language features are implemented?

It will probably be a little while. Your best bet is likely to make a small sample implementation of anything you would like to be considered.