SE-0410: Atomics

Are you suggesting the proposal add a new public attribute, something like @noVarOnlyLet? It seems to me that, for the time being, the proposal can retrofit the attribute it already uses to guarantee address-only-ness, @_rawLayout, to include the no-var, only-let semantics. The proposal could then mention that it actively disallows declaring vars of type Atomic, so that design and iteration on the public attribute can be left to a separate proposal. I think formalizing @_rawLayout itself would be a better future goal, one that would allow folks to make these kinds of types outside of the standard library. In fact, I think using this attribute on a new type, something akin to UnsafeCell in Rust, would be the general-purpose tool folks could use.

Right, I think these cases are the ones where var actively hurts these kinds of types. A local var that's not captured would be fine, but it would make things like reassignment possible for atomics, which is (1) not an atomic operation and (2) arguably quite weird. (The same goes for vars as struct stored properties.)

func something() {
  // No dynamic exclusivity checking here
  var x = Atomic<Int>(0)

  // Not an atomic operation
  x = Atomic<Int>(123)
}

So I think it would be beneficial to ban all uses of var for these kinds of types, regardless of whether dynamic exclusivity checking is included or not. We can still achieve the safe "tearing" operations when moving these things during consumption as lets.
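
For comparison, here's a minimal sketch of the let-only usage in question (assuming the proposal's Synchronization module and Atomic API; the exact spellings are illustrative):

import Synchronization

func somethingElse() {
  // A `let` atomic still supports mutation: every update goes through a
  // borrow of its pinned, inline storage, not exclusive access to `x`.
  let x = Atomic<Int>(0)
  x.store(123, ordering: .releasing)          // atomic store
  let current = x.load(ordering: .acquiring)  // atomic load
  _ = current
}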

2 Likes

It does not make sense to me to do this as a separate proposal. One way to describe the behavior of that attribute would be “make this type act like the atomic types do”, but we’ll already have decided how the atomic types work in this proposal, so that future proposal would be, what, just picking a name?

Or to put it another way, you’re already describing several behaviors for these new types that are unlike any other types, and this seems like the most appropriate time to have a discussion of those behaviors and how they’re exposed in the language.

2 Likes

Is there any interest in implementing the atomic wait/notify_one/notify_all primitives? Seems like they might be useful tools to build condition locks.

Those aren't atomics; they are synchronization primitives that require kernel support.

On top of that, even when we do add synchronization primitives like locks, I hold out hope that we can avoid adding traditional condition variables, which have some nasty properties.

2 Likes

This would imply to me that we either disable var entirely for these declarations somehow, or declare atomic variables via atomic myVariableName rather than let myVariableName. The latter would be more obvious, but also destructive to the point of being overkill.

Could we explain that Atomic and similar types have "reference semantics"? This would follow the precedent of ReferenceWritableKeyPath, which uses "reference" to describe the ability to non-exclusively mutate something.

As a tangent, I think interior mutability for noncopyable types need not be particularly unusual or low-level in the future. Shared mutable state is quite common, like with classes. Noncopyable types would just guarantee that the shared mutable state has a bounded lifetime and doesn't require heap allocation. We could embrace opt-in "reference semantics" for noncopyable types in the future. Maybe we could add a noncopyable @ReferenceWritable property wrapper, or class A: ~Copyable syntax, or something.

I think if we explain that Atomic has "reference semantics", we should be good. We have a similar situation with classes, where we do allow vars, but disallow mutating methods. Some people have complained about the latter restriction and have used protocol extensions to work around it.
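
As an illustration, here's one common shape of that protocol-extension workaround (a hedged sketch; the names are made up for the example):

protocol SelfReplaceable {}

extension SelfReplaceable {
  // A mutating method defined in a protocol extension may reassign `self`,
  // even when the conforming type is a class; the caller then needs a `var`,
  // and the call swaps out the stored reference.
  mutating func replace(with other: Self) {
    self = other
  }
}

final class Box: SelfReplaceable {
  var value = 0
}

var box = Box()
box.replace(with: Box())  // reassigns the reference held by `box`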

A var atomic could have some limited use cases, such as if we statically know that nothing is borrowing an atomic, and we want to change its value without incurring the potential performance cost of an atomic operation.

We already emit a compiler warning when a var can be turned into a let. Maybe we can also emit a compiler warning if a non-public mutating method never mutates the instance.

2 Likes

That’s at odds with a very important feature of atomics, which is that their storage is inline.

Not entirely; they still have reference semantics. Because they are not copyable, you can only access them via borrows (which are references).

Of course, this is the only way atomics make sense -- they must be a kind of shared mutable state (i.e. have reference semantics), otherwise there is no point to atomic operations because there will never be concurrent accesses.

If we consider a mutable Int property of a class, we don’t say the Int itself has reference semantics.

We need better terminology here, because there is such a thing as a ManagedAtomic, which does have reference semantics.

1 Like

Many of the semantic differences between value and reference types disappear for non-copyable types. A uniquely-held reference has all of the local-reasoning properties we associate with a value type; that’s part of why copy-on-write works, as well as the core insight behind exclusivity, Rust ownership, etc. So on some level, sure, we could say that atomics have reference semantics and that the inline-ness of the storage is an implementation detail that we could expect only experts to understand.

Where the reference-type idea falls apart is that you can have mutable variables that hold references, and you can assign new references into those variables. You can come up with ways that this makes sense for atomic types, actually — assignment has to be done with exclusive access to a variable, so this "reference" assignment replaces the value in the atomic at a moment when you know it's not being concurrently accessed. The problem is that this readmits the exclusivity-tracking problem; we still don't want vars of these types in most of the contexts where we'd like to use them.

3 Likes

Atomics are Weird; they're an advanced feature, and I think it's okay if they don't fit neatly into the introductory value vs. reference semantics dichotomy. I feel like they are closer to being references than values, so I'm okay with saying they have "reference semantics".

1 Like

Fantastic proposal, very thorough and I support it wholeheartedly. Props to the authors!

Just one question: several examples in the proposal include a compareExchange that is retried in a while loop (as it usually should be), but I'd expect some sort of yield operation at each iteration and I'm seeing none. There is also nothing proposed to that end — is this considered beyond the scope of the proposal?

1 Like

Back-off (such as yielding) is important when your attempts to perform an operation can interfere with other threads' ability to make progress. Compare-exchanges do cause contention in the memory subsystem that can stall other compare-exchanges for a few cycles, but in typical lock-free algorithms, the expectation is that this contention will always rapidly clear because (1) at least one of the threads should succeed and (2) failure on any thread should set up that thread to succeed on its next attempt, unless another thread succeeds first again. As a result, there's not really any point in adding back-off.

Now, that caveat is important — the fact that other threads might repeatedly succeed over ours is a failure in basic fairness. This is generally considered acceptable for three reasons:

  • As far as I know, you can't really fix unfairness while keeping a lock-free algorithm; you need to switch to a fair lock.
  • Fair locks are generally a lot more expensive than unfair locks, and much more expensive than lock-free algorithms. The gains from a lock-free algorithm are generally more than enough to pay for a few threads starving for a few cycles.
  • If you have a single atomic that's so heavily contended that some threads are completely unable to make progress, something is seriously broken about your design.

But simple back-off strategies like "back off after failure" don't really help with fairness anyway, because it's the other threads you need to make back off.

This all just applies to lock-free algorithms that follow the typical pattern of a load followed by a compare-exchange loop that builds a new value based on the last observed value. Back-off is important for something like a spin lock because it doesn't follow those two rules above. In particular, failing to acquire a spin lock doesn't set the failing thread up to succeed on its next attempt, so a thread failing to acquire a naive spin lock can definitely throw up enough contention to interfere with an attempt to unlock. But implementing a good spin lock is a very tricky business and tends to be very environment-specific; I don't think there's a useful way to offer a general yield function that would help people to implement portable spin locks. And frankly I'm not sure we really want people to offer portable spin lock libraries anyway, because in most environments you should probably not be using a spin lock.
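
For concreteness, here's a rough sketch of the load-plus-compare-exchange pattern described above, assuming the proposed Atomic API; the saturating add is just an illustrative operation:

import Synchronization

// Typical lock-free update: reload on failure and retry. No back-off is
// needed, because a failed compareExchange hands back the value that caused
// the failure, setting this thread up to succeed on its next attempt.
func addSaturating(_ amount: Int, to counter: borrowing Atomic<Int>) {
  var current = counter.load(ordering: .relaxed)
  while true {
    let (sum, overflowed) = current.addingReportingOverflow(amount)
    let desired = overflowed ? (amount > 0 ? Int.max : Int.min) : sum
    let (exchanged, original) = counter.compareExchange(
      expected: current,
      desired: desired,
      ordering: .relaxed
    )
    if exchanged { return }
    current = original  // retry with the freshly observed value
  }
}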

13 Likes

Thank you for the extensive reply @John_McCall! Understood.

Beyond fairness, though, when a thread does indeed starve, does it matter in practice what it's doing? In other words, is spinning considered harmful and to be avoided, or does it just not matter in practice?

1 Like

As a general rule, yeah, I’d say spinning for what you might reasonably expect to be a long time is never great. It’s wasted energy if nothing else. Sometimes you don’t have any good alternatives because e.g. you’re implementing the thread scheduler; but short of that, there are better alternatives, and you should use them.

3 Likes

On yield (i.e. POSIX's sched_yield), Linus Torvalds has a (well, fairly aggressive TBH, but also informative) piece on it:

The problem with that is "yield" is pretty much undefined. The definition of it is literally about single queue of a real-time behavior with a real-time scheduler with priorities.

But that "definition" has almost nothing to do with actual usage. There are various random people who use it, and some might use it for locking, while some use it for other things.

What you want to use it for is "schedule the right process". But you don't even know what the right process is, or if you do you don't tell the system (because sched_yield() literally doesn't have that interface), so the kernel has to guess.

In some cases, "sched_yield()" is basically used by processes that say "I'm CPU-intensive, but I'm not important for latency, and I don't want to cause problems for others". You'll find various random GUI programs doing that because they are threaded, and one thread does things like update the screen, while another thread does calculations. The calculation loop (which is still important, just not latency-critical) might have "sched_yield()" in it as a cheap way of saying "maybe there's a more important UI event going on".

In other cases, it's a performance optimization, where somebody says "I have done my part of the work and written it out, now I'm yielding because there's another user that is likely to want to use it, and that other user is actually the more heavy CPU hog and should run before I start generating more data for it".

And in others, it's because they are actually using real-time scheduling - perhaps on dedicated CPU doing the true Hard real-time kinds of things - and depend on the FIFO definition within their priority group.

That's the defined usage, but even that is really defined only in the traditional UP sense of "there is one process running at a time, and we have one single queue of them".

End result: sched_yield() is basically historical garbage. It's often literally wrong even for the defined usage because times have moved on from that whole "we run one thing at a time" long long ago. If your RT system actually has multiple concurrent threads (as opposed to being limited to a dedicated CPU) it's not really all that well-defined even there. But at least there you can still pretend it is.

Good locking simply needs to be more directed than what "sched_yield()" can ever give you outside of a UP system without caches. It needs to actively tell the system what you're yielding to (and optimally it would also tell the system about whether you care about fairness/latency or not - a lot of loads don't).

But that's not "sched_yield()" - that's something different. It's generally something like std::mutex, pthread_mutex_lock(), or perhaps a tuned thing that uses an OS-specific facility like "futex", where you do the nonblocking (and non-contended) case in user space using a shared memory location, but when you get contention you tell the OS what you're waiting for (and what you're waking up).
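
To make that shape concrete, here is a very rough sketch of a futex-style lock built on the proposed Atomic type. futexWait/futexWake are hypothetical placeholders for the OS facility, and this is nowhere near a production-quality lock:

import Synchronization

// Hypothetical stubs for the kernel facility (e.g. the futex syscall on Linux).
// They exist only to illustrate where the OS gets told what we're waiting on.
func futexWait(_ word: borrowing Atomic<UInt32>, ifValueIs expected: UInt32) { /* syscall */ }
func futexWake(_ word: borrowing Atomic<UInt32>, count: Int) { /* syscall */ }

// 0 = unlocked, 1 = locked, 2 = locked with possible waiters.
struct ToyFutexLock: ~Copyable {
  let state: Atomic<UInt32>
  init() { state = Atomic<UInt32>(0) }

  func lock() {
    // Uncontended fast path: stays entirely in user space.
    if state.compareExchange(expected: 0, desired: 1,
                             ordering: .acquiring).exchanged { return }
    // Contended slow path: mark the lock as contended and ask the kernel
    // to park us until the word changes.
    while state.exchange(2, ordering: .acquiring) != 0 {
      futexWait(state, ifValueIs: 2)
    }
  }

  func unlock() {
    // If anyone might be parked, tell the kernel whom to wake.
    if state.exchange(0, ordering: .releasing) == 2 {
      futexWake(state, count: 1)
    }
  }
}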

9 Likes

The really unfortunate thing about spinlocks, and the reason they're pretty much entirely avoided in userspace on Darwin-based OSs, is the combination of spinning and lacking priority inversion avoidance support. This means that if several high priority threads are waiting on the lock, they may keep a low priority thread from being scheduled to release the very lock that they're waiting for, since from the scheduler's point of view they're extremely busy doing higher priority work.

Luckily, in my experience at least, very few userspace spinlock users actually cared specifically about spinning under contention; they just wanted fast uncontended locking, unfair locking, or both. Newer locks like OSAllocatedUnfairLock provide both of those properties without the downsides.
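
For example, a quick sketch using the os module's OSAllocatedUnfairLock to guard a small piece of state (the counter here is just for illustration):

import os

// An unfair, non-spinning lock guarding a small piece of state. Uncontended
// acquisition is cheap; under contention the waiter blocks in the kernel
// instead of spinning, and the lock supports priority inversion avoidance.
let hits = OSAllocatedUnfairLock(initialState: 0)

func recordHit() {
  hits.withLock { count in
    count += 1
  }
}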

5 Likes

It's not just Darwin-based OSs; Linus Torvalds has an even more incendiary rant about spinlocks, and says, essentially, that they should never be done in user space, period, end of story. He's obviously speaking about Linux here, but I think his arguments apply to most (if not all) OSs.

6 Likes

Thanks for the review feedback, everyone! The language steering group has decided to return this proposal for revision based on some of the topics raised in this review thread.

4 Likes