SE-0512: Document that Mutex.withLockIfAvailable(_:) cannot spuriously fail

Hello, Swift community.

The review of SE-0512: Document that Mutex.withLockIfAvailable(_:) cannot spuriously fail begins now and runs through March 2nd, 2026.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to me as the review manager via either email or forum DM. When messaging me directly, please put "[SE-0512]" at the start of the subject line.

Trying it out

Since this is just documenting the current behavior, there's no need for a new toolchain for testing it.

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer:

  • What is your evaluation of the proposal?
  • Is the problem being addressed significant enough to warrant a change to Swift?
  • Does this proposal fit well with the feel and direction of Swift?
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available in the Swift evolution repository.

With thanks in advance for your consideration,

John McCall
Review Manager

14 Likes

+1 yes, seems obvious.

But what I'm really here to say is, I really appreciate the attention to detail in this proposal. It

  • clearly explains the problem
  • clearly identifies the solution space
  • deeply investigates the trade-offs of the various solutions
  • clearly explains the state of the art in other languages
  • explores the edge-cases

I wish more SE-proposals held themselves to this standard.

15 Likes

Speaking for myself (i.e. not as the review manager), the proposal makes a reasonable case for this change, but I believe there's an even stronger argument for it. In particular, I think the C and C++ committees were wrong to standardize a weak tryLock operation.

The only reason to specify a weak tryLock is to let it be implemented with a weak compare-exchange (CE). When both are directly implemented in hardware with similar ordering requirements, weak CEs are not generally faster than strong CEs; the theoretical benefit of asking for a weak CE is just this:

  • on architectures where strong CEs must be emulated with weak CEs, and therefore implementing a strong CE requires a loop,[1]
  • given that a CE generally already needs to be in a loop in order to implement the transactional semantics of whatever operation it's part of,
  • a weak CE avoids the nested loop that would be necessary if a strong CE were used instead.

Using a weak CE is therefore a pretty minor and architecture-specific optimization to begin with, which is why the common advice (including Raymond Chen's, as cited in the proposal) is to use it sparingly and only when the retry operation of your transaction loop is cheap enough that it's better to just do it than to bother checking for spurious failure.
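
To make that concrete, here's a minimal sketch of the kind of transaction loop where a weak CE is justified, using the Synchronization module's Atomic for illustration; the retry step is just recomputing one value:

import Synchronization

// Illustrative only: a transactional increment built on a compare-exchange
// loop. The loop must already retry when another thread races ahead, so a
// weak CE is fine here; a spurious failure just repeats the cheap retry step.
func increment(_ counter: borrowing Atomic<Int>) {
    var current = counter.load(ordering: .relaxed)
    while true {
        let (exchanged, original) = counter.weakCompareExchange(
            expected: current,
            desired: current + 1,
            ordering: .relaxed)
        if exchanged { return }
        current = original  // the failed CE refreshed the current value
    }
}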

This makes the concept of a weak tryLock pretty weird. A programmer using tryLock is, presumably, doing so as part of a transactional operation of some sort. A transactional operation should generally not fail spuriously, so the code must eventually try again if tryLock spuriously fails. So far, this is similar to the situation with weak and strong CEs: there must be a loop that repeatedly tries to lock, and it might be a nice optimization to use a weak tryLock if the retry step of this loop is cheap. If the retry step is expensive, of course, it'd be better to repeat the lock attempt until it either succeeds or fails non-spuriously — which is to say, it'd be better to use a strong tryLock.

But the tryLock APIs standardized by C and C++ do not distinguish between failure modes! They just return false on any kind of failure, and since the atomic state of the mutex is encapsulated, there's no way for the caller to determine whether failure was spurious or not. Any operation that tries again for spurious failure must therefore also try again for non-spurious failure. If our transactional loop does this by just calling tryLock again, then it's just repeatedly calling tryLock until the mutex actually becomes available, which means we've degraded our mutex to a spin lock. So either we're misusing the mutex or the operation is doing something weirder than just repeatedly calling tryLock, like mixing the attempts with sleeps or lock calls. Our effort to find an operation that's justified in calling a weak tryLock is getting increasingly hairy.
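
In Swift terms, that degenerate retry loop would look something like this sketch (journal and appendSpinning are made-up names, purely illustrative):

import Synchronization

let journal = Mutex<[String]>([])

// Retrying a tryLock-style call until it succeeds degrades the mutex into
// a spin lock: the loop burns CPU until the lock happens to be free.
func appendSpinning(_ entry: String) {
    while journal.withLockIfAvailable({ $0.append(entry) }) == nil {
        // busy-wait; this is the misuse described above
    }
}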

Now, I can think of operations that do mix up lock attempts like that. For example, there are deadlock-free algorithms for acquiring multiple locks that are based on locking one of the locks, tryLocking the rest, and backing off when one of the latter attempts fails. However, that backoff step is very expensive, which is not the situation where we want to be using a weak CE and risking spurious failure. At best, we may want to optimistically use weak tryLock when probing locks, but we'd still want to use a strong tryLock to verify that a lock is actually taken before giving up. And you'd probably want to know immediately whether a weak tryLock failed for spurious or non-spurious reasons.
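
A rough sketch of such an algorithm, written against a hypothetical Lockable interface (Swift's Mutex deliberately doesn't expose lock/tryLock/unlock as separate operations), shows where the expense lives:

protocol Lockable {
    func lock()
    func tryLock() -> Bool  // assumed strong: fails only if actually held
    func unlock()
}

// Deadlock-free acquisition of several locks: block on one, tryLock the
// rest, and back off on failure. The backoff (release everything, then
// block on the contended lock) is expensive, which is exactly the step
// we don't want to take because of a spurious tryLock failure.
func lockAll(_ locks: [any Lockable]) {
    precondition(!locks.isEmpty)
    var first = 0
    outer: while true {
        locks[first].lock()
        for offset in 1 ..< locks.count {
            let index = (first + offset) % locks.count
            if !locks[index].tryLock() {
                for held in 0 ..< offset {  // release all locks taken so far
                    locks[(first + held) % locks.count].unlock()
                }
                first = index  // restart by blocking on the contended lock
                continue outer
            }
        }
        return  // all locks acquired
    }
}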

Meanwhile, a strong tryLock is not just a nice optimization when the retry step is expensive. A failed strong tryLock tells you that another thread currently holds the mutex, which is a useful piece of information. In some algorithms, that might be good enough to avoid acquiring the mutex yourself.

In summary:

  • Strong tryLock is useful in operations where the retry step is either unnecessary or expensive enough to want to avoid doing spuriously. This covers most operations that are using tryLock.
  • Many architectures gain no theoretical benefit from a weak tryLock because they provide a strong CE.
  • I can imagine legitimate uses for a weak tryLock, but only as a minor optimization that would be backed by calling a strong tryLock, and I'm skeptical that it would matter for the sorts of algorithms that would try it.
  • The uses I can imagine for a weak tryLock would all benefit from knowing whether failure was spurious or non-spurious, which the C and C++ APIs don't tell you.

My conclusion is that mutex APIs should always offer a strong tryLock. If they're going to also provide a weak tryLock (which I do not feel is sufficiently justified at this time), it should distinguish spurious and non-spurious failures to the caller. And the default tryLock that programmers reach for should certainly be a strong rather than a weak one. So the C and C++ APIs, which offer only a weak tryLock that does not distinguish failure modes, are misdesigned in multiple ways.

Applying this to Swift, Mutex.withLockIfAvailable(_:) should just perform a strong tryLock. That is exactly what this proposal suggests. If we ever find an important reason to add a weak tryLock, we can do that as a future refinement, with careful attention to the need to distinguish failure modes.
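
In usage terms, that guarantee means a nil result from withLockIfAvailable(_:) can be read as meaning exactly one thing. A sketch (the names here are mine, purely illustrative):

import Synchronization

let counter = Mutex<Int>(0)

// With a strong tryLock underneath, nil has exactly one meaning: some
// other thread currently holds the lock. No retry loop is needed to
// paper over spurious failures.
func incrementIfUncontended() -> Bool {
    let newValue = counter.withLockIfAvailable { (value: inout Int) -> Int in
        value += 1
        return value
    }
    return newValue != nil
}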


  1. On architectures like this, strong CEs are implemented by repeatedly performing a weak CE until it either succeeds or fails for non-spurious reasons. A failed CE still refreshes the current value of the atomic, so if your weak CE fails, you can easily check whether it failed spuriously by just comparing the new current value with the value you were trying to compare against.
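
In code, that emulation might look something like this sketch, using the Synchronization module's Atomic for illustration:

import Synchronization

// Emulating a strong compare-exchange with a weak one, per the footnote:
// retry only while the failure is spurious, i.e. while the refreshed
// value still matches what we expected.
func strongCompareExchange(
    on atomic: borrowing Atomic<Int>,
    expected: Int,
    desired: Int
) -> (exchanged: Bool, original: Int) {
    while true {
        let result = atomic.weakCompareExchange(
            expected: expected,
            desired: desired,
            ordering: .sequentiallyConsistent)
        if result.exchanged || result.original != expected {
            return result  // succeeded, or failed non-spuriously
        }
        // Spurious failure: the observed value still matched, so try again.
    }
}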

31 Likes

Excellent proposal and observation, thanks for the writeup @grynspan :slight_smile:

As someone who has had to code around the possibility of spurious failures in C++ in the past: yeah, no one wants this, especially not by accident. John has added a lot of detail already, so all I can say is that I'm very supportive of documenting and upholding this guarantee, and we'd make sure we keep that promise once the docs are adjusted :+1:

7 Likes

Very well explained! Seems like a no-brainer.

4 Likes

+1, strong proposal

2 Likes

I hesitated to say such a thing in the proposal document because, well, the C and C++ committee members are probably all a lot smarter than I am. :melting_face: But, yes, I think they made the wrong call here and that, if it was important to offer weak tryLock interfaces, they should have been separate interfaces. std::atomic's compare_exchange_weak() and compare_exchange_strong() are explicitly different for the same underlying reason.

3 Likes

+1, strongly argued and well researched proposal. In general, it's worthwhile and important for API documentation to clearly communicate its behavioral guarantees, and that's even more critical for synchronization primitives. Since Swift Testing's use of this API was a primary motivator for the proposal, I'm especially supportive.

2 Likes

I believe the committee was expecting the canonical user of try_lock to be something like std::lock(…) ([thread.lock.algorithm]), which locks any number of mutexes without deadlock, and the “backoff” step there is not expensive. See Dining Philosophers Rebooted.

try_lock is itself kind of a niche operation. I don’t think I’ve ever seen it in user code outside of implementations of std::lock, library test harnesses, or implementations of C++’s Cpp17Lockable concept.

Releasing all the locks you've successfully acquired and then waiting on a different lock — even without a back-off sleep, although I think that would be a good idea after repeated failures — is expensive. It would be foolish to do all that just because one of the try_locks failed spuriously because of a scheduling event. The benchmarking will of course not show that because the actual mutexes are platform mutexes, and as Jonathan shows in the proposal, nobody actually implements try_locks that fail spuriously.

Now, one use case I can imagine for a weak tryLock would in fact be in a std::lock implementation. You would lock one of the mutexes and then try to weak try_lock all the rest. You repeat that residual loop on the remaining mutexes until you stop making progress, where progress means successfully locking at least one new mutex. At that point, you make one last attempt to lock all of the mutexes, this time with a strong try_lock, and only if that again fails to make progress do you back all the way out and restart by locking one of the mutexes you failed to acquire. I do think it's quite unclear that a weak try_lock would be a practically useful optimization here, though. Testing for spurious failure is pretty cheap.

I think you're right that the committee was only thinking of std::lock, although I don't think that's much of a defense.

1 Like

I don’t think most people using withLockIfAvailable expect it to fail for reasons other than the mutex already being locked. I certainly don’t expect it. In that sense, “no spurious failure” is already informally part of the contract, and documenting it is a good idea.

It’s like sort, which was eventually documented to be stable, because changing the implementation to something unstable would be unthinkable without risking breaking a lot of code in subtle and unpredictable ways.

5 Likes

A key reason C and C++ allow spurious failures in try_lock has nothing to do with weak compare-exchange. The problem appears when someone writes code like this, where x is an ordinary non-atomic variable:

Thread 1:

x = 42;
lock(l);
...

Should the compiler be allowed to move the assignment of x into the critical section that starts with the lock? It's often a desirable compiler optimization; in fact, compilers sometimes do that (compilers moving unsynchronized code into a critical section is generally fine; moving code out of a critical section is definitely not fine). However, prohibiting spurious failures breaks this optimization. The problem arises if there is another thread that does the following:

Thread 2:

while (try_lock(l) == success) {
  unlock(l);
}
assert(x == 42);

Section 3 of Hans Boehm's paper on the rationale behind the C++ memory model explains this problem in detail, and why C and C++ chose to let try_lock fail spuriously.

I'm not sure what memory model Swift uses. The choices (assuming a program contains unsafe code that relies on mutexes for safety) are to either:

  1. Allow try_lock to fail spuriously.
  2. Define lock/unlock/try_lock as not always being sequentially consistent in the memory model to disallow code like this.

Either one works. Choice 2 has a significant usability cost, since almost everyone is taught that they are sequentially consistent.

I assume this is just a notation issue, but as written this would deadlock immediately. I think you may have copied the code in the paper incorrectly? (Not a criticism, just want to make sure we have maximum clarity).

This is indeed the case in Swift. In the example given, T1 assigns to x and T2 reads from it. This is effectively impossible to write in Swift 6. If the value x is mutable, it is one of the following:

  • Isolated to a single isolation domain (in which case T1 and T2 cannot simultaneously access it)
  • Atomic- or Mutex-guarded (in which case it is protected anyway)
  • Marked nonisolated(unsafe) or @unchecked Sendable (in which case the developer has explicitly opted out of concurrency safety)

If you attempted to implement the logic from the paper verbatim in Swift 6, the compiler would emit an error. So it's moot in Swift.
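
For instance (a sketch from memory; the exact diagnostic wording may vary between compiler versions):

// A mutable global is not concurrency-safe, so Swift 6 rejects the
// paper's setup outright:
var x = 0
// error: var 'x' is not concurrency-safe because it is nonisolated
// global shared mutable state

// The opt-out compiles, but the word "unsafe" is doing the work:
nonisolated(unsafe) var y = 0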

The same decision was made in other languages (as stated in the proposal)—C and C++ are the odd ones out. While neither John nor I would presume to know better than the C and C++ committees, we both feel they made the wrong decision here. But that's just an opinion, and not a condemnation. :slightly_smiling_face:

Edit: Coming back to this thread in the morning, I realized something: the original premise is flawed on its face (sorry, Mr. Boehm!), because the compiler cannot reorder the write to x after the call to lock() anyway.

lock() is or makes an opaque library call, so the compiler can't see its implementation. Two different threads can access x, so we know it's not function-local and may be referenced by lock() or by something lock() calls (e.g. lock() might call the global operator new). Moving x = 42; after lock() may therefore affect the correctness of lock() itself and violate the "as-if" rule, precluding the optimization.

While Swift 6 does limit what users can write with the safe tools we offer, that doesn’t mean that everything else is invalid. It’s a matter for the memory model.

Swift hasn’t formalized a memory model here, but personally I am reticent to commit to sequential consistency for things like locks for exactly this sort of reason — it leads to a lot of extra fences for no compelling reason. Acquire/release ordering for entering/exiting critical regions seems good enough.

It seems to me that the committees could simply say that a failed attempt to acquire the lock doesn’t synchronize, rather than creating the same formal property by permitting (even if “hopefully not”) actual spurious failures.

1 Like

Yes, you're right, the extra lock(l) in the example is a typo. The example should be:

while (try_lock(l) == success) {
  unlock(l);
}
assert(x == 42);

(I also edited the original post to fix it.)

In Swift the issue arises when a program contains nonisolated(unsafe) logic somewhere that uses mutexes for safety.

Concurrency safety is an orthogonal issue. It's nice if one can stay within the "safe" subset, but programs that contain nonisolated(unsafe) logic or interact with external languages do exist for a variety of reasons, and a compiler must compile them correctly. Unless a compiler can positively tell that try_lock is never used on a given lock in the entire program, the choice made here affects the compilation of lock and unlock of that lock, either via prohibiting optimizations or via adding extra fences.

I respectfully disagree that it's orthogonal. If you write nonisolated(unsafe) on a mutable value, you're opting out of concurrency safety, and if the result of doing that is that T2's assertion fails, well… that's what the word unsafe in there is ultimately for, innit?

Yes, Swift interacts with code written in other languages, especially C-family languages. Concurrent unsynchronized access of a mutable value is unsafe in those languages too. Perhaps the ultimate answer to Boehm's (implicit) question is not "make tryLock() worse in some way", it's just "don't do that"? :melting_face:

The confusion here seems to be between unsafe behavior and undefined behavior. They're very different.

Yes, Swift interacts with code written in other languages, especially C-family languages. Concurrent unsynchronized access of a mutable value is unsafe in those languages too. Perhaps the ultimate answer to Boehm's (implicit) question is not "make tryLock() worse in some way", it's just "don't do that"?

Concurrent unsynchronized access is undefined in C-family languages, not unsafe. Concurrent synchronized access may or may not be safe from a higher-level point of view, but it is defined.

If I understand Swift correctly, safe just means that the implementation is responsible for avoiding data races. At least in the context of this discussion, unsafe means that the user is responsible for manually synchronizing to avoid data races, and using locks is one way to perform such synchronization. As long as the user avoids data races, an unsafe program has defined behavior. It's not ok for a compiler or implementation to make a program do anything via "don't do that" just because it contains unsafe code — that would be undefined behavior.

Well, yes, fair. The terms aren't interchangeable. But assert(x == 42) has undefined behaviour in the original paper regardless of how tryLock() is implemented, because x is written to and read from without synchronization.

A mutex only guarantees sequential consistency for values it guards, and the mutex in the paper doesn't guard x (unless I've really misread it, always a possibility).

At the level of the memory model in C/C++, a mutex is just a mutex that inserts edges into the happens-before graph; there is no concept of a mutex "guarding" anything. One can define convenience classes that won't let you access certain variables unless you can hold the proper mutex, but that's just to help programmers write better code and not part of the memory model.

Sequential consistency for a set of events means that it's possible for all observers to agree on the order in which those events happened. For a mutex it applies not just to the values that a mutex guards but also to the mutex itself and means that one can put all lock, unlock, and try_lock operations of a given mutex into an order that all threads agree on.

I think we're disputing that a failed try_lock actually establishes an Acquire-Release pair with the thread that acquired the lock.

The success ordering of a lock-acquire compare-exchange should be Acquire, not Release or AcqRel. An Acquire RMW does not form a pair with a failed RMW.
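
To put the same point in code, here's a toy spinlock sketch (illustrative only, and not how the standard library's Mutex is implemented):

import Synchronization

// The lock-acquire compare-exchange uses acquire ordering on success and
// relaxed ordering on failure, so a failed tryLock synchronizes with
// nothing and forms no acquire/release pair.
struct ToySpinLock: ~Copyable {
    let state = Atomic<Bool>(false)  // false means unlocked

    func tryLock() -> Bool {
        state.compareExchange(
            expected: false,
            desired: true,
            successOrdering: .acquiring,
            failureOrdering: .relaxed
        ).exchanged
    }

    func unlock() {
        state.store(false, ordering: .releasing)
    }
}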