Is ARC (Automatic Reference Counting) thread-safe? I am curious whether ARC is thread-safe. In Rust, Arc is thread-safe.
Yes, ARC is thread-safe and always has been.
It depends on what you mean, actually. This kind of code will crash trying to decrement a reference count on an already-deallocated object. And the crashing release_dealloc call is generated by ARC.
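The original snippet isn't reproduced here, but a minimal sketch of the pattern (it matches the fuller naiveCode example later in this thread) would be:

import Foundation

class Foo {}

final class Concurrency {
    var val: Foo?
    func start() {
        // Two threads store to the same reference with no synchronization.
        DispatchQueue.global().async { while true { self.val = Foo() } }
        DispatchQueue.global().async { while true { self.val = nil } }
    }
}

let c = Concurrency()
c.start()
RunLoop.current.run(until: .distantFuture)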
That example is non-atomically reading and writing the variable val concurrently, which is a low-level data race that has nothing to do with ARC.
This is a key difference between ARC and GC — in a Java VM, if you null some reference and another thread looks at it, it will either observe a valid object pointer there or null, and there’s no race. This kind of thing might still have undesirable consequences if the stores are not ordered with other things, but it won’t violate memory safety.
ARC is thread-safe once you assume exclusive access to mutable state, which is what motivates the whole concurrency model. Independent of ARC, Swift must enforce this anyway because values can have arbitrary size.
There are lots of ways this code could go wrong, but the most straightforward I can imagine is that the blocks which set the value to nil are implemented as multiple operations:
$0 = [load val]
[store val = nil]
swift_release($0)
And the problem is that the load-then-store is not atomic, so it's possible for multiple blocks to load the same object address from val and over-release it.
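For example, one possible interleaving of two such blocks (continuing the notation above):

Thread A: $0 = [load val]     // both threads load the same object pointer
Thread B: $1 = [load val]     // $1 == $0
Thread A: [store val = nil]
Thread B: [store val = nil]
Thread A: swift_release($0)   // count may reach zero and deallocate here
Thread B: swift_release($1)   // second release of the same object: over-release, crash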
You could make it atomic by using an exchange instead, but of course you need to actually tell the compiler to do that. Would a GC still manage to handle that?
In the OP’s example, the shared location is written to but never read (semantically). Under a GC, this example would just generate unreachable objects which are periodically reclaimed; there’s nothing that needs to be atomic.
In general with a GC you can just load an object pointer and it won’t possibly get freed until the GC runs, at which point your thread is in some known state where the roots can be traced. I imagine atomic operations on GC references work like ordinary values, except you have to run the read and write barrier if you have one (but those are not atomic).
With ARC, the general idea for implementing a shared mutable reference without locking on the read side is that you need to postpone the deallocation, then check for a zero reference count after loading the reference; if it is zero, you treat the reference as if it were nil. Once all threads that were in the middle of that check have advanced past it, you can free the memory.
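A rough sketch of that read-side check, assuming the swift-atomics package and a hypothetical object with a manually managed strong count (illustrative only: a real implementation operates on raw memory that may be mid-deallocation, and the "free only after all readers have passed the check" part, e.g. an epoch or hazard-pointer scheme, is elided here):

import Atomics

final class Node {                            // hypothetical object
    let strongCount = ManagedAtomic<Int>(1)   // hypothetical manual refcount
}

// Try to take a +1 reference; false means the count already hit zero and
// the caller should treat the loaded reference as if it were nil.
func tryRetain(_ node: Node) -> Bool {
    var current = node.strongCount.load(ordering: .relaxed)
    while current > 0 {
        let (exchanged, original) = node.strongCount.compareExchange(
            expected: current,
            desired: current + 1,
            ordering: .acquiring)
        if exchanged { return true }   // reference successfully acquired
        current = original             // lost a race; retry with fresh value
    }
    return false   // object is already being torn down
}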
I see ARC-related stuff at the point of crash (release, refCount):
> bt
* thread #2, queue = 'com.apple.root.default-qos', stop reason = EXC_BAD_ACCESS (code=1, address=0xbeaddead8000)
frame #0: 0x000000019bf73370 libswiftCore.dylib`_swift_release_dealloc + 32
frame #1: 0x000000019bf74108 libswiftCore.dylib`bool swift::RefCounts<swift::RefCountBitsT<(swift::RefCountInlinedness)1>>::doDecrementSlow<(swift::PerformDeinit)1>(swift::RefCountBitsT<(swift::RefCountInlinedness)1>, unsigned int) + 132
frame #2: 0x0000000100004cc8 Cons`Concurrency.val.setter at <compiler-generated>:0
* frame #3: 0x00000001000052b0 Cons`closure #1 in Concurrency.start(self=0x0000600000203680) at main.swift:49:26
frame #4: 0x00000001000052ec Cons`thunk for @escaping @callee_guaranteed @Sendable () -> () at <compiler-generated>:0
and I wasn't able to make it crash without involving reference types. Looks very ARC-related to me.
Do you mean using some new swift_exchange that handles this situation correctly? retain and release calls are themselves "atomic"; it's just that we have a sequence like the load/store/release above which is not, but it looks like it could be fixed by introducing some new swift_exchange aggregate call that would perform this sequence atomically.
Atomicity is a property that must be satisfied across the set of all possible operations. Assignments of ref-counted object references can indeed be done locklessly with a simple atomic exchange, but there is no matching algorithm for loads that works correctly with that, at least not with standard atomic primitives. (You can do it with even a very small amount of transactional memory, though.)
Tracing GCs use several different techniques to solve that, and it's a very active research area. There are several related problems they also have to solve in order to achieve the high-level goal of memory soundness even in the face of improperly-ordered accesses by the program. I much prefer the Swift/Rust approach of just ruling that out statically.
This seems to work ok (ignoring the warning):
// helper
func exchange<T: AnyObject>(_ address: UnsafeMutablePointer<T?>, _ new: T?) {
    // T is constrained to AnyObject so that T? is a single pointer-sized
    // word which can be reinterpreted as Int for the C atomics below.
    let address = unsafeBitCast(address, to: UnsafeMutablePointer<Int>.self)
    // Warning: 'unsafeBitCast' from 'UnsafeMutablePointer<T?>' to 'UnsafeMutablePointer<Int>'
    // changes pointee type and may lead to undefined behavior; use the 'withMemoryRebound'
    // method on 'UnsafeMutablePointer<T?>' to rebind the type of memory
    let new = unsafeBitCast(new, to: Int.self)
    if new != 0 { myretain(new) }       // retain the incoming object first
    let old = myexchange(address, new)  // atomically swap the stored word
    if old != 0 { myrelease(old) }      // release whatever was stored before
}
// BridgingHeader.h
long myexchange(long * _Nonnull address, long new);
void myretain(long value);
void myrelease(long value);
// CFile.m
#include <stdatomic.h>
#include <stdbool.h>
extern void swift_retain(long);
extern void swift_release(long);

// Note: the bridging header deliberately declares 'address' as a plain
// 'long *', since Swift cannot import the _Atomic qualifier.
long myexchange(_Atomic(long) * address, long new) {
    long old;
    while (true) {
        old = *address;  // atomic load
        bool done = atomic_compare_exchange_strong(address, &old, new);
        if (done) break;
    }
    return old;
}

void myretain(long value) {
    swift_retain(value);
}

void myrelease(long value) {
    swift_release(value);
}
// example
import Foundation

class Foo {}

let naiveCode = false

final class Concurrency {
    var val: Foo?
    func start() {
        DispatchQueue.global().async {
            while true {
                if naiveCode {
                    self.val = Foo()            // plain ARC store: data race
                } else {
                    exchange(&self.val, Foo())  // atomic swap via the helper
                }
            }
        }
        DispatchQueue.global().async {
            while true {
                if naiveCode {
                    self.val = nil
                } else {
                    exchange(&self.val, nil)
                }
            }
        }
    }
}

print("start")
let c = Concurrency()
c.start()
RunLoop.current.run(until: .distantFuture)
With the naiveCode variable set to true, this example crashes almost immediately. Set it to false and it seems to work correctly (no crashes, and memory consumption doesn't grow). Lots of cast trickery to get ARC out of the way at first, and then redo its work with low-level primitives.
There is a much cheaper implementation available if the only operation you have to support is a store.
Please point us to that.
Could you also share your opinion: is this something Swift could fix, or is it a case of "nothing is broken"?
I'm on the fence:
- On one hand, "crashing is bad", and half of me says "the above fragment should work out of the box without crashing".
- On the other hand, if we change from Foo to Bar:
struct Bar {
    var field: Int
    var otherField: Int
}

final class Concurrency {
    var val = Bar(field: 0, otherField: 0)
}
and change "val" from two different threads without protection, while it won't crash right away into your face ("full concurrency checking" debugging facility aside), you'd be able to see the effect of Bar written/read non atomically (e.g. half of it would be from one writeup and another from a different writeup), so in a way this is even worse than crashing as it's not immediately obvious, and another half of me saying: "there's nothing broken here to fix" and "you must use locks or whatever when reading and wringing the field, be it a reference Foo or a value Bar".
- On the third hand (sic): if I do need to protect read/write accesses to the reference (and non-reference) fields anyway... why bother having the reference-count accesses protected inside the references' retains/releases? Couldn't the refcount adjustment be just a normal, unprotected variable access?
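A sketch of the tearing scenario from the second point (names like Holder are illustrative; whether a torn value is actually observed depends on codegen and platform):

import Foundation

struct Bar {
    var field: Int
    var otherField: Int
}

final class Holder {
    var val = Bar(field: 1, otherField: -1)  // invariant: otherField == -field
}

let h = Holder()
DispatchQueue.global().async {
    while true { h.val = Bar(field: 1, otherField: -1) }
}
DispatchQueue.global().async {
    while true { h.val = Bar(field: 2, otherField: -2) }
}
DispatchQueue.global().async {
    while true {
        let copy = h.val  // unsynchronized read: may mix halves of two writes
        if copy.otherField != -copy.field {
            print("torn read:", copy.field, copy.otherField)
        }
    }
}
RunLoop.current.run(until: .distantFuture)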
The code is invalid and should be rejected (or maybe it’s more like UnsafePointer etc, where dispatch APIs cannot be annotated sufficiently to rule this out?), just like we reject static type errors instead of attempting to implement a runtime model which allows such code to execute.
No, because two threads can hold a reference to the same object in the heap. The only case that is unsafe is when two threads access a reference stored at the same location.
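To illustrate the safe case (a hedged sketch; in practice the compiler may optimize away some of the retain/release traffic):

import Foundation

class Foo {}

let shared = Foo()  // a single heap object

// Each thread copies the reference into its own, distinct storage location.
// Only the object's reference count is contended, and the retain/release
// operations on it are themselves atomic, so this is safe.
for _ in 0..<4 {
    DispatchQueue.global().async {
        while true {
            let local = shared  // retain (+1, atomic)
            _ = local
        }                       // release (-1, atomic)
    }
}
RunLoop.current.run(until: .distantFuture)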
Something else is needed, as the following easily breaks the preconditions. Here, two storage locations are used, both pointing to the same object:
import Foundation

class Foo {
    var a = 1
    var b = -1
    func foo() {
        precondition(b == -a) // 💣
        a += 1
        b = -a
        precondition(b == -a) // 💣
    }
}

final class Concurrency {
    var val1: Foo = Foo()
    var val2: Foo!
    func start() {
        val2 = val1  // a second stored reference to the same object
        DispatchQueue.global().async {
            while true {
                self.val1.foo()
            }
        }
        DispatchQueue.global().async {
            while true {
                self.val2.foo()
            }
        }
    }
}

print("start")
let c = Concurrency()
c.start()
RunLoop.current.run(until: .distantFuture)
Here, you're accessing the stored properties of the shared class instance, not just its reference count. No additional mutual exclusion is needed to share the reference, but if you want to maintain a counter or something inside the class instance, you'll need atomic operations there (or take a lock before accessing the contents).
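For instance, a counter inside the class instance could be an atomic field (a sketch assuming the swift-atomics package; names are illustrative):

import Atomics

final class Stats {
    // The reference to a Stats instance can be shared across threads freely;
    // the mutable field itself is made safe for concurrent use with an atomic.
    let hits = ManagedAtomic<Int>(0)
    func record() { hits.wrappingIncrement(ordering: .relaxed) }
}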
Once again I think everything boils down to exclusive access to shared state, which is exactly what the Swift 6 concurrency model aims to enforce.
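For completeness, a minimal sketch of expressing that exclusivity with the concurrency model (an actor serializes all access to its state):

actor SafeBox {
    var val: Int = 0
    func update(_ newValue: Int) { val = newValue }  // runs on the actor
}
// Callers hop onto the actor, so accesses can never overlap:
//   await box.update(42)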