Understanding Swift's value type thread safety

The resulting ‘let’ is still reading from memory that is being mutated concurrently. It may read off the end of the Array, or read a torn value that is mid-update.

This all stems from the fact that the same memory was held as a ’var’ elsewhere in the program, which is the point I was making. Just because your thread holds a ‘let’, doesn’t mean it’s safe.

Yes but you created the let by violating thread safety, of course it's going to be horribly broken. I just think that if you violate thread safety to create an object, you can throw all thread-safety assumptions that you normally would make about that object out the window.

If you don't agree, then we can formulate the rule as "any let of a value type that was created without violating thread safety is safe to access from multiple threads at once"

1 Like

You don’t need to violate thread safety to create anything.

How else would you have a let sharing memory with a var that's being modified?

var under modification: Write
Creating let: Read
If both of those happen from separate threads, that's a thread safety violation, right?
If both of those happen from the same thread, that's a violation of the law of exclusivity that should be caught by the runtime enforcement

1 Like

@Karl, actually, if you follow the "Swift rules" COW should provide atomicity.

Any thread that attempts to write will first copy and hence modify separate memory. (Edit: Except the last one, of course :)

If you have a way to break this, please share an example :+1:

1 Like

Not necessarily, isKnownUniquelyReferenced doesn't handle race.

Further related thread: isKnownUniquelyReferenced thread safety

3 Likes

Not necessarily, isKnownUniquelyReferenced doesn't handle race.

But then you are not following the "Swift rules".

If you have two values that reference the same object, then a race on the references inside the values are OK, as they are different references to the same object. So they contain the same bits, but they are separate references.

So isKnownUniquelyReferenced(&a.ref) racing with isKnownUniquelyReferenced(&b.ref) is all fine, but isKnownUniquelyReferenced(&a.ref) racing with isKnownUniquelyReferenced(&a.ref) is of course not, and would violate the exclusivity rule.

As long as you follow the simple Rule of Thumb that I outlined above, isKnownUniquelyReferenced() can be considered atomic (with the documented exceptions like weak, and possibly returning false even if unique).

This really helps crystallize the mechanics of these scenarios. It's unfortunate how easy it is to (silently) lose the safety that value types provide when handing them off to an escaping closure. It makes sense that it behaves this way but the lack of explicit signalling required is a potential footgun. shared/__block would help but I wonder if something at the point of use/capture would be even better (similar to explicit self in a closure).

Is a mutation in an escaping closure required to trigger the storage change or will any access also cause the change? I'm experimenting with reading in the closure and mutating in the function scope and thread sanitizer complains.

    // No crash
    // Thread sanitizer: Swift access race on store.append("\(i)")
    func testScenarioD1() throws {
        let iterations = 1_000_000
        var store = "hello"
        DispatchQueue.global().async {
            for _ in 1...iterations {
                _ = store
            }
        }
        for i in 1...iterations {
            store.append("\(i)")
        }
    }

Semantically, implicit captures (i.e., when you don't use an explicit capture list [store] at the top of the closure) always happen by reference to the named variable (even if the variable is never mutated in that closure). As a simple example:

func getFuncs() -> (increment: () -> Void, print: () -> Void) {
    var x = 0
    return ({ x += 1 }, { Swift.print(x) })
}

let (increment, print) = getFuncs()

print() // 0
increment()
increment()
print() // 2

Even though the closure returned as print does not modify x, it must reference some sort of boxed storage because increment does modify x, and the implicit capture of x in print must semantically happen by reference.

I would hazard a guess that in some cases where the compiler can prove that a captured variable does not actually get modified, or that a closure does not actually escape, you may not actually see the creation of a heap-allocated box in the compiled program, but only because the program can behave "as if" the capture did happen by reference without that overhead.

2 Likes

Unless I'm misunderstanding what you mean by static enforcement, the problem is we don't want to simply outlaw it. You need a way to say “I know what I'm doing; go ahead and make this thing shared.”

1 Like

Uh, careful; that could be interpreted two different ways.

To be more precise:

  • isKnownUniquelyReferenced does not prevent races on a single instance holding a reference to shared storage, which are still illegal.
  • Properly used, it does prevent races on shared storage from occurring due to concurrent access to distinct instances sharing that storage.

i.e. what they said.

Am I missing some context needed to understand what you're saying here?

In swift, an instance declared with let is always truly immutable. I don't know what you mean by “a read-only handle to Array storage”; the only way to get access that presents itself as read-only while allowing writes from other code in Swift is via unsafe constructs like withUnsafeBufferPointer or implicit conversion of &someArray to UnsafePointer.

3 Likes

Quotation of the week:

the lack of explicit signalling required is a potential footgun. shared / __block would help but I wonder if something at the point of use/capture would be even better (similar to explicit self in a closure).

I thought of reasons why that might be better (e.g. [shared store] in the capture list) before my coffee but now I can't retrieve them. :wink: Why do you think so?

Downsides:

  • When there are multiple closures in the function body, the variable is boxed once per declaration site, not once per closure. Not sure what it would mean to notate it just one of two closures.
  • That's a lot of syntactic weight to add to the closure, especially xN closures.

Of course I'd also like all global and static vars to be marked shared (0.5 x :wink:).

Yes, good point any access causes the change, thanks for asking.

You can of course do that with by explicitly capturing the variable, which creates a distinct copy, but then of course the value is not shared with the function body:

        DispatchQueue.global().async {  [store] in // <=== HERE
            for _ in 1...iterations {
                _ = store
            }
        }

or, simplified:

var store = 0
let f = { [store] in store }
store += 1
print(f()) // prints 0

If you actually want to share and mutate the value across concurrent computations, as others have pointed out, you need synchronization.

1 Like

In general I meant that the compiler should be able to determine when such an attribute would be needed, similar to @escaping (or __block I think). More specifically for the thread safety story, and depending on what the attribute is really for (indication to the user or compiler?), static race analysis could help provide a better indication of when the captured value is actually mutated and whether it was done concurrently. Additionally, I think would make more sense as a capture attribute rather than a declaration attribute.

Agreed. I wasn't proposing introducing a totally optional keyword that has no effect on anything, so I hope you can appreciate why I might have been confused.

Additionally, I think would make more sense as a capture attribute rather than a declaration attribute.

Why? Please see my questions about that here.

I was thinking it would be more specific than every shared reference, but if you're proposing it's needed even to read the value in a closure, that seems very annoying. Given the prevalence of closures in Swift API design, wouldn't that become rather verbose? Though I suppose it would be less painful if most escaping closures are eventually replaced with async functions.

I'm proposing that for a var to be accessed from an escaping closure (or escaping local function), it needs to be declared shared var. I think that's a very small price to pay for the increased clarity.

2 Likes

I was thinking it would be preferable to have some indication closer to point of use since that is where the danger is ("locality of danger"? :slight_smile:). Similar to explicit self in closures ("are you sure you want to retain self here?") or force unwrapping ("are you sure this isn't nil here?"), it'd be nice to have something asking "are you sure it's safe to access this here?" when the storage is shared since it is no longer guaranteed to be safe.

var store: Int = 0
DispatchQueue.concurrentPerform(iterations: 1_000_000) { i in
    store = i // Error/Warning: Potentially unsafe assignment to `store`.
    _ = store // Error/Warning: Potentially unsafe access of `store`.
}

And the warning/error would be squelched with some acknowledgement (e.g. parentheses around shared var in this example). Though it may be preferable to move the acknowledgement to the capture list somehow (e.g. [shared store]).

var store: Int = 0
DispatchQueue.concurrentPerform(iterations: 1_000_000) { i in
    (store) = i // No warning. Still unsafe.
    _ = (store) // No warning. Still unsafe.
}

I may be misunderstanding what you mean but I wasn't thinking of a capture list annotation as dictating the storage behaviour, but instead granting access to the storage. The closure without the annotation wouldn't allow (or would warn on) access to shared storage.

Yeah, but here is everywhere. If there's a race, it's just as unsafe inside as outside the closure. IMO-ly, acknowledging every use would make usage pretty awful, and parentheses are too implicit an acknowledgement.

Yeah, but when you have to repeat the same annotation over and over for essentially the same reasons, it just becomes annoying noise. Doesn't seem (to me) like a good user experience.

Yes that's true. Limiting the acknowledgement to escaping closures would at least help identify the causes(s) of the potential danger. The accesses outside of the closure are still unsafe (and unacknowledged) but any race would involve at least one access that was acknowledged as unsafe. Not great but a bit better.

Fair enough. Just for illustration purposes :slight_smile:

Absolutely. I'd be interested to know how frequently these annotations would be required in existing code, and how many of those cases already use synchronization to access the values safely. Perhaps it wouldn't end up being too noisy.

It would be irritating to have to acknowledge access when steps have already been taken to synchronize access, though. Ideally the compiler would refrain from requiring acknowledgement when safety could be guaranteed (e.g. when an escaping closure is guaranteed to only be invoked on the same queue as the rest of the accesses, etc).

Why not use the inout keyword on the closure argument, just like we do on function arguments? It is the exact same situation, is it not?