isKnownUniquelyReferenced thread safety

I'd like clarification on the documentation for isKnownUniquelyReferenced. It says:

If the instance passed as object is being accessed by multiple threads simultaneously, this function may still return true. Therefore, you must only call this function from mutating methods with appropriate thread synchronization. That will ensure that isKnownUniquelyReferenced(_:) only returns true when there is really one accessor, or when there is a race condition, which is already undefined behavior.

I don't really understand what this means.


I followed this Twitter conversation (Feb 2019) between Drew McCormack and @Joe_Groff (also nicely summarized by @mjtsai), in which Joe seems to say that isKnownUniquelyReferenced is thread-safe because it takes its argument inout and inout carries an exclusivity guarantee:

isUnique takes its argument inout intentionally to ensure this isn't a problem. Swift's inout requires exclusive access to the memory passed in, so by the time you have a local copy, it must be in a separate memory location with its own strong reference

And:

In other words, because of the inout exclusivity guarantee, isUnique returning true also implies that your thread is the only thread that can see the one outstanding reference

@Joe_Groff I did not understand what you meant by this until I started writing this post. My original interpretation was that if two threads pass the same reference to isKnown… at the same time, it would be an exclusivity violation, but that can't be right, because that's a situation that isKnown… should definitely handle.

I now think I conflated the variables containing the reference with the reference itself. New interpretation: as long as the two threads use separate variables to pass to isKnown…, everything is fine because memory exclusivity applies to variables (i.e. locations in memory). That both variables contain a pointer to the same object is unrelated. Is this interpretation correct?


Having established (I hope) that isKnown… is thread-safe, I still wonder what the documentation comment I quoted above hints at. Does it talk about the situation when two threads call isKnown… simultaneously with the same variable (i.e. a race condition on the variable)? If so, this is a programmer error that I understand.

When the documentation talks about "the instance passed as object", does it refer to the variable or the object the variable points to?


PS: I also found this Dec 2017 thread that essentially asks the same question: Data races with copy-on-write. But the "bug" reported in that thread (SR-6543) turned out to be a thread sanitizer issue.


You're right, this comment is confusing. I don't know what it means either, but I think I know what it's getting at.

If two threads have unsynchronized access to the same reference, then they could both confirm uniqueness simultaneously and proceed to mutate the same referenced class instance. That would be a race condition on the variable holding the reference. If each thread accesses the object via separate variables, then they each have a copy of the reference and there is no race.
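To make the variable-versus-object distinction concrete, here is a minimal single-threaded sketch. `Box` and `uniquenessDemo` are hypothetical names made up for illustration. The second variable stands in for the copy of the reference that another thread would hold.

```swift
// `Box` is a hypothetical class, used only for illustration.
final class Box { var value = 0 }

func uniquenessDemo() -> (Bool, Bool, Bool) {
    var a = Box()
    // Only one variable refers to the object so far.
    let aloneBefore = isKnownUniquelyReferenced(&a)

    // A second variable now holds its own strong reference to the same object.
    var b = a
    let aShared = isKnownUniquelyReferenced(&a)
    let bShared = isKnownUniquelyReferenced(&b)

    // Keep both variables alive past the checks so neither reference
    // is released early.
    withExtendedLifetime(a) {}
    withExtendedLifetime(b) {}
    return (aloneBefore, aShared, bShared)
}

print(uniquenessDemo())
```

Once `b` exists, neither variable reports uniqueness, so a CoW type checking through either one would copy before mutating rather than write to the shared object.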

In the reported bug, SR-6543, that race would happen if the dispatched code blocks all referred to the same variable.

Properly synchronized, shared access to a CoW reference would look like this:

T1: confirm uniqueness
T1: mutate buffer
T1: release-memory-fence
T2: acquire-memory-fence
T2: confirm uniqueness
T2: mutate buffer

All mutation in T1 "happens before" mutation in T2.
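One way to get that ordering in practice is to guard the shared variable with a lock, since unlocking and then locking gives the release/acquire pairing. The following is only a sketch under that assumption. `Storage`, `Numbers`, `LockedNumbers`, and `runSynchronizedAppends` are hypothetical names, and `NSLock` is just one of several synchronization primitives that would work.

```swift
import Foundation

// Hypothetical buffer class and CoW wrapper, for illustration only.
final class Storage {
    var values: [Int]
    init(values: [Int] = []) { self.values = values }
}

struct Numbers {
    private var storage = Storage()
    init() {}

    var count: Int { storage.values.count }

    mutating func append(_ value: Int) {
        // Confirm uniqueness; copy the buffer first if it is shared.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(values: storage.values)
        }
        // Mutate the (now uniquely referenced) buffer.
        storage.values.append(value)
    }
}

// A single shared variable; every access goes through the lock.
final class LockedNumbers {
    private let lock = NSLock()
    private var numbers = Numbers()

    func append(_ value: Int) {
        lock.lock()
        defer { lock.unlock() }
        numbers.append(value)
    }

    func snapshot() -> Numbers {
        lock.lock()
        defer { lock.unlock() }
        return numbers
    }
}

func runSynchronizedAppends() -> Int {
    let shared = LockedNumbers()
    // Several threads mutate the same variable, one at a time.
    DispatchQueue.concurrentPerform(iterations: 100) { i in
        shared.append(i)
    }
    return shared.snapshot().count
}

print(runSynchronizedAppends())
```

Without the lock, the uniqueness check and the mutation in `append` could interleave across threads, which is the race on the variable described above.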


Regarding 'let' vs. 'var', the only difference is that you can't call mutating methods on a 'let' variable--you can't have a race without any modification at the source level.

Regarding inout and exclusivity, I don't see any connection. Except as follows...

[EDIT] The @inout convention (along with a special compiler Builtin) prevents the optimizer from reusing the same reference for other variables it could otherwise prove reference the same object--that optimization would defeat CoW. (thanks @Joe_Groff).

[EDIT2] There is one other reason that the reference needs to be passed @inout. As John alluded to below, TSAN needs to see isKnownUniquelyReferenced as a mutation, otherwise the following race would not appear to be a race:

T1: isKnownUniquelyReferenced(o)
T2: retain(o)
T2: release-fence(o)
T1: acquire-fence(o)
T1: mutate(o)


This wouldn't do what the check is intended for, though, since it would potentially let you modify a value that's passed "by value" at +0 to you if there happens to be only one reference to it. It's really amIKnownToOwnTheUniqueReference. If you received an argument @owned, you could conceivably ask the question of that reference, since you'd have ownership of the one remaining reference at that point.


It's also worth noting that just as isKnownUniquelyReferenced does not take weak and unowned references into account, it cannot check for addresses that have "escaped" into UnsafePointers (or Unmanaged). The latter doesn't have "unsafe" in its name, so it's probably worth documenting that too.
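A small sketch of both blind spots, assuming a hypothetical `Node` class and a made-up function name. A weak reference and an unretained `Unmanaged` handle both point at the object, yet the check still reports uniqueness.

```swift
// `Node` is a hypothetical class, used only for illustration.
final class Node {}

func weakAndUnmanagedDemo() -> (Bool, Bool) {
    var strong = Node()

    // A weak reference does not count towards uniqueness.
    weak var observer = strong
    let uniqueDespiteWeak = isKnownUniquelyReferenced(&strong)

    // Neither does an unretained Unmanaged handle to the same object.
    let raw = Unmanaged.passUnretained(strong)
    let uniqueDespiteUnmanaged = isKnownUniquelyReferenced(&strong)

    // Touch both handles after the checks so they are not unused.
    _ = (observer, raw)
    return (uniqueDespiteWeak, uniqueDespiteUnmanaged)
}

print(weakAndUnmanagedDemo())
```

Any code that later reads through `observer` or `raw` would see mutations made under the assumption of uniqueness.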


isKnownUniquelyReferenced tells you whether the given variable holds a non-nil and unique reference to its current referent object.
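As a rough illustration of the "non-nil and unique" part, the standard library also provides an overload taking an optional reference. `Token` and `optionalDemo` below are hypothetical names.

```swift
// `Token` is a hypothetical class, used only for illustration.
final class Token {}

func optionalDemo() -> (Bool, Bool, Bool) {
    var slot: Token? = nil
    // The variable holds nil, so there is no referent to be unique.
    let whenNil = isKnownUniquelyReferenced(&slot)

    slot = Token()
    // The variable now holds the only strong reference.
    let whenSole = isKnownUniquelyReferenced(&slot)

    // A second strong reference to the same object.
    let other = slot
    let whenShared = isKnownUniquelyReferenced(&slot)

    // Keep `other` alive past the last check.
    withExtendedLifetime(other) {}
    return (whenNil, whenSole, whenShared)
}

print(optionalDemo())
```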

isKnownUniquelyReferenced requires, as a precondition, that there be no racing modifications of the variable. That precondition does not need to be repeated in the documentation because it is true for every inout argument in Swift: as one aspect of the general prohibition on data races, if a variable is passed as an inout argument, then all other accesses to it must either happen-before the start of the call or happen-after the end of the call. Swift enforces that rule up to a certain point by default, and it enforces it a few steps further under TSan (or a few steps less if you turn off exclusivity checking), but ultimately it is your responsibility as a programmer to satisfy it. It is a goal of the eventual concurrency design of Swift that code which obeys certain rules globally will be guaranteed to not have data races, but we don't have that design yet, and even when we do, I don't foresee us ever completely banning raw uses of threads.

isKnownUniquelyReferenced does not need to be ordered with accesses to other variables that hold references to the variable's referent. If there are no races on the given variable, and there are no weak or unowned references to the variable's current referent, then if isKnownUniquelyReferenced returns true it is guaranteed that all accesses to the referent must be through the given variable. The converse is of course not true: if isKnownUniquelyReferenced returns false, it is still possible that there will be no further accesses to the referent through other references — for any number of reasons, races being probably the least significant.

isKnownUniquelyReferenced must take its argument inout because that is the only way in Swift today to abstractly refer to a variable. Other arguments are always semantically values, and "is this value the only reference to this object?" isn't a good question to ask because the answer is useless unless you've got another reference to the object that you can use afterwards, in which case presumably the answer to the question ought to be "no, because you've got another reference right over there". That's why the question has to be about whether a specific variable is the only way to refer to the object. Now, isKnownUniquelyReferenced could take an immutable reference to a variable, if we had such a thing, but we don't.


@John_McCall Thanks a lot, this clears things up for me.

[Aside, now that the question has been answered…]

I don't think the immutable reference would actually be sufficient either (depending on how you interpret "immutable reference"). Consider a case where a variable is passed by value and by immutable reference to a function. Since retainable pointers* are passed at +0 now by default, a single owned variable could back both the by-value and by-immutable-reference arguments without any additional retains, and then isKnownUniquelyReferenced would incorrectly conclude that there's only one reference.

Actually, what keeps isKnownUniquelyReferenced from returning true in this case?

func test() {
  var variable: AnyObject = …
  assert(isKnownUniquelyReferenced(&variable))
  func helper(_ value: __shared AnyObject) {
    if isKnownUniquelyReferenced(&variable) {
      variable.foo = 1
      print(value.foo)
    }
  }
  helper(variable)
}

Is it just because we don't expose bare references to the innards of Array that we can get away with this?

* We'd normally call these "object references" but I don't want to overload "reference" in this paragraph.

This would require an extra retain, since consuming the +1 argument would potentially destroy the +0 argument otherwise.

Likewise, in order for the access to &variable inside helper to not be an exclusivity violation, variable would have to have been copied when passed in.

It's going to be passed as guaranteed…which means the caller is doing a borrow for the duration of the call, which means a copy was made just in case someone writes this very code. Okay, got it.

I agree with this one. For the one with a borrowed variable and a borrowed value, I think it'd be possible for us to make the semantics work that way, but I think in practice it would be bad policy: it would prevent us from eliminating a lot of copies and/or fabricating arguments from borrows of other values, both of which are very valuable optimizations. So I disagree with myself from earlier: while we could allow isKnownUniquelyReferenced to take an immutable reference to a variable, we don't actually want to make the implementation-level guarantees which would allow it to have sensible semantics if it did.

Right. We emit the copy, and in theory we can easily eliminate it in most cases, but this is one of the cases where we can't.

I think this one's still "incorrect". The by-value argument is +0 and that +0 can be backed by the same borrow as the "immutable reference" without violating any rules.

It's probably fine, though, since if you ask isKnownUniquelyReferenced at this point, you can't act on it until you're back to one mutable reference or until you yourself violate it by escaping a copy somewhere. No, this is nonsense; you modify things inside the reference. John said it better than me anyway.

More than that, it seems fundamentally impossible to me. This is a valid implementation of such a function:

func foo(a: __owned X, b: __shared X) {
  consumingUse(a) // destroys a
  borrowingUse(b) // we can still use b
}

If we passed the same argument to a and b without a copy, the consumingUse of the owned argument would invalidate the value while still being borrowed by b.

The lifetime of the value argument can be ended by consumption while the borrow is still valid.

You can't consume a +0 value; that itself requires a copy.

I agree with you, except that I don't think Jordan meant "owned argument" when he said "value argument".


I may be misunderstanding you, then. When you say "by value" I assume you mean +1. For all existing types today, there's no semantic difference between passing a +0 by value or by reference.


The semantic difference between "+0 pass-by-value" and "pass-by-reference" is that the following code prints "5" vs. "6":

func f(sh: __shared Int, f: () -> ()) {
  f()
  print(sh)
}

var x = 5
let increment = { x += 1 }
f(sh: x, f: increment)

I don't know what "pass-by-reference" means to you, but I think almost everyone would consider it equivalent to C++ 'const &' in this context.

You can't claim that it's really "pass-by-reference-with-implicit-copy" because that's the same thing as saying "pass-by-value".

I think most of the posts above would be avoided if people would realize that our misnamed __shared keyword does not mean that the argument is "shared", nor does it mean that it is "borrowed".

It is still passed-by-value using an "unowned" or "nonconsuming" calling convention which makes it ABI compatible with an argument that happens to be a "borrowed" variable, when such a thing exists:


I'm taking for granted that we'll maintain a world where reference arguments are always governed by ownership, so pass-by-reference means const &__restrict__, and your example either doesn't compile or implicitly copies. I just misunderstood Jordan; I'm not trying to relitigate the discussion we had in that previous thread.