Solving the mutating slice CoW problem

I haven't ingested all of the semantics, but please be aware that you are using APIs on Unmanaged that insert retains and releases that cannot be eliminated by the optimizer and are treated rather like an arbitrary unknown function call.

Yes, I am aware. If you look back in the revision history there’s a version that doesn’t use Unmanaged; it’s just ugly and harder to understand.

For what it's worth, I've made some slides of what happens on the “happy path” of this scheme. I don't want to pre-emptively make a whole lot more of these slides, but as people ask questions I can update them as necessary (including covering “unhappy paths”).

8 Likes

We could ask whether ARC could get smart enough to cause trouble for this scheme, but essentially it didn't get too smart for the existing CoW semantics because we decided we were supporting those semantics.

The semantics required for CoW are formalized in the language:

  • A local object is released after its last use
  • An 'owned' AnyObject argument releases one reference
  • An 'inout' AnyObject argument may release one reference

I think that's it. isUniquelyReferenced only works because the compiler thinks it can release its argument.

There's no formalism that allows the implementation to track the number of variables that may have referenced an object. The compiler will break any such assumptions in entirely unpredictable ways. It's not like we have a set of discrete optimizations, X, Y, and Z, and if we just turn off Z we'll be safe.

The only sure-fire way to ensure an object has two references is to pass two distinct inout addresses, each of which is known to hold the same object reference. In this case, you would need both the ArrayBuffer and the SliceBuffer's address at the same time.

So, some amount of language support is needed. If not a language feature, then at least a builtin, possibly specific to the Array implementation. Without it, this is what will eventually happen (if you don't mind my pseudo-code):

var array: inout

wasUnique = isUniquelyReferenced(&array) // true

var rhs = ArraySlice(array._buffer)     // rhs refcount never mutated
                                        // retain/release elided

if wasUnique {                          // true
  array._storage.isBorrowed = true
}

yield &rhs
  
  var sliceCopy = rhs
  sliceCopy[x] = y                     // Thinks it's borrowing
                                       // since refcount==2

// end yield  

if wasUnique { array._storage.isBorrowed = false }

Another broken scenario would be a read-only sliceCopy that escapes.

We could force ArraySlice to always retain its storage. This might have noticeable performance implications--we could devise an experiment to find out.

Assuming the performance is acceptable, I don't know how to actually implement that with existing APIs--we can't use Unmanaged for the ArraySlice without struct-destructors. It would be sufficient to add an opaque builtin and SIL instruction that simply takes AnyObject and produces a new one without the compiler knowing that they are in fact the same reference.
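To make the shape of that concrete, here is a hedged sketch of what such an opaque copy might look like in plain Swift. `opaqueCopy` and `Storage` are hypothetical names, and `@inline(never)` only approximates the opacity being asked for: under whole-module optimization the compiler can still analyze a visible body, which is exactly why a genuine builtin and SIL instruction would be needed.

```swift
// Hypothetical sketch: an "opaque copy" that returns the same reference
// at run time, while (ideally) the compiler must treat the input and
// output as unrelated references. A real builtin would make this
// guarantee airtight; @inline(never) merely gestures at it.
final class Storage {}

@inline(never)
func opaqueCopy(_ object: AnyObject) -> AnyObject {
    return object // identity at run time
}

let original: AnyObject = Storage()
let copy = opaqueCopy(original)
// If the compiler could not see that `copy` and `original` are the same
// object, the retain backing `copy` could never be elided.
assert(copy === original)
```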

The complexity of doing that probably needs to be weighed against some sort of copy-on-escape support, which would have both the performance and semantic properties that we want.

Sorry, but that can't possibly be it, because it never mentions retains! For the CoW optimization (not copying uniquely-referenced buffers) to work, something needs to guarantee that references are retained, or it would break things like this:

let a = [1, 2, 3]
var b = a
b[0] = 99 // if there isn't a retain for b, this writes on a
assert(a != b)

isUniquelyReferenced only works because the compiler thinks it can release its argument.

I think you mean, “Using isUniquelyReferenced to avoid a copy only works because the compiler…”. A crucial element of isUniquelyReferenced “working” is that it reports no false positives, and that has to do with retains.

But that aside, I'm afraid I don't understand what you're getting at with the rest of that sentence (“thinks it can release its argument”). Partly that's due to ambiguous writing: do “it” and “its” refer to the same thing (isUniquelyReferenced), or does “it” refer to the compiler?

Furthermore, isUniquelyReferenced can be implemented using ObjectIdentifier, which makes the issue of whether it can release its argument moot. If what you mean is that the check only succeeds because the compiler doesn't insert needless retains on an inout argument, then I agree with that (and it's missing from your list of formalisms)—but that doesn't sound like what you're saying.

If you really did cite all the relevant formalisms, the formalisms are insufficient to guarantee that CoW works today, in at least the two ways that I've pointed out (and will repeat in bullets below).

Regardless, nothing I've done or written is predicated on tracking the number of variables that may have referenced an object, beyond assumptions that already have to be true for CoW to be working today, namely:

  1. An object's reference count will never be 1 unless there's only one variable (whose lifetime could not have ended already) referencing it. That guarantees we preserve value semantics while still doing the CoW optimization.
  2. When an existing lvalue is passed inout, no needless retains are added. That makes it possible to detect the unique reference and do the CoW optimization.

Whatever formalisms may be written down, the two above must be what are effectively implemented, or value semantics and the CoW optimization are already broken.
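For readers following along, the two rules above are exactly what the familiar user-level CoW pattern relies on. A minimal sketch, where `CoWArray` and `Storage` are illustrative names rather than anything in the standard library:

```swift
// A class box holding the shared storage.
final class Storage {
    var elements: [Int]
    init(_ elements: [Int]) { self.elements = elements }
}

struct CoWArray {
    private var storage: Storage
    init(_ elements: [Int]) { storage = Storage(elements) }

    subscript(i: Int) -> Int {
        get { storage.elements[i] }
        set {
            // Rule 2: `storage` is an existing lvalue passed inout; no
            // needless retain may be added, or this check always fails.
            if !isKnownUniquelyReferenced(&storage) {
                // Rule 1: refcount == 1 implies no other live variable
                // references this storage, so mutating in place is safe.
                storage = Storage(storage.elements)
            }
            storage.elements[i] = newValue
        }
    }
}

var a = CoWArray([1, 2, 3])
let b = a      // shares storage with a
a[0] = 99      // not unique: copies storage, then mutates
// a[0] == 99, while b[0] is still 1
```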

Again, that statement simply can't be true, at least not without some qualification, or existing code is broken, as I illustrated in my first example.

In this case, you would need both the ArrayBuffer and the SliceBuffer's address at the same time

Well, I don't know about “addresses”—this is high-level Swift after all :wink:—but we do have both the ArrayBuffer and the SliceBuffer passed inout at the same time at all points where it's important to detect that there are two references.

The only way you can get in that situation is if the compiler is able to prove that the lifetime of either the array or the slice could have been ended (rule 1 above). I'm pretty certain it's actually impossible for the compiler to prove that if the slice is mutated or copied, because their values will both be observed.

Another broken scenario would be a read-only sliceCopy that escapes.

I don't think that breaks either, for the same reasons. I worked through both of those scenarios before posting here, and you can find attempts to stress those situations in my test code (which I realize doesn't prove anything).

We could force ArraySlice to always retain its storage. This might have noticeable performance implications--we could devise an experiment to find out.

Hah, the reason this whole thing is needed in the first place is that ArraySlice does always retain its storage for mutation, so I'm pretty sure you will observe no change from that experiment!

Not at all sure what you're getting at here. As I mentioned to Michael, you don't need Unmanaged to make this scheme work sufficiently for the non-parallel-mutation case. An ugly demonstration is here, but it uses only safe constructs.

That's all pretty vague so far; if you know a better way to get the right results, and could lay out the details, and someone's going to implement it, I'm all for it. Otherwise, we've waited 5 years for this to be fixed and here's a way that I, at least, believe could be replaced when and if something better comes along.

3 Likes

Every time I mention a release, that obviously implies a matching retain.

I think you mean, “Using isUniquelyReferenced to avoid a copy only works because the compiler…”. A crucial element of isUniquelyReferenced “working” is that it reports no false positives, and that has to do with retains.

That's the opposite of what I meant. The only salient properties of isUniquelyReferenced that prevent the compiler from removing necessary copies (preventing false positives) are that isUniquelyReferenced may release the reference addressed by isUniquelyReferenced's inout argument, and isUniquelyReferenced can never be fully inlined (the builtin also takes an inout reference).

A crucial element of isUniquelyReferenced “working” is that it reports no false positives, and that has to do with retains.

Every time I mention a release, that obviously implies a retain.

The only semantic property an individual retain has today is that one must exist for each potential release, i.e. to provide an owned reference. As soon as we can determine a reference cannot be released within its lifetime, then the retain has no purpose. I think you want semantics on retains that have something to do with local variable scope.

The only way I can think of attaching additional semantics to a retain is by introducing a "forced copy" feature. In my previous response, I explained that the compiler could provide such a feature without much difficulty, but that using the feature in ArraySlice.init might have a performance penalty.

But that aside, I'm afraid I don't understand what you're getting at with the rest of that sentence (“thinks it can release its argument”). Partly that's due to ambiguous writing: do “it” and “its” refer to the same thing (isUniquelyReferenced), or does “it” refer to the compiler?

isUniquelyReferenced may release the reference addressed by isUniquelyReferenced's inout argument, as far as the compiler is concerned.

Furthermore, isUniquelyReferenced can be implemented using ObjectIdentifier, which makes the issue of whether it can release its argument moot.

The burden of proof is on us when introducing an API to explain why it is correct in terms of the compiler's model of language semantics. If we can't write down enforceable rules for SIL that the implementation must follow, then none of us can predict how some combination of implementation choices will violate our expectations. Nonetheless, your ObjectIdentifier example is interesting and educational:

There are two answers to why the ObjectIdentifier example works in practice, depending how much of the code the compiler can see at one time (depending on inlining).

  1. Builtin.isUnique forces an in-memory copy of a reference by virtue of taking its address inout.

The same is true for any wrapper around Builtin.isUnique that takes the reference's address inout.

In this example, when everything is inlined, the fact that you're casting to and from an opaque pointer simply does not matter. The compiler sees through it.

Now, when Builtin.isUnique forces the lvalues x1 and x2 into memory locations, it effectively forces a copy. Relating that to your situation, if both the Array and the ArraySlice are mutated, then your "Solving the mutating slice CoW" implementation does work fine in practice and in theory.

In the two incorrect scenarios that I pointed out, sliceCopy is not mutated, and is not forced into a distinct memory location.

  2. Nested retains are not currently eliminated, even when they are blatantly redundant.

If we carefully control inlining so that everything except the non-mutating liveObjectIsUniquelyReferenced is inlined, then the ObjectIdentifier example becomes interesting.

In this case, x2 is no longer passed @inout to any opaque code. There's no reason for the implementation to generate a copy.

So why does the example still work today even with worst-case inlining? (I actually tried this.) The SIL optimizer is extraordinarily bad at removing redundant nested retains to the same object. All it takes is one escape or unknown use of the reference to prevent copy elimination. This inability to optimize seemingly obvious redundancy is a problem with the representation that will disappear in due time. So, when _liveObjectIsUniquelyReferenced is not inlined, there just isn't any fundamental reason that your ObjectIdentifier example works. And more importantly, I can't think of any SIL-level requirement that would preserve the current behavior short of "the compiler never removes nested copies of the same reference".

There's no formalism that allows the implementation to track the number of variables that may have referenced an object.

If you really did cite all the relevant formalisms, the formalisms are insufficient to guarantee that CoW works today, in at least the two ways that I've pointed out (and will repeat in bullets below).

Regardless, nothing I've done or written is predicated on tracking the number of variables that may have referenced an object, beyond assumptions that already have to be true for CoW to be working today, namely:

  1. An object's reference count will never be 1 unless there's only one variable (whose lifetime could not have ended already) referencing it. That guarantees we preserve value semantics while still doing the CoW optimization.

SIL has no way to represent that formalism and doesn't need to because reference counts aren't directly observable. It is only the act of observing a variable's uniqueness, via Builtin.isUnique, that forces the variable's reference to be distinct from any other instances of the same reference. This makes it appear to the programmer "as-if" each variable has its own reference, but only for the purpose of uniqueness checking. This model does not provide any way to count higher than one. As I explained, the only way to observe a reference count of 2 would be to (a) observe two independent variables at the same time (b) create a new language feature that forces variables to have distinct reference counts even when they aren't being observed.

Also, even if this uniqueness property were modeled the way you seem to think it is, I don't know how you get from "An object's reference count will never be 1..." to "An object's reference count will never be 2...".

The only sure-fire way to ensure an object has two references is to pass two distinct inout addresses, each of which is known to hold the same object reference.

Again, that statement simply can't be true, at least not without some qualification, or existing code is broken, as I illustrated in my first example.

Let's make sure we don't confuse the statement "a reference count is greater than one" with "a reference count is exactly 2". That said, in this example, the array clearly has reference count==2 before it is copied:

let a = [1, 2, 3]
var b = a
b[0] = 99 // if there isn't a retain for b, this writes on a
assert(a != b)

And in this example, the array might have refcount==2 or 3 when it is copied. It's not determined by language semantics:

let a = [1, 2, 3]
let b = a
var c = b
c[0] = 99
assert(a != c)
_ = b

In this case, you would need both the ArrayBuffer and the SliceBuffer's address at the same time

Well, I don't know about “addresses”—this is high-level Swift after all :wink:—but we do have both the ArrayBuffer and the SliceBuffer passed inout at the same time at all points where it's important to detect that there are two references.

This is roughly how isDuallyReferenced would need to be written today, without any additional support (e.g. without some way to force copies).

Builtin.isDuallyReferenced(_ object1: inout AnyObject, _ object2: inout AnyObject) -> Bool {
  _precondition(object1 === object2)
  return swift_isDuallyReferenced(Builtin.addressof(&object1),
                                  Builtin.addressof(&object2))
}

I just don't see how it would work for you. The address of the Array struct, which holds reference #1, is not available during ArraySlice._modify.

So, some amount of language support is needed. If not a language feature, then at least a builtin, possibly specific to the Array implementation. Without it, this is what will eventually happen (if you don't mind my pseudo-code):

var array: inout

wasUnique = isUniquelyReferenced(&array) // true

var rhs = ArraySlice(array._buffer)     // rhs refcount never mutated

The only way you can get in that situation is if the compiler is able to prove that the lifetime of either the array or the slice could have been ended (rule 1 above). I'm pretty certain it's actually impossible for the compiler to prove that if the slice is mutated or copied, because their values will both be observed.

Between the creation of the rhs slice and its last use, there is no way that array's storage can be released. Consequently, the compiler could remove the retain of the non-mutating ArraySlice rhs. I can elaborate on this more if we need to.

Another broken scenario would be a read-only sliceCopy that escapes.

I don't think that breaks either, for the same reasons. I worked through both of those scenarios before posting here, and you can find attempts to stress those situations in my test code (which I realize doesn't prove anything).

It would be interesting to know why those two scenarios are working in your tests. I would need to put more time into investigating them.

We could force ArraySlice to always retain its storage. This might have noticeable performance implications--we could devise an experiment to find out.

Hah, the reason this whole thing is needed in the first place is that ArraySlice does always retain its storage for mutation, so I'm pretty sure you will observe no change from that experiment!

ArraySlice does retain its storage when it is mutated. If the compiler does its job, then ArraySlice does not retain its storage when

  • the ArraySlice is not checked for uniqueness (not mutated)
  • the ArraySlice lifetime is limited to the Array's lifetime
  • the Array is not checked for uniqueness within the ArraySlice's lifetime

(A fake uniqueness check on either the ArraySlice or on the Array within the ArraySlice's lifetime would be a way to force the copy without any special Builtin; I was just afraid that the compiler would eliminate the uniqueness check itself if its result is not used.)
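As a rough illustration of that "fake uniqueness check" idea, a sketch in plain Swift might look like this. `forceDistinctReference` and `Owner` are hypothetical names, and, as the parenthetical above worries, an optimizer could in principle drop a check whose result is unused, which is why dedicated builtin support would be more robust.

```swift
final class Owner {} // stand-in for the slice's storage owner

// Hypothetical: perform a uniqueness check purely for its side effect.
// Passing `owner` inout to the check forces the reference into its own
// memory location, which in turn forces a real retain to exist.
func forceDistinctReference(_ owner: inout Owner) {
    _ = isKnownUniquelyReferenced(&owner) // result deliberately discarded
}

var owner = Owner()
forceDistinctReference(&owner)
// With no other references, the real check still reports uniqueness.
assert(isKnownUniquelyReferenced(&owner))
```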

Assuming the performance is acceptable, I don't know how to actually implement that with existing APIs--we can't use Unmanaged for the ArraySlice without struct-destructors.

Not at all sure what you're getting at here. As I mentioned to Michael, you don't need Unmanaged to make this scheme work sufficiently for the non-parallel-mutation case. An ugly demonstration is here, but it uses only safe constructs.

I'm ruling out the possibility of implementing manual reference counting on ArraySlice, in case that isn't already obvious.

It would be sufficient to add an opaque builtin and SIL instruction that simply takes AnyObject and produces a new one without the compiler knowing that they are in fact the same reference.

The complexity of doing that probably needs to be weighed against some sort of copy-on-escape support, which would have both the performance and semantic properties that we want.

That's all pretty vague so far; if you know a better way to get the right results, and could lay out the details, and someone's going to implement it, I'm all for it. Otherwise, we've waited 5 years for this to be fixed and here's a way that I, at least, believe could be replaced when and if something better comes along.

I haven't tried to design copy-on-escape, so I just don't know how difficult it will be.

Again, we can take a copy-always approach instead of a copy-on-escape approach, but it could hurt performance.

[EDIT] It occurs to me why there may be some misunderstanding. When I say "forcing a copy", I mean copying the array storage reference, not copying the contents of the array. At the SIL level, this translates into forcing a retain.

So, in ContiguousArray.subscript(bounds:), we need some way to ensure that this variable initialization always results in a retain. Conceptually:

var rhs = ArraySlice(_buffer: _buffer[bounds])
blackhole(&rhs._buffer._owner)

When I say "there may be a performance penalty" I mean that there will be extra retain/release calls when slicing an array, and some optimizations may be inhibited.
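For concreteness, a plain-Swift approximation of that conceptual blackhole might be sketched as follows. This is hypothetical: with whole-module optimization the compiler can still analyze an empty `@inline(never)` body, so a real implementation would want builtin support rather than relying on this trick.

```swift
// Hypothetical opaque sink: if the optimizer treats this as an unknown
// call that may release its inout argument, then a real retain must
// exist for the reference passed in.
@inline(never)
func blackhole<T: AnyObject>(_ ref: inout T) {
    // Deliberately empty; opacity, not behavior, is the point.
}

final class Owner {}
var owner = Owner()
blackhole(&owner)   // forces `owner`'s reference to be materialized
assert(isKnownUniquelyReferenced(&owner))
```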

5 Likes

Although my understanding is limited — maybe even because of it — I enjoy this conversation! I'm learning a lot from it :-)

8 Likes

At this point I'm also a spectator, so here's my attempt at summarizing (partly because I feel like Andy and Dave are talking past each other a little bit):

Dave has implemented a borrowing system that works with the current runtime implementation of ARC. Andy is concerned that it's not compatible with the current compile-time implementation of ARC. Those two are related, but not the same; in particular, isUnique has to do both of the things they're talking about:

  • At compile time, it has to appear as though it can replace the reference it takes inout so that the compiler doesn't make the lifetime of other references depend on this one.
  • At run time, it has to identify whether there are other references that don't depend on this one.

(The other thing we're talking about that may not be clear to observers is that "making CoW work" doesn't usually mean "copying a buffer only when it's mutated" when it's standard library and Swift compiler devs talking about it; it's "avoid copying a buffer to mutate it if nobody else is depending on the contents of the buffer".)
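To make that distinction concrete for observers, this is the behavior being referred to:

```swift
// "Making CoW work" here means: copy the buffer before mutating only
// when somebody else still depends on its contents.
var a = [1, 2, 3]
a.append(4)      // storage uniquely referenced: mutates in place
let b = a        // second reference to the same storage
a.append(5)      // no longer unique: buffer is copied before mutating
// a is [1, 2, 3, 4, 5]; b is unaffected, [1, 2, 3, 4]
```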

The problem with isDuallyReferenced is that it could indicate one of the good situations:

  • one reference for the single variable holding an Array
  • one reference for the slice currently being modified

or:

  • one reference for the single variable holding an Array
  • one reference for the slice currently being modified (removed by explicit code, whether an explicit release/retain or a temporary reassignment)
  • one reference for the sub-slice currently being modified

but it could also be:

  • one reference for the variable holding an Array
  • one reference for the slice currently being modified (removed by compiler because, say, withUnsafeMutableBuffer is called on the slice and now all the refcount operations are inlinable)
  • one reference for another variable also holding that Array

or:

  • one reference for the single variable holding an Array
  • one reference for the slice currently being modified
  • one reference for a copy of the first slice that is still live (removed by compiler because it can see that the second slice doesn't outlive the first, or for that matter the Array)

This last scenario is especially bad because copying a value in Swift doesn't provide the opportunity to run any extra code.


Also, it wasn't immediately obvious to me that these two bits go together:

(Perhaps the form of that "extra retain" in SIL will be something like copy_value_opaque, where the optimizer can't assume the input and output values have anything to do with each other.)


I was hoping by the time I'd written all this up I'd have also thought of a good way to move forward, but I think this is where we're at:

  • We can't guarantee unique referencing because we have to hold on to the original buffer contents. (I made that mistake already.)

  • We can't guarantee that multiple references will not result in something that looks like "dual referencing" / unique-in-practice-because-all-owners-are-accounted-for without giving up on certain optimizations forever. ("forever" because ArraySlice is heavily inlinable.) It'd still be better than where we are today, though.

Does that sound right?

8 Likes

Yes, that's what I have in mind, without giving it too much thought. That would be sufficient to guarantee a retain. There may be more clever ways to accomplish it, but I'm afraid they would also be more fragile.

I was hoping by the time I'd written all this up I'd have also thought of a good way to move forward, but I think this is where we're at:

  • We can't guarantee unique referencing because we have to hold on to the original buffer contents. (I made that mistake already.)
  • We can't guarantee that multiple references will not result in something that looks like "dual referencing" / unique-in-practice-because-all-owners-are-accounted-for without giving up on certain optimizations forever. ("forever" because ArraySlice is heavily inlinable.) It'd still be better than where we are today, though.

Does that sound right?

Overall, your post is an excellent summary, and your conclusions match my current understanding.

1 Like

[Edit: I forgot to say,] Andy, thank you for your patient engagement.

That's not obvious to me, especially in the case of “An 'inout' AnyObject argument may release one reference.” That would seem to imply that an inout argument would be retained one more time than it actually is.

The only salient properties of isUniquelyReferenced that prevent the compiler from removing necessary copies (preventing false positives) are that isUniquelyReferenced may release the reference addressed by isUniquelyReferenced's inout argument, and isUniquelyReferenced can never be fully inlined (the builtin also takes an inout reference).

Okay. That's a really weak foundation on which to build the system whose semantics we have today, but I can accept that it's coded that way.

The only semantic property an individual retain has today is that one must exist for each potential release, i.e. to provide an owned reference.

When you say ”An 'inout' AnyObject argument may release one reference” is that not a “potential release?” This language even more strongly implies one more retain for inout arguments than we actually observe.

As soon as we can determine a reference cannot be released within its lifetime, then the retain has no purpose. I think you want semantics on retains that have something to do with local variable scope.

I want the semantics that I claimed must already be implemented.

  1. An object's reference count will never be 1 unless there's only one variable (whose lifetime could not have ended already) referencing it.
  2. When an existing lvalue is passed inout, no needless retains are added.

This has nothing to do with localness or scope (which is about name lookup, right?) of variables, but does have to do with general variable lifetime. Also, we can add that the compiler can do whatever it wants with reference counts as long as they follow these rules when observed by blessed/builtin operations like isUniquelyReferenced; that doesn't change my story at all.

The only way I can think of attaching additional semantics to a retain is by introducing a "forced copy" feature. In my previous response, I explained that the compiler could provide such a feature without much difficulty, but that using the feature in ArraySlice.init might have a performance penalty.

I agree that such forcing sounds on the face of it like it would be costly, especially if you mean to do it unconditionally.

The burden of proof is on us when introducing an API to explain why it is correct in terms of the compiler's model of language semantics.

Sure. I hope you don't imagine that I'm beyond suggesting that the compiler's model of language semantics might be wrong, though :wink:

If we can't write down enforceable rules for SIL that the implementation must follow, then none of us can predict how some combination of implementation choices will violate our expectations.

Absolutely agreed! Let's write down rules that support both value semantics and optimization! That means representing things at a level closer to the user model, like “semantic ARC” has been promising to do. I think the retains you can't eliminate are those on mutable copies of a reference (except where the compiler can prove that those copies aren't accessed, of course). [EDIT] That is a little conservative, but it's approaching rightness. I need to think about this some more.

Nonetheless, your ObjectIdentifier example is interesting and educational:

Cool, let's discuss!

There are two answers to why the ObjectIdentifier example works in practice, depending how much of the code the compiler can see at one time (depending on inlining).

  1. Builtin.isUnique forces an in-memory copy of a reference by virtue of taking its address inout.

By “forces an in-memory copy” I presume you mean it is forced to exist, rather than occur. That is, if the reference was already in memory, I presume no copy is needed.

The same is true for any wrapper around Builtin.isUnique that takes the reference's address inout.

Sure. Didn't you just say that Builtin.isUnique is opaque to the compiler? It would seem that an opaque thing being passed the reference's address at all (e.g. not inout) should be enough to force the reference into memory.

In this example, when everything is inlined, the fact that you're casting to and from an opaque pointer simply does not matter. The compiler sees through it.

Sorry, I don't see any OpaquePointer in this example. Do you just mean, “a pointer?” When you say “casting” are you referring to the use of withMemoryRebound? FWIW, when I wrote that code I was not trying to hide anything from the compiler. The only tricky thing was to avoid passing references around by value, which could interfere with the observation of uniqueness, at least in unoptimized builds. Note also this code was written before the calling convention changed from owned to guaranteed.

Now, when Builtin.isUnique forces the lvalues x1 and x2 into memory locations, it effectively forces a copy. Relating that to your situation, if both the Array and the ArraySlice are mutated, then your "Solving the mutating slice CoW" implementation does work fine in practice and in theory.

In the two incorrect scenarios that I pointed out, sliceCopy is not mutated, and is not forced into a distinct memory location.

It's not exactly clear what you mean by “are mutated.” Mutating the slice mutates the logical value of the Array, and we're already in the context of a mutating method on the array, so in both those senses, they both are mutated. But perhaps you really mean “if the shallow bits of both the Array and ArraySlice are mutated…?“ That appears to be the implication of the last sentence, because your first example definitely mutates sliceCopy, semantically.

  2. Nested retains are not currently eliminated, even when they are blatantly redundant.

Huh, is that sometimes, or always? I was under the impression that at least some nested retain/release pairs were eliminated by optimization.

If we carefully control inlining so that everything except the non-mutating liveObjectIsUniquelyReferenced is inlined, then the ObjectIdentifier example becomes interesting.

In this case, x2 is no longer passed @inout to any opaque code. There's no reason for the implementation to generate a copy.

So why does the example still work today even with worst-case inlining? (I actually tried this.) The SIL optimizer is extraordinarily bad at removing redundant nested retains to the same object. All it takes is one escape or unknown use of the reference to prevent copy elimination. This inability to optimize seemingly obvious redundancy is a problem with the representation that will disappear in due time. So, when _liveObjectIsUniquelyReferenced is not inlined, there just isn't any fundamental reason that your ObjectIdentifier example works. And more importantly, I can't think of any SIL-level requirement that would preserve the current behavior short of "the compiler never removes nested copies of the same reference".

Ewww. I think you've convinced me that might be what would be required within the model of the compiler as you've described it. I still think a better model is possible.

  1. An object's reference count will never be 1 unless there's only one variable (whose lifetime could not have ended already) referencing it. That guarantees we preserve value semantics while still doing the CoW optimization.

SIL has no way to represent that formalism and doesn't need to because reference counts aren't directly observable. It is only the act of observing a variable's uniqueness, via Builtin.isUnique, that forces the variable's reference to be distinct from any other instances of the same reference. This makes it appear to the programmer "as-if" each variable has its own reference, but only for the purpose of uniqueness checking.

That's all that matters, for this rule's purpose. I don't care whether the reference count is 1 when I'm not observing it.

This model does not provide any way to count higher than one.

Sorry, I just don't see how that can be true. If it can't count higher than one, isUnique could always return true. If what you mean to say is that it provides no way to distinguish meaningfully between a count of 2 and a count of 3, then I can accept that. And, since my scheme treats 2 specially, I could see how that could be a problem. My thinking was that it was conservative, so the optimization might fail to kick in but soundness could not be violated… but I could be wrong about that.

As I explained, the only way to observe a reference count of 2 would be to (a) observe two independent variables at the same time

Would you kindly define “independent” and ”at the same time”?

(b) create a new language feature that forces variables to have distinct reference counts even when they aren't being observed.

Is that (a) and (b) or (a) or (b)?

Also, even if this uniqueness property were modeled the way you seem to think it is, I don't know how you get from "An object's reference count will never be 1..." to "An object's reference count will never be 2...".

Aye, there's the rub.

Let's make sure we don't confuse the statement "a reference count is greater than one" with "a reference count is exactly 2". That said, in this example, the array clearly has reference count==2 before it is copied:

let a = [1, 2, 3]
var b = a
b[0] = 99 // if there isn't a retain for b, this writes on a
assert(a != b)

s/the array/the buffer/ and we can agree. The array itself is copied in the 2nd line. And I don't know of anything you've said that prevents the buffer's reference count being >2 at that point.

And in this example, the array might have refcount==2 or 3 when it is copied. It's not determined by language semantics:

let a = [1, 2, 3] // 1
let b = a         // 2
var c = b         // 3
c[0] = 99         // 4
assert(a != c)    // 5
_ = b            // 6

I take your point (though I think you need to replace line 6 with an actual observation of b to really make it).

Yes, the reference count can be 2 on line 4, but we're not in an inout slice context so the borrowed bit wouldn't be set here and that's harmless to my scheme.

In this case, you would need both the ArrayBuffer and the SliceBuffer's address at the same time

Well, I don't know about “addresses”—this is high-level Swift after all :wink:—but we do have both the ArrayBuffer and the SliceBuffer passed inout at the same time at all points where it's important to detect that there are two references.

This is roughly how isDuallyReferenced would need to be written today, without any additional support (e.g. without some way to force copies).

// Pseudo-code; `===` checks that both variables hold the same reference.
Builtin.isDuallyReferenced(_ object1: inout AnyObject, _ object2: inout AnyObject) -> Bool {
  _precondition(object1 === object2)
  return swift_isDuallyReferenced(Builtin.addressof(&object1),
                                  Builtin.addressof(&object2))
}

So, by saying the addresses need to be taken “at the same time” you mean they both need to be passed to a function that's opaque to the compiler and therefore might do two releases?

I just don't see how it would work for you. The address of the Array struct, which holds reference #1, is not available during ArraySlice._modify.

Well, this is all at the level of awful hackery at this point, but… the “borrowed” bit is what tells you that they have the same value.

The only way you can get in that situation is if the compiler is able to prove that the lifetime of either the array or the slice could have been ended (rule 1 above). I'm pretty certain it's actually impossible for the compiler to prove that if the slice is mutated or copied, because their values will both be observed.

Between the creation of the rhs slice and its last use, there is no way that array's storage can be released. Consequently, the compiler could remove the retain of the non-mutating ArraySlice rhs. I can elaborate on this more if we need to.

No, thanks. It's very clear to me why that's true under the existing model.

Hah, the reason this whole thing is needed in the first place is that ArraySlice does always retain its storage for mutation, so I'm pretty sure you will observe no change from that experiment!

ArraySlice does retain its storage when it is mutated. If the compiler does its job, then ArraySlice does not retain its storage when

  • the ArraySlice is not checked for uniqueness (not mutated)
  • the ArraySlice lifetime is limited to the Array's lifetime
  • the Array is not checked for uniqueness within the ArraySlice's lifetime

is that an “and” list or an “or” list?

(A fake uniqueness check on either the ArraySlice or on the array within the ArraySlice's lifetime would be a way to force the copy without any special Builtin; I was just afraid that the compiler would eliminate the uniqueness check itself if its result is not used)

We could always pass its result to some opaque builtin ;-)
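For what it's worth, the "pass its result to something opaque" trick can be done today without a builtin. A sketch under assumed names (blackHole and Storage are invented); an @inline(never) sink keeps the check from being eliminated as unused:

```swift
final class Storage {}

// Opaque, never-inlined sink: the optimizer can't see that the value
// is discarded, so whatever produced it must be kept.
@inline(never)
func blackHole<T>(_ value: T) {}

var buffer = Storage()
// A "fake" uniqueness check whose only purpose is its effect on the
// optimizer's model; the opaque call keeps it from being removed.
blackHole(isKnownUniquelyReferenced(&buffer))
```

Whether the check's side effect on copy elimination survives other passes is exactly the kind of thing that would need verification, of course.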

Not at all sure what you're getting at here. As I mentioned to Michael, you don't need Unmanaged to make this scheme work sufficiently for the non-parallel-mutation case. An ugly demonstration is here, but it uses only safe constructs.

I'm ruling out the possibility of implementing manual reference counting on ArraySlice in case that isn't already obvious.

My goodness, man; I hope so! My code simply swaps the sub-slice's bounds into self and yields that, to avoid taking another reference.
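The bounds-swapping idea can be sketched as a plain function. IntSlice and withSubrange are toy stand-ins (not the actual Array/ArraySlice code), but they show why no second reference to the storage is ever taken:

```swift
// Toy stand-in for a slice: storage plus mutable bounds.
struct IntSlice {
    var storage: [Int]
    var lower: Int
    var upper: Int

    // Narrow self to `bounds`, let `body` mutate it in place, then
    // restore the original bounds. No second slice value is created,
    // so no extra reference to the storage is taken.
    mutating func withSubrange(_ bounds: Range<Int>,
                               _ body: (inout IntSlice) -> Void) {
        let saved = (lower, upper)
        (lower, upper) = (bounds.lowerBound, bounds.upperBound)
        body(&self)
        (lower, upper) = saved
    }
}

var s = IntSlice(storage: [9, 3, 1, 9], lower: 0, upper: 4)
s.withSubrange(1..<3) { sub in
    sub.storage[sub.lower..<sub.upper].sort()
}
// storage is now [9, 1, 3, 9] and s's bounds are restored to 0..<4
```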

It would be sufficient to add an opaque builtin and SIL instruction that simply takes AnyObject and produces a new one without the compiler knowing that they are in fact the same reference.

I think I understand why you were suggesting that now.

The complexity of doing that probably needs to be weighed against some sort of copy-on-escape support, which would have both the performance and semantic properties that we want.

I haven't tried to design copy-on-escape, so I just don't know how difficult it will be.

If this thread turns out to be another in a long line of examples where I implement something so badly that it disgusts real maintainers into doing it right, I promise I won't be insulted. Just sayin'.

I'd always love to do better.

3 Likes

I am very much a spectator, and to be honest I am surprised that recursively mutating a slice doesn’t already avoid copying the buffer.

My (now obviously incorrect) mental model goes something like this:

var a = Array(...)

// For this code, the compiler either retains the buffer,
// or statically proves that `a` is never used again:
var b = a[...]
b.sort()

// For this code, the compiler never retains the buffer.
// If the buffer is not uniquely referenced, then it makes
// a new buffer for `a` and operates directly on that:
a[...].sort()

Within the sort() call, my mental model is similar. Recursive calls to self[...].sort() also do not retain the buffer. If the buffer is uniquely referenced, they operate on it directly. Otherwise, they make a new buffer for self, which somehow is propagated all the way up to make a new buffer for a. (Maybe the slice actually holds a reference-to-the-reference held by a?)

In this model, no matter how deep the recursion goes, none of the slice-and-immediately-mutate lines will ever bump the retain count. They always operate on the same buffer as a, either by verifying it is uniquely referenced, or if not then by making a new buffer for a first.

• • •

This thread has made it quite clear that my understanding is not accurate. My best guess (which is probably also wrong) as to why it doesn’t work like this has to do with ensuring simultaneous overlapping slice-mutations are well-behaved.

That could be fixed up with an “Already being mutated” flag similar to Dave’s. That way while in the middle of an a[...].sort() call, recursive calls to self[...].sort() would work, but trying to mutate a directly (or through a new slice) would hit that flag. Unlike Dave’s plan though, this would work in conjunction with isUniquelyReferenced rather than isDuallyReferenced.

• • •

I’m sure there are good reasons why it wasn’t implemented like this, but that’s basically how I thought it worked.

After reading the proposal, I am very against it.

The reason why is that this is advocating for a solution that bakes magic bits into the ABI to create semantics due to missing language features. Before the ABI was fixed, I was ok with this sort of thing (since we could remove these sorts of things before the ABI was finalized, as we did with the pinning bit). Now I am very against embedding anything like this into the ABI UNLESS the proposer can prove that there aren't additional language features that would make this unnecessary.

For instance, Dave, what if we were in a world where we had move only types, struct destructors, and inout locals? Would that change the solution here? Or if one had a magic wand for language features that would not conflict with the current language, what would you need to do this?

2 Likes

Well, I couldn't blame you for being against it on the basis that it isn't quite right, which I think @Andrew_Trick has convinced me of (even if there's a lot of what he's said I'm still waiting for clarification on). Correctness might be salvageable, or not.

I do, however, object to framing this in terms of a requirement to prove something that seems obviously unprovable. There are an infinite number of imaginable language features that hypothetically solve this problem… especially when you're going to allow for “magic wands!” The problem here is that nobody knows how to make such wands produce the result we need, much less build the wands themselves. It would be much fairer for me to demand that you prove that the proposed ABI changes are in fact “due to missing language features” by providing a single example specification and implementation that solves the problem without ABI changes.

I think there is a more productive way to look at it:

If a human like you or I can look at a piece of code and figure out what the compiler/library should be doing with reference counts, copies, and mutation, there should be a way to encode that understanding in language semantics and features with clearly specified meaning.

I agree with that, and that it would make for a better solution, which is why I'm excited to pursue this part of the conversation:

Ideally, this process would lead to an answer without ABI changes, though I think it's conceivable we'd arrive at the end of the process having demonstrated that implementing the rules we came up with requires such changes. At that point I think it would be reasonable to demand a disproof by counterexample from anyone objecting to the ABI change.

P.S. I also don't know what you mean by “inout locals” and have only a vague idea of what you mean by “struct destructors” (which—at least as I understand them—don't change the picture one whit).

3 Likes

@dabrahams Sorry if I was unclear, but I wasn't suggesting infinite possibilities. In fact, I suggested a few possible techniques explicitly: move only types, struct destructors (provided by move only types), and the ability to create a binding that in a sense is like an inout argument but over a region of code instead of over a function.

But Michael, it's irrelevant what you were suggesting. Meeting your demand to “prove that there aren't additional language features that would make this unnecessary” means proving something about an infinite set of unknown possible language features. And please don't say I'm just taking you too literally; solving this problem is going to require precise communication. If that's not your actual condition for accepting burning something into the ABI, what is it, exactly?

Even just limited to your suggested features, I think the condition is pretty unreasonable—how does one prove that a given tool couldn't be used to solve some problem? But I'll try anyway…

  • The problem under discussion is how to make the use of existing APIs (roughly, subscripts and swapAt), more efficient. These APIs have to work in existing code on types that don't use any new features, so the only way new features could be used to solve these problems is inside the implementation of the APIs concerned.
  • Move-only types don't allow new semantics to be implemented; they just allow us to enforce certain semantics with the type system. It's easy to see by doing a mental code transformation: just make the hypothetical move-only type copyable, take the code that would be in its destructor, and instead call it explicitly in the right places. Similarly, struct destructors don't add an ability to create new semantics; they just allow certain things to be done implicitly rather than explicitly. But we don't care how explicit the standard library's implementation needs to be in order to solve this problem.
  • I still don't know enough about your "inout locals" idea to be sure, but in the only reasonable implementation of such a feature that I can imagine, they don't add any new semantics either: any “region of code” in which you could create an inout local could instead have been coded as a call to a helper function with a regular inout argument.
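To make the mental transformation in the second point concrete, here is a hedged sketch. Flag, MutationScope, and end() are all invented names; a move-only version would simply run end()'s body in a destructor instead of requiring the explicit call:

```swift
// Invented names throughout; this only illustrates the transformation.
final class Flag { var isMutating = false }

// With hypothetical struct destructors, `deinit { flag.isMutating =
// false }` would run implicitly at end of lifetime. The copyable
// equivalent calls the same cleanup explicitly:
struct MutationScope {
    let flag: Flag
    init(_ flag: Flag) { self.flag = flag; flag.isMutating = true }
    func end() { flag.isMutating = false }  // the would-be destructor body
}

let flag = Flag()
let scope = MutationScope(flag)
// ... mutations guarded by flag.isMutating ...
scope.end()  // explicit call in place of an implicit destructor
```

Same semantics either way; the language feature only changes who writes the call.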

Most of all, I don't think any of these features get at what I see as the fundamental issue:

When an array has been inout-sliced:

  1. we want to allow the slice buffer to be mutated in-place without copying, but
  2. can't allow the buffer to be freed if the slice is assigned or resized.

Today, 1. requires uniqueness and 2. requires non-uniqueness at the point of mutation, which at least proves to me that uniqueness is insufficient information to use as the sole criterion for making decisions about whether a mutation can be done in-place. That's why my solution adds a bit.
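Concretely, in plain Array code, the two conflicting requirements show up in these two mutations of the same slice expression:

```swift
var a = [3, 1, 4, 1, 5]

// (1) In-place element mutation of the slice: we don't want a buffer
// copy, which under today's rules requires the buffer to appear
// uniquely referenced at the mutation point.
a[1..<4].sort()          // a is now [3, 1, 1, 4, 5]

// (2) Replacing (or resizing) the slice: the old buffer must NOT be
// freed, because `a` still owns the elements outside the slice. A
// genuinely unique buffer would be deallocated here, so this case
// needs the buffer to appear non-unique.
a[1..<4] = [7, 8]        // a is now [3, 7, 8, 5]
```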

I actually think I know how to solve the problem without relying on the ability to observe refcounts > 2, which I think is at the core of @Andrew_Trick's objection. But it still needs a bit to store the information that an array buffer is being inout-sliced.¹ As of today I don't see a way around adding dynamic information, because the ultimate mutation being applied to the slice may not be visible to the compiler so the extra information can't reach the site of the mutation statically. That's about as close as I can come to a proof that ABI changes are needed, so maybe I did it after all. Q.E.D.?


¹ I suspect it will be hard to demonstrate because ARC isn't making all the optimizations I think Andy told me it should (“if it's doing its job”) but I'll try to code it. [Update] yeah, it's as I suspected: the compiler eagerly issues a retain for the slice, and no optimizations we have today seem to be able to move that retain even as far as the yield.

4 Likes

I'd just like to push back a tiny bit on making this specific to slicing Array, as I'd love to be able to do similar transformations on slices of things like matrices that may involve strides and being non-contiguous.

1 Like

Oh, absolutely. That's the gist of my fourth consideration:

  • This technique takes a bit too much hackery today to be broadly implementable by users, and AFAIK it can't be implemented just once by the library. I would love to find a more general solution.

The point of demonstrating it on arrays is to prove that it can be done in a way that's practically useful, not to limit it. Ideally, if we solve the problems, the solutions can be packaged into easily-reused library components.

As it stands though, I don't see a path for that with this technique even if we make it more user-facing, since it is based around knowledge of the buffer.

To me, the ultimate success criterion for considering this problem solved is "does it compose?" That is, if I build a new MutableCollection type based on top of Array (or ultimately a generic Collection), could that new collection automatically acquire non-CoWing slices?

For comparison, this is how the (as yet only pitched, though implemented) design of modify works. If you implement your MutableCollection in terms of another Collection, using it as storage, and use modify instead of set for your subscript (a fairly straightforward user ask), then your custom type gains the same capability as Array wrt avoiding CoW with element mutation when backed by a collection that supports it.
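A hedged sketch of that pattern. `_modify` is the underscored, not-yet-official spelling of the pitched accessor, and OffsetCollection is an invented example type, not anything from the stdlib:

```swift
// A wrapper collection that uses another MutableCollection as storage
// and forwards element mutation through a modify accessor.
struct OffsetCollection<Base: MutableCollection>: MutableCollection
where Base.Index == Int {
    var base: Base
    let offset: Int

    var startIndex: Int { base.startIndex + offset }
    var endIndex: Int { base.endIndex + offset }
    func index(after i: Int) -> Int { i + 1 }

    subscript(position: Int) -> Base.Element {
        get { base[position - offset] }
        // Yielding forwards the mutation to `base` in place, instead
        // of a get-then-set round trip that copies the element out and
        // back (and, for CoW storage, defeats in-place mutation).
        _modify { yield &base[position - offset] }
    }
}

var c = OffsetCollection(base: [10, 20, 30], offset: 100)
c[101] += 1    // mutates base[1] directly through the yield
// c.base == [10, 21, 30]
```

Because the wrapper never materializes a second copy of the element (or of the base), it inherits whatever CoW-avoidance the base provides.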

As it stands, this won't be the case for non-CoW slicing. By default Slice takes a full copy of the parent, which will copy the inner array. We would need a design for Slice that accommodated this new technique, generically, on top of any other arbitrary Collection. I don't think this can be done by library changes + new runtime functions alone. I think it needs additional compiler features (as modify did), though what I still don't know.


Once we have a path here there are still possibly-insurmountable challenges:

  • this probably can't be done in an ABI stable way, so may be completely closed off to us
  • we also would need to be confident that this didn't require such additional complexity in the optimizer that it had negative knock-on effects, and also would need to be robust, i.e. it can't fall off a cliff if, say, called from unspecialized/non-inlinable code.

I would dearly love this problem to be solved – but I think we have to be clear-eyed about the chances. At some point if we cannot solve it, we need to pivot to accepting the approach of passing around ranges explicitly and start adopting this approach widely. While this would be less elegant, and add some unfortunate foot guns (that maybe we could flag to users with static analysis), it may be better to go down that road clearly and sooner rather than continuing to hope in vain indefinitely. I'm not sure at what point we draw that line tho.

2 Likes

Technically it's based on being able to observe a bit associated with the buffer instance. That could be in a side table or it could be folded into the refcount header, which still has not been baked into the ABI. That's where the old "pinned" bit was; it could be brought back.

Yes, that's the ideal place to end up, for sure, and is part of what I meant by finding “something more general.” Array is a first step, and a platform for understanding the problem, a journey I think we've only just started. We certainly can't get to the ideal composable answer without completing that journey.

That's…one way to put it I guess, but it blurs some distinctions, IMO making the situation sound much more hopeless than it actually is. Being more precise and filling in some details:

  • Slice contains a copy of the thing it was sliced out of, which (AFAICT according to Andrew because the compiler isn't “doing its job”) means the slice claims a second reference to the buffer, which means that the first time the slice is mutated, the buffer will be copied.

I'm not at all certain that doesn't fall out of solving the problem for the underlying collection, with no API changes to Slice. That scenario is certainly consistent with how I currently understand the problem (which might still be wrong of course).

I don't think this can be done by library changes + new runtime functions alone. I think it needs additional compiler features (as modify did), though what I still don't know.

  • this probably can't be done in an ABI stable way, so may be completely closed off to us

I don't know why you think these things; IMO it's just much too early to judge what's going to be needed, because we still don't have a solid understanding of the problem. One step at a time.

  • we also would need to be confident that this didn't require such additional complexity in the optimizer that it had negative knock-on effects, and also would need to be robust, i.e. it can't fall off a cliff if, say, called from unspecialized/non-inlinable code.

There's a lot of room for debate about how much performance degradation is acceptable in the presence of such optimization barriers. There are plenty of places today where going from -Onone to -O or non-inlinable to inlinable makes the difference in whether a CoW occurs. That isn't a good reason to penalize code that could be optimized.

I would dearly love this problem to be solved – but I think we have to be clear-eyed about the chances. At some point if we cannot solve it, we need to pivot to accepting the approach of passing around ranges explicitly and start adopting this approach widely. While this would be less elegant, and add some unfortunate foot guns (that maybe we could flag to users with static analysis), it may be better to go down that road clearly and sooner rather than continuing to hope in vain indefinitely.

I agree!

I'm not sure at what point we draw that line tho.

Me neither, but IMO that point is not now, when we are only just getting a bead on the nature of the problem. Unless there's been a massive effort to solve it that I don't know about in the past four months, we've barely tried to work on it at all, as a group.

1 Like

This is not a debatable sliding scale. It needs to be a feature that is guaranteed even at -Onone, via language mechanics and mandatory optimization passes, for it to be considered solved and act as a basis for how to design collection APIs in the standard library and beyond. It is not something that can just be an optimization possibility in -O that may or may not happen (even if it happens "most of the time"), or something that only works on non-resilient types with fully inlinable implementations.

Again, how modify works should be the guide here. Elements are yielded at +0, and this is guaranteed by the language and happens even in -Onone.

That's not to say we can't experiment, and even land improvements incrementally as a nice-to-have property of certain types (though we need to be careful to explain that it doesn't generalize, similar in nature to indices being a performance trap when moving to generic code). But there's a clear line at which point we can use it to design APIs, and that point is that there is a clear guaranteed model for when divide-and-conquer slice mutation will not trigger CoW, ever, if you follow a simple pattern when writing a collection, including building on top of other collections generically.

(Additionally, since the "win" of implementing it only for specific types is so much lower than solving the problem in full, the bar for avoiding complexity in the optimizer is higher, and the lengths to which we might go to work around the ABI problems diminish)