Solving the mutating slice CoW problem

To be able to mutate the base via the slice like this, you need to handle two things: the wipeout of the slice, and the possible escape of the slice. In both cases, I think it's a question of detecting that it has happened and incrementing the reference count in the base at that point to protect it (in the case of wipeout, not releasing the base when it should continue to exist; in the case of escape, preserving the value semantics of the base). If you do that, I don't think you need to worry about preventing reassignment to the slice (which I don't think is possible – it's pretty baked into Swift that you can always do that).

Hi Dmitri!

While that's true, I don't think it captures the reasons for our problem. I think what you just described is a characteristic of anything that can be written through a property or subscript. For example, here's a String property doing that:

struct X { var s = "abcde" }
var x = X()

// `String` acts as independent value
var t = x.s
t.removeLast() // doesn't affect x

// `String` acts as “mutable borrow” that passes writes through.
x.s.removeLast() // affects x

Making it performant with CoW

When the target of the write is implemented with CoW, getting good performance depends on the property/subscript access not prematurely claiming an additional reference. When the target is already stored in memory somewhere, the compiler happens to oblige us by not claiming that reference, provided the accessor either:

  • yields the target's address from a _modify (see the sketch below), or
  • is the accessor synthesized by the compiler for a regular stored property
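
Here's a minimal sketch of the first case, assuming a hypothetical wrapper type (this is illustration, not standard-library code): the _modify accessor yields the stored value's own address, so a write through the property mutates the CoW buffer in place without claiming an extra reference.

struct Wrapper {
  private var storage: [Int] = []

  var values: [Int] {
    get { storage }
    _modify {
      // Yields the address of `storage` itself; the caller mutates the
      // array in place and this accessor claims no extra reference to
      // its buffer.
      yield &storage
    }
  }
}

var w = Wrapper()
w.values.append(1) // goes through _modify; the buffer stays uniquely referenced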

The issue for slicing…

…is two^H^Hhreefold: [EDITED]

  1. When you slice an array, the Slice instance isn't already in memory anywhere, so it has to be constructed on the stack. At least today, that construction eagerly claims an additional reference to the buffer.

  2. If we could avoid claiming a reference for the yielded slice, when a slice is reshaped:

     a[3...7].removeLast()
    

    there is code that will deallocate elements belonging to the array if it finds the slice to be uniquely-referenced. I'm not sure how much of that code is compiled into clients already, but that could be a problem if the only test is uniqueness.

  3. If you could avoid claiming a reference for the yielded slice, you'd still need to prevent the array buffer from being deallocated in situations like this

     extension ArraySlice { 
       mutating func reassign() { self = ArraySlice() } 
     }
     a[3...7].reassign()
    

Years ago, before we ever got pinning as it was eventually formulated, @Joe_Groff and I designed a similar system that solved the problems of this thread, but that design wasn't adopted. I suggest that while it may have been a relief to see the bit gone, the bit (or something equivalent) was always needed to produce the semantics we want.

If you want to solve the problem without an extra bit you need to deal with all three parts of the issue for slicing. Solving the first part (the extra reference) would mean a way for the slicing subscript to initialize and destroy that inout Base without retaining it. However, I don't know how anyone can solve the other two parts of the problem, especially the third part, if there's only a single reference.

But I didn't think that through and maybe I'm missing something.

I mean, you're not missing anything in saying “it would be awesome.” The problem is making that feature do what's needed. As far as I can tell, I've proven that you can't solve the problem using only uniqueness tests and no additional bits.

Yes, the idea is that this is a new representation that would take a "borrowed" copy of the Base. That borrow would mean a +0 copy (solving your issue #1), and would need to be scoped to reserve exclusive access to the Base. It would somehow need to detect the borrowed reference dropping to zero and not free it, but instead just return it (solving issue #3). For #2, after the yield the Base needs code to examine the slice and reconcile, much like the code you linked to does. Generalizing the fast path, where it detects nothing needs doing because it was mutated in place, is hard though for a generic Slice implementation.

That's not what the code I linked to is doing. As a byproduct of detecting uniqueness in the slice it is releasing references to array elements outside the slice's bounds.

Anyhow, this still doesn't explain how the proposal can work. You need to be able to store a copy of this slice somewhere in the escaping case, and that copy needs to be +1, and needs to release its reference when destroyed. I'm not sure what inout is actually supposed to mean in this context, but if it means “don't retain another reference,” that's not intrinsic to the base property of all instances; it's intrinsic to the context (slicing subscript) where this one instance was created¹ — the slice that escapes does need to retain another reference, and in every other way needs to act like something that doesn't have inout on its base.

As I've been saying, when this thing crosses a visibility boundary in the compiler, you need some way to know how to handle it when it's being mutated. Imagine f is a mutating method, opaque to the compiler.

var a = [ ... ]
a[3...7].f()      // 1
var b = a[3...7]
a = []
b.f()             // 2

In both cases 1 and 2, f should observe that the buffer is uniquely referenced. If f happens to reassign self, in case 1 it must not release the buffer but in case 2 it must release the buffer. Something needs to communicate the need for that different behavior into the body of f. The information is not present in the type system, so it has to be communicated dynamically.

The only other possibility is that f (dynamically) communicates the potential release back up to the call site, where it's known whether we're in an inout slicing context.

Either way, there's dynamic information needed.


¹ That's why Erik's proposed notation doesn't make sense to me.


Here is the explanation I promised. This post only relates to the soundness of the implementation in GitHub - dabrahams/DivideAndConquer: Demonstrates a fix for the mutating divide-and-conquer COW problem, which I'll refer to below as "DivideAndConquer". Responding in-line point-by-point won't converge in a useful place. Instead I wanted to deal with this specific aspect coherently in one place. This post has no bearing on other important issues, notably:

  • a path to generalizing the design to other collections

  • guaranteeing non-copying behavior at -Onone

  • incurring Array implementation complexity--the initial
    implementation costs are insignificant relative to the long-term
    costs

Let's keep those issues in separate posts.

Fix for Divide-And-Conquer

+++ b/Sources/DivideAndConquer/ContiguousArray.swift

   public subscript(bounds: Range<Int>) -> ArraySlice<Element> {
     get {
       _checkIndex(bounds.lowerBound)
       _checkIndex(bounds.upperBound)
       return ArraySlice(_buffer: _buffer[bounds])
     }
     _modify {
       _checkIndex(bounds.lowerBound)
       _checkIndex(bounds.upperBound)

       let wasUnique = _buffer.isUniquelyReferenced()
       var rhs = ArraySlice(_buffer: _buffer[bounds])
+      // Any runtime function that the compiler can't make any
+      // assumptions about and that takes the address of 'rhs' will
+      // work here.
+      strongRefCount(&rhs)

Test Case Requiring Fix

@inline(never)
func testEscapeWithTripleReference(array: inout ContiguousArray<Int>) -> ArraySlice<Int> {
1:  return copyAndMutate(slice: &array[0..<4])
}

@inline(__always)
func copyAndMutate(slice: inout ArraySlice<Int>) -> ArraySlice<Int> {
  // Why was `slice` passed inout? Maybe it used to be
  // `slice[0..<1].reverse()` which is inlined to nothing.
2:  var sliceCopy = slice
3:  sliceCopy[0..<2].reverse()
4:  return sliceCopy
}

var array = DivideAndConquer.ContiguousArray(0..<5)
let slice = testEscapeWithTripleReference(array: &array)
assert(array[0] != slice[0])

Explanation of the Bug

I was not actually able to break the above test, even after much massaging. However, I believe SIL rules allow the case to break, and I think ARC optimization based on OSSA form after inlining would be sufficient.

Here is the sequence of operations, where 'owner' always refers to the CoW storage object:

  1. ContiguousArray.subscript(bounds:)_modify

a. wasUnique = array.isUnique() = true
b. 'array.owner.isBorrowed' = true
c. retain 'slice.owner'

  2. Copy ArraySlice

d. retain 'sliceCopy.owner'

  3. ArraySlice.subscript(bounds:)_modify

e. isDuallyReferenced() = false (refcount==3)
f. borrowedOwner = nil
g. .reverse() copies storage
h. borrowedOwner == nil

  4. return 'sliceCopy'

i. retain 'returnedSlice.owner'
j. release 'sliceCopy.owner'


Optimization: remove the retain/release of 'sliceCopy' based on proving all of these conditions:

  • 'sliceCopy' is never checked for uniqueness (or mutated in any way)
  • 'array' outlives 'sliceCopy'
  • 'array.owner' is immutable during the lifetime of 'sliceCopy'

Now we have:

a. wasUnique = array.isUnique() = true
b. 'array.owner.isBorrowed' = true
c. retain 'slice.owner'
d. reuse 'sliceCopy.owner'
e. isDuallyReferenced() = true (refcount==2)
f. release 'borrowedOwner' (refcount==1)
g. .reverse() performs in-place mutation
h. retain 'borrowedOwner' (refcount==2)
i. retain 'returnedSlice.owner' (refcount==3)
j. ignore 'sliceCopy.owner'

This clobbers the original array.

Explanation of the Bug Fix

ContiguousArray.subscript(bounds:)_modify now takes the address of the mutable ArraySlice that it yields. This guarantees that its 'owner' reference is opaque to the compiler. I used a call to strongRefCount(&rhs), but let's generalize this concept to _blackHole(inout AnyObject). We need two guarantees:

  • The compiler assumes that '_blackHole' may release its inout argument.

  • The compiler assumes that '_blackHole' cannot be eliminated.

The compiler can now never eliminate this operation:

c. retain 'slice.owner'

...because the retained reference may immediately be released by '_blackHole'. Even after the call to '_blackHole', the compiler cannot share the representation of 'array.owner' and 'slice.owner', because it has no way to prove that they are the same value.
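
To make the shape of that barrier concrete, here's a sketch. The name, generic signature, and @inline(never) are placeholders rather than a real API (the actual fix calls the existing strongRefCount(&rhs) runtime function), and the guarantees above only hold if the body is genuinely opaque to the optimizer, e.g. a runtime entry point rather than a stub the optimizer can see into:

@inline(never)
func _blackHole<Object: AnyObject>(_ object: inout Object) {
  // From the caller's perspective, this call may release and reassign
  // `object`, so the retain that produced it cannot be removed and its
  // representation cannot be folded into another reference.
}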

Consequences of the Bug Fix

For ArraySlice rvalues:

  • Implementation: Array.subscript(bounds:).get with no corresponding writeback

  • Returned ArraySlice may or may not be retained.

  • ArraySlice storage is copied on mutation (after copying to an lvalue).

  • Unaffected by the DivideAndConquer fix.

For ArraySlice lvalues today:

  • Implementation: Array.subscript(bounds:).get + Array.subscript(bounds:).set

  • Returned ArraySlice is retained and its storage is copied whenever
    it is mutated (after optimization)

  • Returned ArraySlice may not be retained if it is not mutated
    e.g. array[0..<1].reverse() never mutates the slice

For ArraySlice lvalues with (fixed) DivideAndConquer:

  • Implementation: Array.subscript(bounds:)._modify

  • Returned ArraySlice is retained whenever it is mutated (after optimization)

  • Storage is never copied unless the slice is copied.

  • Returned ArraySlice may be retained even if it is not mutated,
    e.g. array[0..<1].reverse() requires a retain that can't be fully eliminated.

DivideAndConquer trades off an extra retain in a corner case (the divide-and-conquer base case) for eliminating a copy of the storage in the common case (ignoring the other overhead of borrowed-status checks).

Explanation of Compile-Time Reference Counting

Axiom 1: At a single program point, any two instances of the same value (identical values) can share a representation.

Theorem 1: for any reference-counted type, checking uniqueness requires apparent mutability of the reference. Otherwise, its reference may share its representation with other instances of the same value.
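
The standard library's real check already has this shape: isKnownUniquelyReferenced(_:) takes its argument inout. A minimal CoW sketch using it (Buffer and CoWValue are hypothetical types, for illustration only):

final class Buffer {
  var elements: [Int]
  init(_ elements: [Int]) { self.elements = elements }
}

struct CoWValue {
  private var buffer = Buffer([])

  mutating func append(_ x: Int) {
    // The check requires `&buffer`: without apparent mutation of the
    // reference, the compiler would be free to let it share a
    // representation with other instances of the same value.
    if !isKnownUniquelyReferenced(&buffer) {
      buffer = Buffer(buffer.elements) // copy storage before writing
    }
    buffer.elements.append(x)
  }
}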

Commentary: The source of confusion around this stems from the variable scopes that appear evident at the source level. But the presence of those scopes does not subvert the rules above.

While it is true that these two forms may have different semantics in some respects:

Form 1:

let x = foo()
bar(x)
_ = x

Form 2:

bar(foo())

Those semantics have no impact on whether the implementation can share the representations of foo's returned value, the value in variable x, and bar's argument. Simply put, a single reference will suffice. This is fundamental to the compiler's model of the language.

Axiom 2: A reference passed as an argument (whether owned or inout) may be released no more than once without at least one retain for every additional release.

ARC can be reduced to a counting problem where each operation has a net effect on its operand's reference count.

var obj: AnyObject

retain(obj):  +1
release(obj): -1
apply(obj):   -1..<infinite (possible owned convention)
apply(&obj):  -1..<infinite (reassignment is a release)

The only constraint on the compile-time implementation is proving that an object's reference count does not reach zero before its last use. It's the same exercise required for manual reference counting.

Theorem 2: The only fully general way to check refcount==2 is by simultaneously mutating two distinct references to the same object. For example:

Builtin.isDuallyReferenced(_ object1: inout AnyObject, _ object2: inout AnyObject) -> Bool {
  _precondition(object1 === object2)
  return swift_isDuallyReferenced(Builtin.addressof(&object1),
                                  Builtin.addressof(&object2))
}

You may be thinking, "can't the compiler have special magic operations?"

Yes, the compiler can have all the magic it wants, but that does not change any of these facts. Those magic operations exist within APIs, and those APIs must adhere to the calling conventions of the language. The only way out of the rules I've put forth above would be a language level calling convention that provides for a single reference argument to be net released more than once (without corresponding retains). I'm comfortable stating that we will never have such a convention.

Axiom 3: for any CoW type, mutability implies uniqueness.

Theorem 3: From Theorem 1 and Axiom 3. For any CoW type, mutability and uniqueness are mutually implied. These conditions can be used interchangeably. (I find this useful to simplify the discussion.)

Let's apply these rules to DivideAndConquer.

The ArraySlice does not have access to the Array when checking isDuallyReferenced(). With access to only one reference, isDuallyReferenced() does not prove anything on its own.

To rely on isDuallyReferenced(), we must guarantee that ArraySlice "owns" a distinct reference to storage. That can be guaranteed if any of these statements are shown true:

  • the ArraySlice is checked for uniqueness (i.e. mutated)--and this
    check cannot be removed.

  • the ArraySlice's lifetime surpasses the Array's lifetime

  • the Array is checked for uniqueness within the ArraySlice's
    lifetime--and this check cannot be removed.

We do know that the Array will be accessed again after ArraySlice's lifetime. That does not help.

We do not otherwise have any control over Array. Therefore, the only option is forcing ArraySlice to be apparently mutated regardless of how much information the compiler has. Note that this is exactly equivalent to forcing the ArraySlice's owner reference to be copied, which is exactly equivalent to forcing a retain. A Builtin.forceRetain(inout AnyObject) would have the same effect.

Conclusion

DivideAndConquer is a working implementation of in-place ArraySlice mutation, given a fairly nonintrusive fix.

isDuallyReferenced(inout AnyObject), as written, should not be exposed as a user-level API.


Hey, thanks so much for writing all this down, and so clearly! Most of it makes complete sense to me and I have no questions or even nits to pick for the most part, but there are a couple, and I got lost towards the end…

I had to google OSSA and found this and this, neither of which is obviously final. Wish I had known about that document at the time, but in fairness I probably would have been too busy to absorb it. Is there an official version somewhere?

Theorem 1: for any reference-counted type, checking uniqueness requires apparent mutability of the reference.

Here I'd like to suggest we use the term apparent possible mutation (or just apparent mutation as you use later) of the reference since mutability is a type system concept. IIUC it's not the mutability of the reference in that sense you're talking about, but the inability for the compiler to prove that it isn't actually mutated, right?

Anyway, in light of how I understand(?) that, I got lost here:

Axiom 3: for any CoW type, mutability implies uniqueness.

It took me a long time to connect this “mutability” to the “apparent mutability” above, in which I hope I'm headed down the right path; before that I didn't have a clue. But even after I rephrase it as “apparent mutation of the reference” I can't make sense of it. It only seems axiomatically true if I read it as “mutation of the buffer referred to by the reference,” which is a very different thing and now I'm pretty sure I'm just guessing. Can you help me out with this one?

The only way out of the rules I've put forth above would be a language level calling convention that provides for a single reference argument to be net released more than once (without corresponding retains). I'm comfortable stating that we will never have such a convention.

Agreed, we won't. But it's not obvious to me that we couldn't come up with a new set of rules, provable to preserve the semantics of existing code, while making some new things possible. I realize it's speculation on the order of “couldn't we solve this with some [magic] language feature?” but I would like to understand how fundamentally true you hold your statement to be.

Regarding your conclusions, they're pretty encouraging. The excess retain/release on a buffer that is mutable but never actually mutated under rather bizarre circumstances does not seem like it could have much impact on real programs. And, not that I'm counting on it, but this seems like one of those places where—with a different set of rules like what I referred to above—it might be possible for the compiler to prove that the operations could be eliminated.

It's a little disturbing to me that under the current rules, the fix depends on an opaque, possibly-retaining, possibly-releasing, possibly-mutating, possibly-escaping use of the reference, and very disturbing that isUniquelyReferenced depends on the same opacity. It just seems intuitively like we must be robbing the compiler of more important information than necessary around these operations that are so crucial to performance and are in fact pure, and that it would be better to give them some kind of special status in the rules. Am I wrong to be concerned?

Final question: when you wrote,

Let's keep those issues in separate posts.

Did you mean “in separate threads,” or would you be happy to deal with those issues in follow-up posts here?

Thanks again!
-Dave

Good!

but there are a couple, and I got lost towards the end…

Your questions don't sound like you were lost.

I had to google OSSA and found this and this, neither of which is obviously final. Wish I had known about that document at the time, but in fairness I probably would have been too busy to absorb it. Is there an official version somewhere?

If we had an official spec, it would be referred to from the SIL spec... but it's not there yet. The working draft you found is probably the closest thing.

I think I first introduced the term in this post from 2017.

Michael just gave a talk on OSSA at LLVM Dev. I see a link to the video but no slides.

Anyway, this is the same point I made earlier about the compiler failing to remove nested retains and releases to the same reference. It's a problem with the SIL representation that's gradually being fixed but requires migrating the entire optimizer. It's been an ongoing activity for a while. The timeline for a feature like this has absolutely nothing to do with the complexity of the feature and everything to do with the prior amount of technical debt and maintenance burden of existing code.

Theorem 1: for any reference-counted type, checking uniqueness requires apparent mutability of the reference.

Here I'd like to suggest we use the term apparent possible mutation (or just apparent mutation as you use later) of the reference since mutability is a type system concept. IIUC it's not the mutability of the reference in that sense you're talking about, but the inability for the compiler to prove that it isn't actually mutated, right?

Correct. I meant mutation.

Anyway, in light of how I understand(?) that, I got lost here:

Axiom 3: for any CoW type, mutability implies uniqueness.

It took me a long time to connect this “mutability” to the “apparent mutability” above, in which I hope I'm headed down the right path; before that I didn't have a clue. But even after I rephrase it as “apparent mutation of the reference” I can't make sense of it. It only seems axiomatically true if I read it as “mutation of the buffer referred to by the reference,” which is a very different thing and now I'm pretty sure I'm just guessing. Can you help me out with this one?

Again, I should have said "mutation" here, not mutability. I'm speaking as if the "CoW value" is represented within its buffer. This is the definition of CoW, nothing more.

The only way out of the rules I've put forth above would be a language level calling convention that provides for a single reference argument to be net released more than once (without corresponding retains). I'm comfortable stating that we will never have such a convention.

Agreed, we won't. But it's not obvious to me that we couldn't come up with a new set of rules, provable to preserve the semantics of existing code, while making some new things possible. I realize it's speculation on the order of “couldn't we solve this with some [magic] language feature?” but I would like to understand how fundamentally true you hold your statement to be.

This is a problem of communicating any magic up through the layers of the API. We sometimes deal with this by adding "@_semantic" tags to APIs. But...key point...even though we can imbue certain source constructs with extra semantics, all the disparate, shady parts of the compiler that don't know about those special semantics still need to function correctly. A bit like adding a new case to a library enum. So, adding a new ownership passing convention requires about the same level of implementation complexity regardless of how it's surfaced.

Regarding your conclusions, they're pretty encouraging. The excess retain/release on a buffer that is mutable but never actually mutated under rather bizarre circumstances does not seem like it could have much impact on real programs. And, not that I'm counting on it, but this seems like one of those places where—with a different set of rules like what I referred to above—it might be possible for the compiler to prove that the operations could be eliminated.

We're talking about different issues. Above, I'm explaining why isDuallyReferenced is not a suitable general-purpose API unless it's rewritten to take two @inout arguments. You are right, it would be possible to invent some compiler magic to allow the extra retain of your ArraySlice to be optimized away. But, all the code that accesses the copied ArraySlice would need to be visible (e.g. fully inlined). Otherwise, we're stuck at the argument convention problem again.

It's a little disturbing to me that under the current rules, the fix depends on an opaque, possibly-retaining, possibly-releasing, possibly-mutating, possibly-escaping use of the reference, and very disturbing that isUniquelyReferenced depends on the same opacity. It just seems intuitively like we must be robbing the compiler of more important information than necessary around these operations that are so crucial to performance and are in fact pure, and that it would better to give them some kind of special status in the rules. Am I wrong to be concerned?

If the lowest-level compiler builtin is exposed, then yes the compiler can recognize it. That is why I did not simply propose adding an unused uniqueness check instead of the _blackHole (or any other as-of-yet unknown runtime call). Some piece of intelligence exists in the compiler that is smart enough to know that an unused uniqueness check can be removed. And another piece of intelligence says they can be hoisted, and combined, and so on, based on complicated rules. That doesn't come close to covering the complexity. The important optimizations happen at a higher level of semantics, "make-mutable-and-unique", which statically guarantees that subsequent checks can be eliminated...

But, again, there are all the other disparate, shady parts of the compiler, including parts that have not yet been written. If that code sees a value being passed to a call, and knows that the call cannot modify the value, then it can share the value's representation. That's fundamental to how values work.

To fix that problem, you would need to change SIL so that uniqueness checking is somehow part of the argument passing convention that every piece of code in the compiler needs to deal with. But, at the source level, whenever you want to write a wrapper around some uniqueness-checking function, you'll still need to take the reference @inout for correctness--otherwise you've just hidden the fact that the reference can't share its representation. So all the complexity of making SIL more "uniqueness aware" would not really change anything.

Of course, another valid approach to giving a value "apparent/potential mutation" is returning the new value, rather than mutating in-place: (newReference, isUnique) = isUnique(originalReference). That approach has the disadvantage of potential false negatives because the reference may be copied, but I don't think it changes anything for the purpose of this discussion.

A completely different approach to this problem would be to give retains stronger semantics at the SIL level. Currently, the compiler only needs to guarantee that objects are alive up to their last use. We could instead say that retains cannot be eliminated merely because they redundantly retain the same reference. To eliminate any retain, the compiler would need to prove that no code can observe the reference count between that retain and some corresponding "paired" release. We don't currently represent "paired" releases, but OSSA will fix that. This would be a poor model for ARC optimization for at least two reasons:

  • References, like any value, can be arbitrarily "aliased" across many
    instances. So there's no way to determine whether something happens
    to a referenced object just by looking at the uses of that instance
    of the reference.

  • In general, we don't want to model a critical optimization such that
    it depends on proving a negative condition throughout arbitrarily
    complex code that is otherwise unrelated to the thing being
    optimized.

Final question: when you wrote,

Let's keep those issues in separate posts.
Did you mean “in separate threads,” or would you be happy to deal with those issues in follow-up posts here?

I meant that I wanted to bracket the scope of this post and all of its in-line responses. I feel like there's a threshold of understanding that needs to be reached before in-line responses will converge. It looks like we've reached that point regarding implementation correctness.


No video link on that page, but Google found it here.

Yeah, my (mis-)understanding from talking with @Michael_Gottesman was that the ARC part of that project had been completely implemented.

OK, as to where I got confused, let me apply our corrections, a little more precision, and my own emphasis to what you've established:

  • Theorem 1: for any reference-counted type, checking uniqueness requires apparent mutation of the bits of the reference.
  • Axiom 3: for any CoW type, mutation of a part of its value accessed through the reference implies uniqueness.

I'm confused because then you go on to

Theorem 3: From Theorem 1 and Axiom 3. For any CoW type, mutability and uniqueness are mutually implied. These conditions can be used interchangeably. (I find this useful to simplify the discussion.)

I don't see how this theorem follows from the foregoing. It appears to be relying on the linguistic conflation of the bits of the reference with the value accessed through the reference to establish some kind of if and only if relationship. In fact, the two have an inverse relationship: if the contents of the buffer are going to be mutated, the bits of the reference won't, and vice-versa.

Another reason I'm having trouble seeing the connection is it seems to conflate “apparent mutation” with mutability. If you ignore the supposedly-supporting statements T1 and A3, which are about apparent mutation, and take T3's use of “mutability” as correct, T3 makes total sense. I just don't see how it rests on T1/A3 in any way.

It's not obvious to me that we couldn't come up with a new set of rules, provable to preserve the semantics of existing code, while making some new things possible. I realize it's speculation on the order of “couldn't we solve this with some [magic] language feature?” but I would like to understand how fundamentally true you hold your statement to be.

This is a problem of communicating any magic up through the layers of the API. We sometimes deal with this by adding "@_semantic" tags to APIs. But...key point...even though we can imbue certain source constructs with extra semantics, all the disparate, shady parts of the compiler that don't know about those special semantics still need to function correctly. A bit like adding a new case to a library enum. So, adding a new ownership passing convention requires about the same level of implementation complexity regardless of how it's surfaced.

OK, so this is “merely” a matter of implementation complexity and technical debt, rather than fundamental.

Regarding your conclusions, they're pretty encouraging. The excess retain/release on a buffer that is mutable but never actually mutated under rather bizarre circumstances does not seem like it could have much impact on real programs. And, not that I'm counting on it, but this seems like one of those places where—with a different set of rules like what I referred to above—it might be possible for the compiler to prove that the operations could be eliminated.

We're talking about different issues. Above, I'm explaining why isDuallyReferenced is not a suitable general purpose API unless it's rewritten to take two @inout arguments.

I'm not sure what you mean by “we're talking about different issues.” I was referring to the parts of your conclusions where you say that DivideAndConquer can be made correct under the existing SIL rules with only a small change, and where you analyze what would happen with existing and recompiled binaries if we were to ship it as a change to the Array implementation. I do understand what you said about isDuallyReferenced, but “we're talking about different issues” seems to imply you think I was confused about something or conflating some things, and I don't see it.

You are right, it would be possible to invent some compiler magic to allow the extra retain of your ArraySlice to be optimized away. But, all the code that accesses the copied ArraySlice would need to be visible (e.g. fully inlined). Otherwise, we're stuck at the argument convention problem again.

Yes, since the retain is only extra in this corner case when there's no actual mutation, optimizing it out requires that the compiler can see that there's no actual mutation. I get that.

It's a little disturbing to me that under the current rules, the fix depends on an opaque, possibly-retaining, possibly-releasing, possibly-mutating, possibly-escaping use of the reference, and very disturbing that isUniquelyReferenced depends on the same opacity. It just seems intuitively like we must be robbing the compiler of more important information than necessary around these operations that are so crucial to performance and are in fact pure, and that it would be better to give them some kind of special status in the rules. Am I wrong to be concerned?

[snip discussion of implementation complexity]

…At the source level, whenever you want to write a wrapper around some uniqueness-checking function, you'll still need to take the reference @inout for correctness--otherwise you've just hidden the fact that the reference can't share its representation. So all the complexity of making SIL more "uniqueness aware" would not really change anything.

I ask that we set implementation complexity aside for the moment because IMO we currently have a fairly weak foundation for interesting value semantics, and it seems as though incremental steps like building “uniqueness awareness” on that weak foundation could be far more complex than establishing a model that accounts for deeper semantic relationships.

I'm thinking “CoW and purity-awareness” is where we really want to be. For example, colleagues of mine were wondering why accessing a local let-bound Array's count in a closure results in the whole Array being captured. The fundamental reason, IIUC, is that the count is stored in the Array's buffer, which could be shared, and nothing tells the compiler that it acts just as though it were an independent value stored in the Array's shallow bits. Until those truths are encoded into SIL at a fundamental level, it seems inevitable that we'll have to chase all the “disparate, shady parts of the compiler that don't know about those special semantics.”
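
For concreteness, the capture situation looks something like this (the names are hypothetical):

func makeCounter() -> () -> Int {
  let measurements = Array(repeating: 0.0, count: 1_000_000)
  // The closure only needs an Int, but because `count` lives in the
  // array's CoW buffer, it captures `measurements` and keeps the whole
  // buffer alive.
  return { measurements.count }
}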

Of course, another valid approach to giving a value "apparent/potential mutation" is returning the new value, rather than mutating in-place: (newReference, isUnique) = isUnique(originalReference). That approach has the disadvantage of potential false negatives because the reference may be copied, but I don't think it changes anything for the purpose of this discussion.

Of course. And the potential for reference copying could be largely eliminated by annotations, but I agree it doesn't change anything. If the check could be written this way, where _isUnique was known by the compiler to be pure, it might change things a little (though I think it's awful):

let isUnique = withExtendedLifetime(oldReference) {
  let (newReference, r) = _isUnique(ObjectIdentifier(oldReference))
  oldReference = newReference
  return r
}

A completely different approach to this problem would be to give retains stronger semantics at the SIL level. Currently, the compiler only needs to guarantee that objects are alive up to their last use. We could instead say that retains cannot be eliminated merely because they redundantly retain the same reference. To eliminate any retain, the compiler would need to prove that no code can observe the reference count between that retain and some corresponding "paired" release.

[snip why it wouldn't work]

Yeah, that seems way too low-level again.

Warning: compiler optimization amateur hour starts now
What about this: if we encode into SIL the idea of (independent) values that include bits accessible through references (i.e. CoW types), then you can say that retains can be eliminated for lifetime-dominated copies of a value when the compiler can prove the copies aren't potentially mutated. Among other things, that could allow the knowledge of immutability of lets and function parameters to be leveraged at the SILGen level, before optimization even starts.


I want to clarify up front, I don't think I've ruled out the possibility of making the compiler more aware of CoW semantics. The only thing I wanted to absolutely rule out was exposing isDuallyReferenced as an independent API.

It's more that the new ARC work needs to be fully integrated. A whole lot of inlining and other optimizations need to take place before we see new ARC optimization handling the kind of complexity of the Array implementation... and those other optimizations currently destroy OSSA form.

I don't see how this theorem follows from the foregoing. It appears to be relying on the linguistic conflation of the bits of the reference with the value accessed through the reference to establish some kind of if and only if relationship. In fact, the two have an inverse relationship: if the contents of the buffer are going to be mutated, the bits of the reference won't, and vice-versa.

I'm mainly pointing out that checking uniqueness and buffer mutability always look the same at a regular call site.

It is true that apparent mutation of the reference bits does not need to imply mutation of the storage. I'm saying that implication is present without any special information at a call site.

The point that I keep trying to make unsuccessfully is that we can't transitively apply additional semantics upward when those semantics place additional requirements on the caller:

@_semantics("I don't really mutate any bits!")
func checkUniqueness(a: @inout CoWContainer) -> Bool

func abstraction(a: @inout CoWContainer) -> Bool {
  return checkUniqueness(&a)
}

At the point of calling abstraction, the compiler cannot see the implementation and cannot distinguish between apparent mutation of the reference bits vs. mutation of the storage.

And the inverse, pretending we had a "move":

@_semantics("I observe a reference count!")
func checkUniqueness(a: AnyObject) -> Bool 

func abstraction(a: AnyObject) -> Bool {
  return checkUniqueness(move(a))
}

At the point of calling abstraction, the compiler cannot see the implementation and cannot know that a's representation can't be shared with another reference.

Another reason I'm having trouble seeing the connection is it seems to conflate “apparent mutation” with mutability. If you ignore the supposedly-supporting statements T1 an A3, which are about apparent mutation, and take T3's use of “mutability” as correct, T3 makes total sense. I just don't see how it rests on T1/A3 in any way.

I hope I answered this above. I understand that your mutability has to do with whether we see a local variable or class property declared 'let'. Within the compiler's implementation of CoW, I think of "mutability" as the state of the CoW storage after uniqueness checking and buffer copying, which only happens during the process of mutation. But I'll try not to mix those up anymore.

This is a problem of communicating any magic up through the layers of the API. We sometimes deal with this by adding "@_semantic" tags to APIs. But...key point...even though we can imbue certain source constructs with extra semantics, all the disparate, shady parts of the compiler that don't know about those special semantics still need to function correctly. A bit like adding a new case to a library enum. So, adding a new ownership passing convention requires about the same level of implementation complexity regardless of how it's surfaced.

OK, so this is “merely” a matter of implementation complexity and technical debt, rather than fundamental.

Well, it's also a matter of ABI. And, for the record, I can't think of anything I would change about our argument conventions if we were starting over. I think what we're just dancing around here is that we need the type system to communicate CoW semantics to the compiler. If the compiler knows that the value being passed is or is not "CoWish" it could make extra assumptions about what happens on the other side.

I'm thinking “CoW and purity-awareness” is where we really want to be. For example, colleagues of mine were wondering why accessing a local let-bound Array's count in a closure results in the whole Array being captured. The fundamental reason, IIUC, is that the count is stored in the Array's buffer, which could be shared, and nothing tells the compiler that it acts just as though it were an independent value stored in the Array's shallow bits. Until those truths are encoded into SIL at a fundamental level, it seems inevitable that we'll have to chase all the “disparate, shady parts of the compiler that don't know about those special semantics.”

Interesting problem. It would be possible with the right builtins and optimizer magic to make this problem go away in some cases. But I typically fall on the side of designing for predictable behavior. Whenever CoW behavior is at stake, we should look for solutions that pop out of the type system, and don't involve whole program analysis. This is what @Ben_Cohen meant by a mandatory (-Onone) solution.

Of course. And the potential for reference copying could be largely eliminated by annotations, but I agree it doesn't change anything. If the check could be written this way, where _isUnique was known by the compiler to be pure, it might change things a little (though I think it's awful):

let isUnique = withExtendedLifetime(oldReference) {
let (newReference, r) = _isUnique(ObjectIdentifier(oldReference))
oldReference = newReference
return r
}

Once @Erik_Eckstein fixes this, the optimizer will convert checks to this form behind-the-scenes whenever possible, where the uniqueness check will look like a pure function, even to all the disparate shady compiler parts.

What about this: if we encode into SIL the idea of (independent) values that include bits accessible through references (i.e. CoW types), then you can say that retains can be eliminated for lifetime-dominated copies of a value when the compiler can prove the copies aren't potentially mutated. Among other things, that could allow the knowledge of immutability of lets and function parameters to be leveraged at the SILGen level, before optimization even starts.

I think we should surface more CoW information to enable additional "mandatory" optimizations. I only push back when it looks like we need to change semantics in situations where the compiler doesn't have that CoW information.

I can think of two approaches to add more CoW awareness in SIL.

#1 New compiler builtins within the CoW container implementations and @semantic attributes

Here I get hung up on the fact the argument conventions limit any semantics that we want to apply transitively across API levels.

When passing references, the type system and argument conventions alone don't convey what can happen to the referent... whether its refcount is observed or its storage mutated.

We have a similar problem with purity awareness. It would have been nice if some level of function purity was the default language semantic. Instead we rely very heavily on @inlinable attributes and whole-module analysis.

We can still do well with this approach at -O with careful attention to inlining. Using new builtins, @Erik_Eckstein is designing intelligence that allows the optimizer to know that CoW buffers are effectively immutable, when not being locally mutated. It should work as long as the CoW implementation is sufficiently inlined.

But... when CoW storage copies are at stake, we should design for predictability and guaranteed, -Onone behavior. This approach isn't good enough for mandatory optimization.

#2 Surface information about CoW data types in the type system

I do think surfacing type information is promising for mandatory optimization as long as it doesn't depend on changing something fundamental about values and argument conventions.

Whenever it has full type information (no properties hidden behind libraries) it would be possible for the compiler to understand CoW semantics at any level of the API. But, we have generic types and library types. So, I get hung up for the same reason.

In both approaches we need to accept that the compiler doesn't always know it's dealing with a CoW type.

We will need to define new argument passing conventions and special value semantics for move-only types. The move-only atomics case is similar in that we'll have something that appears immutable in its declaration, but is pinned to an address, and can apparently mutate at SIL level. But that all hinges on making any containing type conform to a special protocol... and that won't happen for CoW types.

So, I imagine there are lots of things we can improve with CoW semantics. All I know is that relying on a "dual-reference" check requires apparent mutation of the ArraySlice's reference, and that apparent mutation could probably be optimized away with compiler magic when we see there is no actual mutation.


Since you both begin and end with that point, I guess I need to clarify up front: I have absolutely no interest in exposing isDuallyReferenced as a public API, or even in using it internally. As near as I can tell, the whole system can be built using just the extra bit in the ABI. isDuallyReferenced just happens to be what makes things simple with the current state of ARC, and will be a suboptimal way to handle it when ARC gets smarter. Ideally, we use the extra bit to denote that we can't deallocate the buffer or write outside the slice even when it's uniquely referenced.

A whole lot of inlining and other optimizations need to take place before we see new ARC optimization handling the kind of complexity of the Array implementation... and those other optimizations currently destroy OSSA form.

I'll take your word for it for now. My (fallible) instinct tells me that if we make the information in SIL high-level enough, the only needless CoWs left to eliminate with inlining are the rare cases where the guaranteed calling convention needlessly holds a reference across the mutation of a copy of a function parameter.

I'm mainly pointing out that checking uniqueness and buffer mutability always look the same at a regular call site.

Sorry, I'm still having trouble pinning down what you mean here. There is no check for buffer mutability other than uniqueness, so of course checking for one looks like checking for the other. That seems too obvious to be what you are saying, so I must be missing something.

Of course it can't. The semantic annotation isn't present in the signature of abstraction, so the distinction is lost. The same goes for passing information downward in the call chain. It should be clear that I understand these points if you've been reading my arguments about why fixing this problem means encoding some information dynamically. Or are you making some other point that I'm missing?

Another reason I'm having trouble seeing the connection is it seems to conflate “apparent mutation” with mutability. If you ignore the supposedly-supporting statements T1 an A3, which are about apparent mutation, and take T3's use of “mutability” as correct, T3 makes total sense. I just don't see how it rests on T1/A3 in any way.

I hope I answered this above.

I'm afraid not, but maybe it's not important to see how you get to T3. If all it means is that “the buffer is eligible for mutation if and only if it is unique” then I'm happy to take that as an axiom.

I understand that your mutability has to do with whether we see a local variable or class property declared 'let'. Within the compiler's implementation of CoW, I think of "mutability" as the state of the CoW storage after uniqueness checking and buffer copying, which only happens during the process of mutation. But I'll try not to mix those up anymore.

It's not “my mutability,” it's the language's mutability. When I used the name “makeMutableAndUnique” in the standard library I was writing more casually and I had the benefit of comments to explain what I meant. For this discussion, we need clear terminology, and the language has already defined “mutable” so let's come up with words for what you mean. Best I can do is above, “eligible for mutation,” or maybe “dynamically mutable.” Any other ideas?

I think what we're just dancing around here is that we need the type system to communicate CoW semantics to the compiler.

Yes please; it's only what I've been asking for since 2013 :wink:.

Then we're in violent agreement; I've been saying we should represent value semantics in the type system for years. I know what Ben meant and I always agreed we should aim for that.

Once @Erik_Eckstein fixes this, the optimizer will convert checks to this form behind-the-scenes whenever possible, where the uniqueness check will look like a pure function, even to all the disparate shady compiler parts.

:+1:

I think we should surface more CoW information to enable additional "mandatory" optimizations. I only push back when it looks like we need to change semantics in situations where the compiler doesn't have that CoW information.

I'm afraid I don't know what you mean by “change semantics”

I can think of two approaches to add more CoW awareness in SIL.

#1 New compiler builtins within the CoW container implementations and @semantic attributes

To me, it seems obviously too low level again.

#2 Surface information about CoW data types in the type system

I do think surfacing type information is promising for mandatory optimization as long as it doesn't depend on changing something fundamental about values and argument conventions.

Whenever it has full type information (no properties hidden behind libraries) it would be possible for the compiler to understand CoW semantics at any level of the API. But, we have generic types and library types. So, I get hung up for the same reason.

Sorry to pick a nit, but it's not “CoW semantics,” it's “value semantics.” As @Joe_Groff has made clear, that's just an informal way to talk about purity. I hope we haven't locked ourselves into an ABI corner where we have to write off the ability to represent purity; if we have, I'll need to start thinking about the next version of Swift's ABI and how we're going to get there. But it seems to me that if we can retrofit move-only types, we ought to be able to retrofit purity.

In both approaches we need to accept that the compiler doesn't always know it's dealing with a CoW type.
We will need to define new argument passing conventions and special value semantics for move-only types. The move-only atomics case is similar in that we'll have something that appears immutable in its declaration, but is pinned to an address, and can apparently mutate at SIL level. But that all hinges on making any containing type conform to a special protocol... and that won't happen for CoW types.

…because?

If we're talking ABI changes, one thing that would be of general benefit for value types would be a "copy on escape" calling convention, where functions that receive value-type arguments would presume them to be allocated on the stack and be required to implicitly copy them to the heap if they want to extend the lifetime of the value for any reason (which would be the equivalent of a retain for a buffer that is already on the heap), similar to how blocks in Objective-C must be _Block_copy-ed if they are escaped. This would allow escape analysis and stack promotion to be much more aggressive. However, it might also be something we could apply to the in-place-slicing problem—if we treat the slice value yielded to a mutation as being copy-on-escape, and we consider operations that would force a realloc of the buffer to be escapes, then that seems like it would allow the original slice value to be a plain in-place borrow of the original buffer, since any attempt to escape or grow the slice would trigger a copy as well.


How is this different from what the change from owned to guaranteed calling convention accomplished?

The owned-to-guaranteed convention only affects the inline storage of the type. If we pass an Array value as a guaranteed argument, then the callee has to copy the reference to escape it, but it doesn't currently need to also trigger a copy of the buffer the reference points at. The parallel you raise is insightful, though; copy-on-escape would be in a sense extending the +0/+1 ownership convention from the inline storage transitively to its COWed buffers.


I guess I'm still confused. If the value escapes, it gets copied. If it contains a reference, the copy implies a retain. Then the buffer isn't uniquely-referenced anymore and can't be deallocated or written on. Why would we resort to eagerly copying the contents of the buffer?


In the case of non-mutating arguments, it's a matter of optimization tradeoffs. It's likely that most value-semantic functions don't escape their arguments in the common case, so it would be nice to be able to stack-allocate buffers in more cases. Forcing the few functions that do make escaping copies of their arguments to copy the buffer would allow other code to perform better. I haven't thought as much about the tradeoffs when it comes to in-place mutating slices, but it's interesting that a similar technique might be able to make in-place slicing work without changing the surface language model.


Okay, I do see a distant relationship to the topic at hand, but it mostly seems entirely different to me. For one thing, it's not related to mutation.

[snip a bunch of my crazy but irrelevant ideas about how this could be wedged into the ABI]

Agreed, it's interesting.

Let me ask you, though: to get this to be reliable across visibility boundaries, wouldn't it require encoding "this parameter escapes" pervasively in function/method signatures?

My goal for this discussion was to make the compile-time model for reference counting clear enough that we could reach an understanding about why isDuallyReferenced is only very optimistically usable by a reference count implementation expert, and in your case, why the _blackHole fix is needed for correctness.

I think we're past that, but now side-tracked by my claim that uniqueness checking requires apparent mutation of the reference bits--at least if you travel far enough up the API where arbitrary compiler magic within the standard library no longer reaches. I'm standing by that claim, short of either

  • changing the compile-time model of reference counting in a way that's sure to inhibit optimization
  • or making it impossible to wrap a CoW type within an arbitrary struct without forcing it to conform to some CoWContainer protocol. That's a language change I'm not ready to debate.

Our energy is now better spent on how to generalize a solution to the in-place slice mutation problem for any Collection. In my mind, the approaches for in-place slice mutation fall into one of these strategies...

  1. Statically enforced in-place-only mutation via a new slice type

Such a slice cannot be reassigned, is not RangeReplaceable, and cannot even be copied (normally) without copying the buffer storage.

This would (I think) be sufficient for typical Divide-And-Conquer algorithms. It would cause some confusion when we talk about mutability of a CoW value. The value's reference bits would be immutable, but its storage would be mutable.

I imagine it would be impossible to integrate with existing collections, and I suspect you aren't interested in something like this. I don't really want to discuss it further. I only bring this approach up to clearly distinguish it from the other approaches...

  2. Copy-on-escape values

This requires somehow surfacing a distinction, as part of a variable's attribute, between a "locally pinned" CoW object vs. a "persistently addressable" CoW object.

A new argument convention would allow passing locally pinned objects, presumably using a bitwise copy that doesn't retain. Copying a locally pinned object would eagerly copy the buffer storage into persistently addressable memory.

This sounds exciting for optimizing local allocation. The design and implementation complexity is unknown to me. I'm concerned that it falls apart in practice because it requires a viral rewrite of function signatures.

  3. Dynamically retained in-place slice mutation

This is what some people are now calling "@inout properties". I think it is the generalized form of what you implemented.

As you have correctly argued, we cannot support in place slice mutation and allow the slice's storage to be reallocated/reassigned without some dynamic bit controlling a potential extra release when mutation ends.

I think of this dynamic bit as indicating that a property is in some +0 state. You could say that the property is a "borrowed reference".

If we had such a +0 bit, then all we need are two explicit compiler or stdlib builtins: one that performs the +0 borrow and one that performs a +0 check. And we need type system and runtime support for recognizing +0 values when they are copied or destroyed.

Naturally, copying the borrowed reference needs to produce a new reference without that bit set, and destroying the borrowed reference would not perform a release.

Everything else is handled by existing uniqueness checks. If the buffer storage is replaced, then the +0 bit will naturally no longer be set, and the original storage will be released at the end of the mutation scope after an explicit check for the +0 bit.
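
A hedged model of that bookkeeping, written with `Unmanaged` so the retains and releases are explicit; `Buffer`, `BufferRef`, `borrow`, `copy`, and `destroy` are invented names standing in for the builtins and runtime hooks, not an actual implementation:

    final class Buffer {}

    struct BufferRef {
        var ref: Unmanaged<Buffer>
        var isBorrowed: Bool        // the dynamic "+0 bit", stored alongside the reference

        // The +0 borrow: a bitwise copy of the base's reference. No retain
        // is performed, so the base's uniqueness is preserved.
        static func borrow(_ buffer: Buffer) -> BufferRef {
            BufferRef(ref: .passUnretained(buffer), isBorrowed: true)
        }

        // Copying a borrowed reference yields an ordinary +1 reference with
        // the bit cleared, so escaped copies behave like normal values.
        func copy() -> BufferRef {
            BufferRef(ref: ref.retain(), isBorrowed: false)
        }

        // Destroying a borrowed reference is a no-op; only owned (+1)
        // references perform a release.
        func destroy() {
            if !isBorrowed { ref.release() }
        }
    }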


I think others will be better able to discuss the design in more detail, how to surface these things in the language, and how to handle ABI breaks.


Yes

No, we're not side-tracked by that. Many messages ago I wrote that I accept that's how the system works right now, and I understood at that point why it works that way. It's a natural artifact of the system having been focused on dealing with the issues of ordinary reference-counted class instances rather than the special properties of mutable slicing. And I'm not criticizing that, either, in case it's not obvious. I did say it forms a pretty weak foundation for getting the semantics we want from CoW types, and I still believe that.

That's fine; I think debating it is out of scope here. To be clear for when we come back to this area, I'm not interested in the first bullet. I am interested in a variant of the second one, where I think a workable model is based on propagating “value semantics” (actually purity) rather than “CoW,” and doing so implicitly for structs except where operations are explicitly annotated as impure.

Oh, but I am very interested in something like this. When considering this problem I was lamenting that we had made slices range replaceable until I realized there was no way to suppress assignment. I didn't start here because it's a source-breaking change and because it requires language features (non-assignability) that we don't currently have, so I don't view it as something that can be done in the short term. Are you sure you don't want to discuss it?

Specifically, what I think would be interesting is that lvalues produced via the range subscript are just such non-assignable, non-resizable instances (hah, that's ironic; a non-assignable lvalue!), and that you get something like what Slice is today otherwise. I think it's important that we can do things like make arrays of slices and I suspect non-assignability would make that impossible.
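
A hedged illustration of how that might read at the use site (the commented-out lines compile today, but would be rejected under this hypothetical model):

    var a = Array(0..<10)
    a[3...7].sort()                      // in-place mutation through the range subscript: still allowed
    // a[3...7] = ArraySlice([0, 0, 0])  // reassigning the slice lvalue: rejected
    // a[3...7].removeLast()             // resizing the slice: rejected
    let escaped = a[3...7]               // rvalue use would still yield something like today's ArraySlice
    print(escaped)                       // [3, 4, 5, 6, 7]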

Unless you do it dynamically, the cost of which I think could hide in the noise. I started to write this in my response to Joe, but then deleted it:

• Reserve a magic refcount value to mean “I'm allocated on the stack and you can find a pointer to code for moving me to the heap just before me on the stack.”
• Make retains of escaping values recognize the magic refcount and do the copy.
• Obviously requires an ABI change to write back the copied reference.
• Does this extend beyond CoW buffers to regular class instances in the presence of multithreading? I'm not sure.
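
To make the bullets above concrete, a purely illustrative model of that retain path (every name here is invented; the real mechanism would live in the runtime, and the write-back of `ref` is the ABI change mentioned above):

    let stackSentinel = -1   // hypothetical reserved refcount value

    struct ObjectHeader {
        var refCount: Int
        // "code for moving me to the heap just before me on the stack"
        var moveToHeap: () -> UnsafeMutableRawPointer
    }

    // Retain on an escaping value: if the sentinel is found, copy the
    // object to the heap and write the new reference back to the caller.
    func retainOnEscape(_ ref: inout UnsafeMutableRawPointer,
                        _ header: inout ObjectHeader) {
        if header.refCount == stackSentinel {
            ref = header.moveToHeap()
        } else {
            header.refCount += 1
        }
    }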

Glad someone has finally acknowledged that! It's hard to tell if some people are still clinging to the idea that the impossible can be achieved purely through language features. As long as that's the case, there's no hope for a solution.

That said, I don't see what a language feature can add to the story here, or how any language feature produces a generalization of what I implemented.

I'm pretty sure what you're outlining here is equivalent to what I've been saying will work provided the compiler “does its job,” with the advantage of guaranteeing that the job gets done :grinning:.

IIUC,

• +0 borrow is used to create a +0 copy of the reference for the slice. It sets the bit but doesn't claim a reference, preserving uniqueness.
• Releasing a reference with a refcount of 1 but the +0 bit set is a no-op, preventing slice assignment from deallocating the buffer.

But some things are unclear still:

Wait, are you suggesting the bit is part of the reference representation (e.g. the low address bit) and not stored in the memory it points to (e.g. with the refcount)? Or is it the other way around, and you actually meant “copying the buffer pointed to by the borrowed reference” (in which case it makes sense that the two references see a different +0 bit)? I suspect the former, since you didn't mention the need for an "unborrow" primitive, but then your “explicit check for the +0 bit” mentioned below seems like it would only be needed if it was the latter.

If you're talking about storing the bit in the reference itself, I'm not 100% sure about the tradeoffs, since you have to strip the bit to use or copy the pointer, but it does potentially save an atomic operation for unborrow.

I still think this code could be a problem if it's been compiled into clients.


I think a workable model is based on propagating “value semantics” (actually purity) rather than “CoW,” and doing so implicitly for structs except where operations are explicitly annotated as impure.

I'm on board with purity as an opt-in attribute for both functions and types (too late for opt-out!). I expect pure types to be contained within impure types, though. So, when the compiler sees a pure type, it can make additional assumptions, but it can't assume values of that type will always be handled differently from normal values--there will be times the compiler can't see that a value is pure based on its static type.

Specifically, what I think would be interesting is that lvalues produced via the range subscript are just such non-assignable, non-resizable instances (hah, that's ironic; a non-assignable lvalue!), and that you get something like what Slice is today otherwise. I think it's important that we can do things like make arrays of slices and I suspect non-assignability would make that impossible.

Oh, it is interesting that we have the same view on this. Swift will probably need to support an immutable (non-reassignable) pass-by-address argument convention for other situations where copies must be avoided. I've heard people refer to this as "@inout lets". So I don't think reassignability is a showstopper. I think it's just design work that we haven't had time for and largely overlaps with move-only types.

I don't know the feasibility of redesigning the Collection APIs around some new non-RangeReplaceable slice.

Wait, are you suggesting the bit is part of the reference representation (e.g. the low address bit) and not stored in the memory it points to (e.g. with the refcount)? Or is it the other way around, and you actually meant “copying the buffer pointed to by the borrowed reference” (in which case it makes sense that the two references see a different +0 bit)? I suspect the former, since you didn't mention the need for an "unborrow" primitive, but then your “explicit check for the +0 bit” mentioned below seems like it would only be needed if it was the latter.

I'm talking about storing a bit alongside the reference, just as you have done, but potentially using the spare bits via an enum.

The "runtime support" is a custom value-witness for copy and destroy. The copy resets the borrowed bit in the copied value. The destroy does not release the borrowed reference.

The +0 check is only needed at the end of Array.subscript._modify to release the original storage in the event that the borrowed reference was replaced. But that's probably equivalent to checking the identity of the reference, so it's nothing new.