Index Invalidation in subscript

On this forum we have previously discussed the rules around index invalidation: see NSString Indices invalidated potentially after it has been bridged to String and manipulated for at least one example.

I want to ask a clarifying question: is it acceptable to invalidate indices when assigning to a MutableCollection subscript?

As a hypothetical, if I was rebuilding Array, would it be acceptable for me to use the following implementation?

extension MyCollectionStorage {
    private var backingStore: UnsafeMutableRawPointer

    struct Index {
        fileprivate var backingPointer: UnsafeMutableRawPointer
    }

    subscript(_ index: Index) -> UInt8 {
        get {
            precondition(self.isValidIndex(index))  // Pointer is in bounds
            return index.backingPointer.load(as: UInt8.self)
        }
        set {
            precondition(self.isValidIndex(index))  // Pointer is in bounds
            // This isn't quite right, but I want it for the example
            if !isKnownUniquelyReferenced(self) {
                let oldBaseAddress = self.backingStore
                let pointerOffset = index.backingPointer - oldBaseAddress
                self.backingStore = self.copyBackingStore()  // Left to the reader
                let newPointer = self.backingStore + pointerOffset
                newPointer.store(newValue)
            }
            index.backingPointer.store(newValue)
        }
    }
}

The point of this example is to note that, in the !isKnownUniquelyReferenced case, we invalidate the indices. There is no data structure in the standard library so far as I know that does this today, but the MutableCollection documentation is silent on whether a MutableCollection is allowed to do this.

What's the guidance here? May this kind of index invalidation occur, or are indices required to be stable across subscript set?

2 Likes

I think (albeit hesitantly) that MutableCollection semantically requires indices to remain valid.

  1. Without it, swapAt(Index, Index) doesn’t even make sense. How would you perform two separate insertions without the first invalidating the second?

  2. The documentation seems to imply it under “Conforming to the MutableCollection Protocol”:

    A value stored into a subscript of a MutableCollection instance must subsequently be accessible at that same position. That is, for a mutable collection instance a, index i, and value x, the two sets of assignments in the following code sample must be equivalent:

    a[i] = x
    let y = a[i]
    
    // Must be equivalent to:
    a[i] = x
    let y = x
    

    Technically that only actually means that a mutation must not invalidate the same index that described the mutation, but in most real use cases that would be the very index that needs invalidation first (e.g. overwriting a base letter with a combining character in a String would break the above condition even if String indices were plain Integers).

  3. See the following tangent in another thread, interspersed with other stuff, but starting about here:

I agree it would be helpful if the documentation stated it plainly one way or the other.

4 Likes

This seems like the most compelling argument to me, but we should definitely have a better documentation story.

I agree the swapAt argument is compelling, however it is a customisation point so the underlying MutableCollection can make that work.

Furthermore, I think that for a self-slicing byte containers like Data using an opaque version of the underlying pointers as @lukasa outlined above would indeed be a very compelling design. It would probably be slightly more efficient and also would correctly prevent the accidental use of indices of one Data instance to index another. As @lukasa says that does (for CoWing value types ) however require that indices may be invalidated on every mutating operation.

In any case, this is something that crucially needs to be documented. I would also argue that we should have some sort of 'collection compatibility test' which verifies some basic assumptions about (Mutable)Collections.

(edit below)

Re-reading this, I think this is the documentation I was looking for and it seems to explicitly forbid using an opaque version of the underlying pointers as indices :frowning:. Because this

a[i] = x

could trigger a CoW which would then invalidate i for the value a so the following a[i] would no longer work.

CC @lorentey/@Ben_Cohen

2 Likes

+1

1 Like
Terms of Service

Privacy Policy

Cookie Policy