lukasa
(Cory Benfield)
1
On this forum we have previously discussed the rules around index invalidation: see NSString Indices invalidated potentially after it has been bridged to String and manipulated for at least one example.
I want to ask a clarifying question: is it acceptable to invalidate indices when assigning to a MutableCollection subscript?
As a hypothetical, if I was rebuilding Array, would it be acceptable for me to use the following implementation?
extension MyCollectionStorage {
private var backingStore: UnsafeMutableRawPointer
struct Index {
fileprivate var backingPointer: UnsafeMutableRawPointer
}
subscript(_ index: Index) -> UInt8 {
get {
precondition(self.isValidIndex(index)) // Pointer is in bounds
return index.backingPointer.load(as: UInt8.self)
}
set {
precondition(self.isValidIndex(index)) // Pointer is in bounds
// This isn't quite right, but I want it for the example
if !isKnownUniquelyReferenced(self) {
let oldBaseAddress = self.backingStore
let pointerOffset = index.backingPointer - oldBaseAddress
self.backingStore = self.copyBackingStore() // Left to the reader
let newPointer = self.backingStore + pointerOffset
newPointer.store(newValue)
}
index.backingPointer.store(newValue)
}
}
}
The point of this example is to note that, in the !isKnownUniquelyReferenced case, we invalidate the indices. There is no data structure in the standard library so far as I know that does this today, but the MutableCollection documentation is silent on whether a MutableCollection is allowed to do this.
What's the guidance here? May this kind of index invalidation occur, or are indices required to be stable across subscript set?
2 Likes
SDGGiesbrecht
(Jeremy David Giesbrecht)
2
I think (albeit hesitantly) that MutableCollection semantically requires indices to remain valid.
-
Without it, swapAt(Index, Index) doesn’t even make sense. How would you perform two separate insertions without the first invalidating the second?
-
The documentation seems to imply it under “Conforming to the MutableCollection Protocol”:
A value stored into a subscript of a MutableCollection instance must subsequently be accessible at that same position. That is, for a mutable collection instance a, index i, and value x, the two sets of assignments in the following code sample must be equivalent:
a[i] = x
let y = a[i]
// Must be equivalent to:
a[i] = x
let y = x
Technically that only actually means that a mutation must not invalidate the same index that described the mutation, but in most real use cases that would be the very index that needs invalidation first (e.g. overwriting a base letter with a combining character in a String would break the above condition even if String indices were plain Integers).
-
See the following tangent in another thread, interspersed with other stuff, but starting about here:
I agree it would be helpful if the documentation stated it plainly one way or the other.
4 Likes
lukasa
(Cory Benfield)
3
This seems like the most compelling argument to me, but we should definitely have a better documentation story.
I agree the swapAt argument is compelling, however it is a customisation point so the underlying MutableCollection can make that work.
Furthermore, I think that for a self-slicing byte containers like Data using an opaque version of the underlying pointers as @lukasa outlined above would indeed be a very compelling design. It would probably be slightly more efficient and also would correctly prevent the accidental use of indices of one Data instance to index another. As @lukasa says that does (for CoWing value types ) however require that indices may be invalidated on every mutating operation.
In any case, this is something that crucially needs to be documented. I would also argue that we should have some sort of 'collection compatibility test' which verifies some basic assumptions about (Mutable)Collections.
(edit below)
Re-reading this, I think this is the documentation I was looking for and it seems to explicitly forbid using an opaque version of the underlying pointers as indices
. Because this
a[i] = x
could trigger a CoW which would then invalidate i for the value a so the following a[i] would no longer work.
CC @lorentey/@Ben_Cohen
2 Likes