index(_:offsetBy:limitedBy:)
Returns an index that is the specified distance from the given index, unless that distance is beyond a given limiting index.
func index( _ i: Self.Index, offsetBy distance: Int, limitedBy limit: Self.Index ) -> Self.Index?
Parameters
i
A valid index of the collection.
distance
The distance to offseti
.distance
must not be negative unless the collection conforms to theBidirectionalCollection
protocol.
limit
A valid index of the collection to use as a limit. Ifdistance > 0
, a limit that is less thani
has no effect. Likewise, ifdistance < 0
, a limit that is greater thani
has no effect.
Note the specification of limit
, and that it "has no effect" if it does not match the starting index and direction of offsetting. I would argue that this is not useful behaviour, and I've found that it can require a suboptimal implementation, and would like to suggest that we change it to either:
a) Always return nil
(since, technically, all results would be beyond the given limit).
b) Not specify the result in this case.
Concrete Example
I've been experimenting with a SIMD variant of Collection's contains
function:
extension Collection where Element == UInt8 {
@inlinable
internal func contains_simd(_ element: UInt8) -> Bool {
typealias Vector = SIMD8<Element>
var i = startIndex
while let j = index(i, offsetBy: Vector.scalarCount, limitedBy: endIndex) {
if any(Vector(self[i..<j]) .== Vector(repeating: element)) {
return true
}
i = j
}
while i < endIndex {
if self[i] == element { return true }
formIndex(after: &i)
}
return false
}
}
And how well it works when applied to my custom UnsafeBoundsCheckedBufferPointer
type.
In the first instance, I implement index(_:offsetBy:limtedBy:)
like this:
@inlinable
internal func index(_ i: UInt, offsetBy distance: Int, limitedBy limit: UInt) -> UInt? {
if distance >= 0 {
// All valid 'i' are <= Int.max, so this will not overflow.
// An invalid 'i' is allowed to return a nonsense result.
let result = i &+ UInt(distance)
return result <= limit ? result : nil
} else {
// All valid 'i' are >= 0 and <= Int.max, so this will not underflow.
let result = Int(bitPattern: i) &+ distance
return result >= limit ? UInt(bitPattern: result) : nil
}
}
This generates really nice assembly, which loads a chunk of bytes directly in to a SIMD register with a single bounds-check (and even that I think could probably be hoisted if #71919 were fixed):
.LBB2_9:
add r8, 8
jb .LBB2_13 ; bounds check
movq xmm1, qword ptr [rsi + rdx] ; single load in to SIMD reg
pcmpeqb xmm1, xmm0
pmovmskb eax, xmm1
Behaviour-wise, this is option (a):
// Incrementing, but limit < initial index.
buffer.index(2, offsetBy: 1, limitedBy: 1) // nil
buffer.index(2, offsetBy: 2, limitedBy: 1) // nil
buffer.index(2, offsetBy: 3, limitedBy: 1) // nil
// Subtracting, but limit > initial index.
buffer.index(2, offsetBy: -3, limitedBy: 3) // nil
buffer.index(2, offsetBy: -2, limitedBy: 3) // nil
buffer.index(2, offsetBy: -1, limitedBy: 3) // nil
And I think this is quite reasonable. If you're incrementing, you're saying you don't want the result to be greater than the limit. If the limit had no effect, the results (3, 4, 5) would indeed be greater than the limit (1), and so it is preferable to return nil
IMO.
If I were to implement the behaviour precisely as specified and cause invalid limits to have no effect, I'd need to make the following changes:
@inlinable
internal func index(_ i: UInt, offsetBy distance: Int, limitedBy limit: UInt) -> UInt? {
if distance >= 0 {
// All valid 'i' are <= Int.max, so this will not overflow.
// An invalid 'i' is allowed to return a nonsense result.
let result = i &+ UInt(distance)
+ guard limit >= i else { return result }
return result <= limit ? result : nil
} else {
// All valid 'i' are >= 0 and <= Int.max, so this will not underflow.
let result = Int(bitPattern: i) &+ distance
+ guard limit <= i else { return UInt(bitPattern: result) }
return result >= limit ? UInt(bitPattern: result) : nil
}
}
The result is much worse assembly. Because it is not known whether the limit actually had an effect, the compiler needs to bounds-check every position before loading data in to a SIMD register:
.LBB2_9:
cmp r8, -9
ja .LBB2_21 ; bounds check
cmp rax, rcx
ja .LBB2_22 ; bounds check
lea r8, [rdx - 3]
cmp r8, rax
jae .LBB2_23 ; bounds check
lea r8, [rdx - 2]
cmp r8, rax
jae .LBB2_23 ; bounds check
lea r8, [rdx - 1]
cmp r8, rax
jae .LBB2_23 ; bounds check
cmp rdx, rax
jae .LBB2_23 ; bounds check
lea r8, [rdx + 1]
cmp r8, rax
jae .LBB2_23 ; bounds check
lea r8, [rdx + 2]
cmp r8, rax
jae .LBB2_23 ; bounds check
lea r8, [rdx + 3]
cmp r8, rax
jae .LBB2_23 ; bounds check
movq xmm1, qword ptr [rsi + rdx - 4] ; load in to SIMD register
pcmpeqb xmm1, xmm0
pmovmskb eax, xmm1
Moreover, I would argue this behaviour is not useful. Here's what the difference means in practice:
// Incrementing, but limit < initial index.
buffer.index(2, offsetBy: 1, limitedBy: 1) // 3
buffer.index(2, offsetBy: 2, limitedBy: 1) // 4
buffer.index(2, offsetBy: 3, limitedBy: 1) // 5
// Subtracting, but limit > initial index.
buffer.index(2, offsetBy: -3, limitedBy: 3) // 18446744073709551615
buffer.index(2, offsetBy: -2, limitedBy: 3) // 0
buffer.index(2, offsetBy: -1, limitedBy: 3) // 1
The limit parameter is basically useless (it "has no effect"), allowing us to even get junk values which do not correspond to a valid position in the collection. Now, this is still safe because actual reads from memory will bounds-check this, but the previous implementation would have prevented the junk index in the first place.
So I'd prefer to return nil
in this case, or at least for the standard library to leave me enough room that I can choose to return nil
in the case where the combination of (i, distance, limit)
is not valid.
Previous Discussions
This has come up before in the distant past:
I don't think this necessarily needs to trap (although implementations may do that if they want to). This function already returns an optional, and the condition where the start and end point are in the wrong order is compatible with the existing conditions where the function returns nil
.