Thanks for all this investigation!
Like most underscored stuff, it's a bit of a mess, and it isn't fit for general consumption. I believe looking through the Swift mailing list archives for discussions about bounds checks around the time the new indexing model was introduced might uncover some of the original thinking behind it.
My vaguely educated guess is that _failEarlyRangeCheck (or rather, its original ForwardIndex-based incarnation) was originally introduced to work around the inability to directly compare indices, and it has since been repositioned as a minor performance hack.
FWIW though, I don't think the original intentions are worth much -- what matters is what the code does in its current, ABI stable, incarnation.
I believe this part of the doc comment originates from a time when Index wasn't Comparable -- it's out of date and it's misleading, and so it should be removed. AFAICT, the default implementation now performs a check because we can now compare indices even in forward-only collections, and we evidently assume that Index.< will be O(1).
(I think this assumption is reasonable, even though Comparable doesn't come with any performance requirements, and IIRC Collection doesn't say anything about the expected performance of index comparisons, either.)
I deeply, deeply mistrust the term "QoI", but if I understand it correctly, the doc comment doesn't mean to use it in a disparaging sense. Labeling index validation as a quality-of-implementation issue (as opposed to a memory safety concern) doesn't mean that it's fine to not do it.
While making _failEarlyRangeCheck a noop must never lead to undefined behavior through an out-of-bounds memory access, custom implementations are allowed to sometimes produce incorrect but still well-defined results when an invalid index is passed to a Collection operation that calls _fERC.
That public collection operations perform O(1) index validation in release builds (whether or not through _fERC) is, in general, a feature, not a bug; these checks must not be disabled. I strongly prefer Swift functions to always return correct values, and to trap when that's not possible.
(I'd personally be okay with Collection providing a separate unchecked subscript, though, in cases where we (think) we can guarantee that the index (or index range) is valid. (One example would be the subscript invocation in IndexingIterator.next(), at least in cases where the base collection implements value semantics.) The point is that the C++ STL got this exactly backwards -- the most convenient, default way to spell access must be the safest, fully validated variant.)
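To make the idea concrete, here's a minimal sketch of what such an opt-in spelling could look like. The `unchecked:` label and the extension are entirely hypothetical, not stdlib API; and note that a user-level extension like this can't actually elide the release-mode check, since it still goes through the regular subscript:

```swift
// Hypothetical sketch -- `subscript(unchecked:)` is not a stdlib API.
extension Collection {
    subscript(unchecked index: Index) -> Element {
        // Debug-only sanity check; a real unchecked accessor would elide
        // all validation in release builds.
        assert(index >= startIndex && index < endIndex, "Index out of bounds")
        return self[index] // still dispatches to the checked subscript here
    }
}

let numbers = [10, 20, 30]
print(numbers[unchecked: 1]) // 20
```

The point stands regardless of the exact spelling: the short, default form stays fully validated, and the unchecked variant is the one that requires extra typing.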
This is crucial for correctness (although not for memory safety). In general, we do not want slices to allow access outside of their bounds:
```swift
let slice = foo[3 ..< 10]
print(slice[0]) // Should trap
```
I do think that these checks are necessary, and omitting them would be bad engineering. Saying that these are merely a quality of implementation concern (as opposed to a memory safety issue) does not mean that we are allowed to ignore them. The stdlib is supposed to be a well-engineered, quality piece of software -- partial functions like the indexing subscript generally mustn't return nonsensical results on unexpected input.
Slice operations perform bounds checking via the _fERC hooks in order to allow Collection authors to have some control over how these checks are done without having to implement a custom SubSequence type. (Personally, I do not think this is a particularly great idea, and these hooks should rarely if ever be exercised. But again, opinions aren't worth much; what matters is what we've committed to the ABI, which is that Slice performs bounds checking through _failEarlyRangeCheck.)
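For concreteness, here is a hedged sketch of what exercising these hooks looks like from a Collection author's perspective. The type and its storage are made up; also, the stdlib's own customizations (e.g. UnsafeBufferPointer's) use the internal _debugPrecondition rather than assert:

```swift
// A made-up array-backed collection that downgrades Slice's bounds checks
// to debug-only assertions by implementing the underscored hooks.
struct Buffer: RandomAccessCollection {
    var storage: [Int]
    var startIndex: Int { storage.startIndex }
    var endIndex: Int { storage.endIndex }
    subscript(position: Int) -> Int { storage[position] }

    // Called by Slice (and some default implementations) to validate indices,
    // instead of Buffer having to define a custom SubSequence type.
    func _failEarlyRangeCheck(_ index: Int, bounds: Range<Int>) {
        assert(bounds.contains(index), "Index out of bounds")
    }
    func _failEarlyRangeCheck(_ range: Range<Int>, bounds: Range<Int>) {
        assert(range.lowerBound >= bounds.lowerBound
            && range.upperBound <= bounds.upperBound, "Range out of bounds")
    }
}

let buffer = Buffer(storage: [1, 2, 3, 4, 5])
let slice = buffer[1 ..< 4] // bounds-checked through the hooks above
print(Array(slice)) // [2, 3, 4]
```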
We have previously established (e.g., in PR #34961) that we don't need a runtime check to verify that Range values are valid (i.e., lowerBound <= upperBound), and unless I'm badly mistaken, Slice can only be initialized via its init(base:bounds:) initializer, so this relation also holds for its start/endIndex.
Therefore, I believe it is safe to assume that startIndex <= endIndex for every Slice, which means that it would be okay to replace all _failEarlyRangeCheck(foo, bounds: startIndex ..< endIndex) invocations in Slice with _failEarlyRangeCheck(foo, bounds: Range(_uncheckedBounds: (startIndex, endIndex))). (I think we don't even need debug checks here, so the underscored variant would be fine.)
Note that removing the precondition isn't necessarily going to be a clear win. The compiler doesn't understand that a range's lower bound is guaranteed to never exceed the upper bound, so I expect it may sometimes generate extra code to unnecessarily cover that case in downstream code.
But still, a PR that implements this would be welcome -- especially if you can add a new benchmark that highlights the beneficial effect of removing the precondition. (So as to justify accepting the inevitable regressions, and to have a chance to catch future ones.)
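To spell out the difference between the three spellings involved: `..<` validates its operands and traps on reversed bounds, `init(uncheckedBounds:)` checks only in debug builds, and the underscored `init(_uncheckedBounds:)` (which the replacement above would use) skips even the debug check. A quick demonstration of the two public pieces:

```swift
// `..<` validates its operands: `10 ..< 3` would trap.
let checked = 3 ..< 10

// init(uncheckedBounds:) skips the release-mode precondition; the caller
// promises lower <= upper. (Passing reversed bounds here would be a bug.)
let unchecked = Range(uncheckedBounds: (lower: 3, upper: 10))

print(checked == unchecked) // true
```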
I believe this is to allow custom array-like collections to get the same deferred index validation behavior as Array without having to manually implement these. In my opinion, this is largely pointless -- manually implementing index(after:) etc. seems vastly preferable to implementing an underscored requirement. On the other hand, if I already need to implement _fERC to customize Slice behavior, this seems like a reasonable (if a little dubious) freebie.
These implementations already use the init(uncheckedBounds:) initializers, so they only perform checks in debug builds (and, sadly, in opaque parts of the stdlib). Unfortunately, I think the debug mode checks are somewhat load-bearing -- they help catch invalid Collection implementations that violate the startIndex <= endIndex requirement. So replacing these with init(_uncheckedBounds:) calls seems unwise. (It's not entirely impossible though -- the benefits of these debug checks are marginal enough that we could consider removing them if someone could demonstrate a clear performance advantage that couldn't be achieved in some other way. But I don't think debug-only checks are worth spending much time on!)
That this default implementation exists is horrifying, because unlike Collection's variant, this implementation isn't in a conditional extension -- every MutableCollection gets it, no matter if their SubSequence is Slice or not. @glessard is investigating the practical effects of this; it may well require some sort of fix. Thanks for pointing it out!
That MutableCollection's default bounds subscript uses _fERC doesn't seem like a problem to me -- the range we pass to this method needs to be validated, and like with the default RAC index(after:), this may just as well be done through these hooks. However, it would be better if the implementation was using Range.init(uncheckedBounds:) to construct the range, not ..<. (We do need the debug checks for the same reason as above.)
As I mentioned above, Collection has a similar default implementation where SubSequence == Slice<Self>. That too uses _fERC, as expected. It also uses ..<, which can similarly be replaced with the uncheckedBounds: initializer.
(That the default bounds subscript uses _fERC means that if we implement it as a noop in a custom collection that uses Slice, then that collection will allow slices to be created with out of bounds indices -- unless we also implement a custom slicing subscript. However, Slice always accesses elements through its base, so this will not lead to memory safety issues, as long as the base collection properly validates its indices. Note that I'm conveniently ignoring withContiguousStorageIfAvailable here -- although that family of methods raises some complications we should probably address at some point.)
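To make that failure mode concrete, here's a contrived collection (the names are made up) with noop hooks; the default slicing subscript then happily produces an out-of-bounds slice, and it's only the base collection's own element subscript that stands between us and memory unsafety:

```swift
// A contrived collection whose noop hooks disable Slice's bounds checks.
struct Unchecked: RandomAccessCollection {
    var storage: [Int]
    var startIndex: Int { storage.startIndex }
    var endIndex: Int { storage.endIndex }
    subscript(position: Int) -> Int { storage[position] }

    func _failEarlyRangeCheck(_ index: Int, bounds: Range<Int>) {}
    func _failEarlyRangeCheck(_ range: Range<Int>, bounds: Range<Int>) {}
}

let c = Unchecked(storage: [1, 2, 3])
let s = c[0 ..< 100] // no trap: the hooks are noops
print(s.endIndex)    // 100, even though c only has 3 elements
// Accessing s[50] would still trap, though -- Slice reads elements through
// its base, and Array's subscript validates the index.
```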
This feels like a cosmetic oversight that we should probably fix. Either Range.index(before:) and Range.index(_:offsetBy:) should also be calling _fERC, or index(after:) should be using direct _preconditions. AFAICT, neither of these options would have any functional impact in optimized code, as Range's _fERC implementation is effectively the same as the standalone preconditions. (In debug builds, the fERC variant does come with a better error message, which is a nice bonus!)
Allowing a select few collection types to customize it is the entire point of having _fERC, I believe.
The Array family does it for no immediately obvious technical reason -- as far as I can tell, for them fERC is just a public entry point; they themselves never call it. I think the same goes for AnyCollection/AnyBidirectionalCollection/AnyRandomAccessCollection -- they are self-sliced and AFAIK they forward all other actual requirements to the underlying base collection.
I'm guessing(!) that the fERC implementations in these types are there in case some collection algorithm decided to call fERC to do its own index validation. (This seems unlikely to ever be necessary, to be honest. A full precondition would work better in all but the most trivial cases.)
The type where fERC really shines is Unsafe[Mutable]BufferPointer, which defines these to use debug assertions. (For the record, I don't like this; but the ABI doesn't at all care about my opinions.)
One important additional type that could benefit from having _fERC implementations is Unsafe[Mutable]RawBufferPointer -- there is a todo item in the code somewhere to add these, and we can do it at any time; all that needs to be done is for someone to submit a PR with the implementation (+ tests). Following the precedent established with Unsafe[Mutable]BufferPointer, the custom implementations should perform debug assertions.
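A sketch of what the core of that PR could look like, using assert as a stand-in for the stdlib-internal _debugPrecondition that the actual implementation would use -- consider this an illustration of the shape, not the eventual stdlib code:

```swift
// Illustrative sketch only: debug-only _fERC implementations for
// UnsafeRawBufferPointer, mirroring the UnsafeBufferPointer precedent.
// The real stdlib code would use _debugPrecondition instead of assert.
extension UnsafeRawBufferPointer {
    public func _failEarlyRangeCheck(_ index: Int, bounds: Range<Int>) {
        assert(index >= bounds.lowerBound, "Index out of bounds")
        assert(index < bounds.upperBound, "Index out of bounds")
    }

    public func _failEarlyRangeCheck(_ range: Range<Int>, bounds: Range<Int>) {
        assert(range.lowerBound >= bounds.lowerBound, "Range out of bounds")
        assert(range.upperBound <= bounds.upperBound, "Range out of bounds")
    }
}

let bytes: [UInt8] = [1, 2, 3, 4]
bytes.withUnsafeBytes { raw in
    // In-bounds checks pass silently (and compile away in release builds).
    raw._failEarlyRangeCheck(2, bounds: raw.startIndex ..< raw.endIndex)
}
```

(A mirror-image extension on UnsafeMutableRawBufferPointer would complete the pair.)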
What exactly do you mean? Can you give an example? (I can forward member implementations to a base collection to wrap its behavior just fine, with or without implementing _fERC.)
I think I'd personally be fine with replacing some or all of the calls to the _fERC methods with direct, unconditional preconditions. However, eliminating these checks altogether, or indiscriminately downgrading them to debug-only checks seems like a bad idea to me.
The _fERC methods are performance hacks that allow customizing the behavior of some default collection implementations. It is not at all crucial for Collection authors to implement them, or even know about them; but it's sort of nice that special types like UnsafeBufferPointer are able to customize these to trade a bit of correctness for a bit of performance, without having to swallow the API surface bloat of implementing a custom SubSequence.
I don't think it would be worth pushing _fERC through Swift Evolution though -- I doubt we'd want to generally encourage people to eliminate index validation. I really, really hope people aren't trying to implement their own UnsafeBufferPointer; but if for some reason they do, I expect having to learn about underscored and underdocumented methods isn't going to be the most difficult problem.