StringProtocol's views are not their own protocols, but constrained Bidi collections:
associatedtype UTF16View : BidirectionalCollection
where UTF16View.Element == UInt16, // Unicode.UTF16.CodeUnit
UTF16View.Index == Index
We could make these APIs generic with the same set of constraints. Alternatively, we could make concrete overloads for Substring.XView.
I view this SE as more of a solution for transitioning to Swift 5 than the final desired approach (more on that below), so I was erring on the side of being concrete and as close to the actual use as possible. But these alternatives are also totally reasonable. What do others think?
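For illustration, the generic alternative could look roughly like the sketch below. `utf16Distance(to:in:)` is a hypothetical helper, not an API from the proposal; it only shows that one set of constraints covers both `String.UTF16View` and `Substring.UTF16View`.

```swift
// Hypothetical helper: generic over any view that satisfies the same
// constraints StringProtocol places on its UTF16View.
func utf16Distance<V: BidirectionalCollection>(to index: String.Index, in view: V) -> Int
    where V.Element == UInt16, V.Index == String.Index
{
    // Number of UTF-16 code units between the view's start and `index`.
    return view.distance(from: view.startIndex, to: index)
}

let s = "a😀b"
let bIndex = s.firstIndex(of: "b")!

// String.UTF16View: "a" is 1 code unit, "😀" is 2.
let fromString = utf16Distance(to: bIndex, in: s.utf16)        // 3

// Substring.UTF16View: the substring starts at "😀".
let tail = s.dropFirst()
let fromSubstring = utf16Distance(to: bIndex, in: tail.utf16)  // 2
```

Concrete overloads for Substring.XView would avoid the generic signature at the cost of one extra declaration per view.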
If this is the worst fallout from a pretty fundamental shift, then I'm very happy.
encodedOffset will be an offset into whatever encoding the String happens to be encoded with, which is hidden from the developer. It cannot give a specific encoding's offset without access to the contents of the String.
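A small sketch of why the string has to be passed in, assuming the `utf16Offset(in:)` and `String.Index(utf16Offset:in:)` spellings from the proposal. The same index corresponds to different offsets depending on which encoding you measure in, so an offset is only meaningful once the encoding is named explicitly.

```swift
let s = "a😀b"
let bIndex = s.firstIndex(of: "b")!

// Offset measured explicitly in UTF-16 code units: "a" (1) + "😀" (2).
let utf16Offset = bIndex.utf16Offset(in: s)                             // 3

// The same position measured in UTF-8 code units: "a" (1) + "😀" (4).
let utf8Offset = s.utf8.distance(from: s.utf8.startIndex, to: bIndex)   // 5

// Round-trip: rebuild an index from the UTF-16 offset and the string.
let rebuilt = String.Index(utf16Offset: utf16Offset, in: s)
let character = s[rebuilt]                                              // "b"
```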
As a transitional solution (more on a potential final solution below), this is a blessing. String.Index is a relatively infrequent namespace that the current problematic code is using. By adding to that namespace, we give the most direct migration path to users without throwing in a lot of API bloat to a higher-visibility namespace.
If a better solution comes, we can choose to deprecate these and it might even be something simple enough that the migrator can update for the developer.
That's interesting. I was following precedent from SE-0180. Either is fine with me, whichever the community sees as most consistent. NSRange's initializer from Range<String.Index> uses in.
The goal is to provide solutions matching the semantics of the old code as closely as possible. Since those initializers were not failable, we chose to make these not fail either. Trapping would also be a semantic difference, as the old initializer wouldn't trap on construction, but only when you actually try to use the result.
If an out-of-bounds offset is given, we need to return some index that will trap on access, so we could choose between endIndex (the canonical out-of-bounds index) and some index beyond endIndex. I could see arguments for either approach, but an API providing beyond-end indices doesn't really have precedent in Swift AFAICT.
Old code would have produced an index that traps on access, as any occurrence of negative encoded offsets represents a more serious bug and we need to trap for safety. Returning startIndex would produce a valid non-trapping index, which is semantically different from current usage.
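To make the endIndex option concrete, here is a minimal sketch. `indexOrEnd(utf16Offset:in:)` is a hypothetical free function written only for illustration; it is not a claim about how the standard library implements the initializer. Both negative and too-large offsets map to endIndex, which is never valid to subscript with.

```swift
// Hypothetical: maps any out-of-bounds UTF-16 offset to endIndex instead of
// failing or trapping at construction time.
func indexOrEnd(utf16Offset offset: Int, in s: String) -> String.Index {
    let view = s.utf16
    guard offset >= 0,
          let idx = view.index(view.startIndex, offsetBy: offset, limitedBy: view.endIndex)
    else {
        // Out of bounds (negative or past the end): hand back endIndex.
        // Subscripting the string with it traps, matching "trap on use".
        return view.endIndex
    }
    return idx
}

let text = "abc"
let inBounds = indexOrEnd(utf16Offset: 1, in: text)    // index of "b"
let tooLarge = indexOrEnd(utf16Offset: 100, in: text)  // text.endIndex
let negative = indexOrEnd(utf16Offset: -1, in: text)   // text.endIndex
```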
I think this is what the final solution may look like:
extension [Bidirectional/Ordered?]Collection {
func index(atOffset offset: Int) -> Index? {
if offset < 0 {
// Non-Bidirectional traps instead
return index(endIndex, offsetBy: offset, limitedBy: startIndex)
}
return index(startIndex, offsetBy: offset, limitedBy: endIndex)
}
subscript(offset offset: Int) -> Element? {
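// index(atOffset:) can return endIndex (offset == count), which is not subscriptable
guard let idx = index(atOffset: offset), idx != endIndex else { return nil }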
return self[idx]
}
subscript(offset range: Range<Int>) -> SubSequence {
let lower = index(atOffset: range.lowerBound) ?? endIndex
let upper = index(atOffset: range.upperBound) ?? endIndex
guard lower <= upper else { return self[endIndex..<endIndex] }
return self[lower..<upper]
}
...
}
There's the design decision regarding whether we permit negative offsets for Bidi collections, why some things are failable, corner cases for the slicing subscript, etc.
This would give us offset-based subscripting for String and its views. This would also provide a solution to those who want a failable subscript on Array, the ability to use Int indices more conveniently on Slice<Array> based on the slice's start value, and a way to safely use Data (which may be self-sliced).
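Here is a compilable version of that sketch, under some assumptions of my own: it is pinned to BidirectionalCollection so negative offsets count back from the end, and the element subscript returns nil when the offset lands exactly on endIndex. The names `index(atOffset:)` and `subscript(offset:)` are hypothetical, taken from the sketch rather than from any shipping API.

```swift
extension BidirectionalCollection {
    // Hypothetical: non-negative offsets count from startIndex, negative
    // offsets count back from endIndex; nil when the offset is out of bounds.
    func index(atOffset offset: Int) -> Index? {
        if offset < 0 {
            return index(endIndex, offsetBy: offset, limitedBy: startIndex)
        }
        return index(startIndex, offsetBy: offset, limitedBy: endIndex)
    }

    // Failable element access; endIndex itself is not subscriptable.
    subscript(offset offset: Int) -> Element? {
        guard let idx = index(atOffset: offset), idx != endIndex else { return nil }
        return self[idx]
    }

    // Slicing clamps out-of-bounds bounds to endIndex.
    subscript(offset range: Range<Int>) -> SubSequence {
        let lower = index(atOffset: range.lowerBound) ?? endIndex
        let upper = index(atOffset: range.upperBound) ?? endIndex
        guard lower <= upper else { return self[endIndex..<endIndex] }
        return self[lower..<upper]
    }
}

let numbers = [10, 20, 30, 40]
let slice = numbers[1...]              // ArraySlice whose indices are 1..<4

let firstOfSlice = slice[offset: 0]    // Optional(20), relative to the slice's start
let lastOfSlice = slice[offset: -1]    // Optional(40)
let pastEnd = numbers[offset: 4]       // nil rather than a trap
let clamped = Array(numbers[offset: 2..<10])  // [30, 40]

let str = "a😀b"
let secondCharacter = str[offset: 1]   // Optional("😀")
let lastUnit = str.utf16[offset: 3]    // Optional(98), the code unit for "b"
```

Whether the range subscript should clamp, trap, or be failable is one of the corner cases mentioned above; clamping is just what the sketch does.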
This is substantially more API than we can push through in time for Swift 5, but we should begin the process soon after.