In my mental model, there is no need to use any conversion methods -- String's various views all share the same index space, so I can safely pass a UTF-32 index to a UTF-8 view of the same string, and I'm golden:
let café = "🦜cafe\u{301}"
Array(café) // ⟹ ["🦜", "c", "a", "f", "é"]
Array(café.unicodeScalars) // ⟹ [129436, 99, 97, 102, 101, 769]
Array(café.utf16) // ⟹ [55358, 56732, 99, 97, 102, 101, 769]
Array(café.utf8) // ⟹ [240, 159, 166, 156, 99, 97, 102, 101, 204, 129]
café.indices.map { café.unicodeScalars[$0] } // ⟹ [129436, 99, 97, 102, 101]
café.indices.map { café.utf16[$0] } // ⟹ [55358, 99, 97, 102, 101]
café.indices.map { café.utf8[$0] } // ⟹ [240, 99, 97, 102, 101]
café.unicodeScalars.indices.map { café.utf16[$0] } // ⟹ [55358, 99, 97, 102, 101, 769]
café.unicodeScalars.indices.map { café.utf8[$0] } // ⟹ [240, 99, 97, 102, 101, 204]
café.unicodeScalars.indices.map { café[$0] } // ⟹ ["🦜", "c", "a", "f", "é", "\u{301}"] ☆☆☆
café.utf16.indices.map { café[$0] } // ⟹ ["🦜", "🦜", "c", "a", "f", "é", "\u{301}"] ☆☆☆
café.utf16.indices.map { café.unicodeScalars[$0] } // ⟹ [129436, 129436, 99, 97, 102, 101, 769] ★★★
café.utf16.indices.map { café.utf8[$0] } // ⟹ [240, 240, 99, 97, 102, 101, 204] ★★★
café.utf8.indices.map { café[$0] } // ⟹ ["🦜", "🦜", "🦜", "🦜", "c", "a", "f", "é", "\u{301}", "\u{301}"] ☆☆☆
café.utf8.indices.map { café.unicodeScalars[$0] } // ⟹ [129436, 129436, 129436, 129436, 99, 97, 102, 101, 769, 769] ★★★
café.utf8.indices.map { café.utf16[$0] } // ⟹ [55358, 55358, 55358, 55358, 99, 97, 102, 101, 769, 769] ★★★
This behavior is well-defined for all combinations. I agree that rounding indices that fall in the middle of an element down to the element's start (exhibited in the lines marked ★★★) is unusual. (Arguably, this isn't quite consistent, either: String itself vends partial elements, as seen in the lines marked ☆☆☆, while the other views don't.)
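To spell out one of the ★★★ rows in isolation (a small sketch, reusing the same `café` value as above): take a UTF-8 index that lands in the middle of the parrot scalar and read it through the scalar view -- the position simply rounds down to the start of that scalar.
let midScalar = café.utf8.index(café.utf8.startIndex, offsetBy: 2) // third byte of "🦜"
café.unicodeScalars[midScalar] // ⟹ 129436 (U+1F99C) -- the whole scalar, not a fragment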
The samePosition(in:) and init?(_:within:) APIs are there for when this behavior is undesirable. However, there should never be a need to call them twice in a row, as in your option (2) -- this should do just fine:
func use(index: String.Index) {
    // We need to assume that `index` is a valid index that came
    // from some view of `café`. There is no way to verify this,
    // other than trying to use it and seeing whether it traps or
    // produces nonsensical values.

    // Now, if for some reason we *need* `index` to fall on a scalar
    // boundary, we can call `samePosition(in:)` or `init?(_:within:)`
    // to ensure this:
    precondition(index.samePosition(in: café.unicodeScalars) != nil,
                 "Invalid index: not on a scalar boundary")

    // `index` is okay
    ...
}
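For completeness, the same check can be spelled with the failable initializer instead; a rough sketch, assuming the scalar-view overload of String.Index.init?(_:within:):
precondition(String.Index(index, within: café.unicodeScalars) != nil,
             "Invalid index: not on a scalar boundary")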
However, this example is probably too abstract. It's strange for a function to take a standalone string index without also taking an explicit string parameter. Indices are meaningless without their corresponding collection instance, and the ambiguity goes away the instant we provide context using a particular string view.
For example, here is a way to retrieve the first word from a string (using a highly questionable definition of "word"). It has issues, but its use of indices seems fine to me; I feel that adding an extra conversion step (like we had to in Swift 3) would not make the code any easier to understand.
extension String {
    var firstWord: Substring? {
        // Find the first ASCII space (byte value 32) in the UTF-8 view;
        // the resulting index can be used directly to slice the string.
        guard let i = self.utf8.firstIndex(of: 32) else { return nil }
        return self[..<i]
    }
}
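For instance (hypothetical inputs, just to show the shape of the result):
"Hello Swift forums".firstWord // ⟹ "Hello"
"NoSpacesHere".firstWord       // ⟹ nil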