I ran into this question on StackOverflow, which got me curious:
I'm aware of the caveats associated with breaking up extended grapheme clusters along incorrect boundaries, and creating broken substrings with invalid Unicode character sequences.
Is there a safe way to take an index from one string (e.g. stringA below) and transform it so it points to the same range of characters in another string (stringB below), without jumping through the hoop of manually deriving the distance like so:
    let stringA = "abc👨‍👩‍👧‍👦xyz"
    // Long enough so we can show the issue below, and not just crash
    let stringB = "1234567890ABCDEFGHJI1234567890"

    let fourthCharIndexA = stringA.index(stringA.startIndex, offsetBy: +3)
    print(stringA[fourthCharIndexA]) // 👨‍👩‍👧‍👦
    print(stringB[fourthCharIndexA]) // Invalid: 4567890ABCDEFGHJI12345678

    // EDITED: below used to be `stringB.distance...`, which was a typo.
    let distance = stringA.distance(from: stringA.startIndex, to: fourthCharIndexA) // the distance is string-agnostic, right?
    let fourthCharIndexB = stringB.index(stringB.startIndex, offsetBy: +distance)
    print(stringB[fourthCharIndexB]) // Correct: 4
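For reference, the distance-based workaround in the snippet above could be wrapped in a small helper. This is only a minimal sketch: translate(_:from:to:) is a hypothetical name and not a standard library API, and the shorter stringB is chosen just for illustration. It uses index(_:offsetBy:limitedBy:) so that a target string that is too short yields nil rather than a crash.

```swift
// Hypothetical helper (not part of the standard library): re-expresses an
// index of `source` as the index at the same Character offset in `target`.
func translate(_ index: String.Index, from source: String, to target: String) -> String.Index? {
    // Number of Characters between the start of `source` and `index`.
    let offset = source.distance(from: source.startIndex, to: index)
    // Walk the same number of Characters into `target`. Returns nil instead
    // of trapping when `target` has fewer than `offset` Characters.
    return target.index(target.startIndex, offsetBy: offset, limitedBy: target.endIndex)
}

// "abc", then the family emoji written as explicit scalars joined by
// zero-width joiners (a single Character), then "xyz".
let stringA = "abc\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}xyz"
let stringB = "1234567890"

let indexA = stringA.index(stringA.startIndex, offsetBy: 3)

// The result may equal endIndex, which is a valid position but must not be
// subscripted, so that case is checked before reading the Character.
if let indexB = translate(indexA, from: stringA, to: stringB), indexB != stringB.endIndex {
    print(stringB[indexB]) // prints "4"
}
```

The helper still costs a linear walk through both strings, since it only packages the distance and offset steps; it does not make the original index itself valid for the other string.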
Furthermore, is there an API to do the same transformation, but to