I am receiving UTF-16-encoded strings from a server, along with byte offsets for accessing substrings. These byte offsets all refer to the received UTF-16 text and can be assumed not to fall in the middle of a code point.
// From server
let received = Data(...)
let startByteOffset = 0
let endByteOffset = 4
So far I have mostly been working with the strings as Data, extracting portions by subscripting the data and converting those into strings:
let substring = String(data: received[startByteOffset ..< endByteOffset], encoding: .utf16)
But now I would like to display the received string and highlight the various substrings represented by the byte offsets.
Is there a safe way to convert these byte offsets to indexes in a Swift string?
let stringValue = String(data: received, encoding: .utf16)
let rangeToHighlight = // How do I safely build this from the byte offsets?
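String.Index(utf16Offset:in:) should do the trick, assuming that all offsets are valid for the string and do not refer to a position in the middle of a UTF-16 surrogate pair. Each UTF-16 code unit occupies two bytes, so the byte offsets are divided by 2 to get code unit offsets. Note that String(data:encoding:) returns an optional, so stringValue must be unwrapped first: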
let from = String.Index(utf16Offset: startByteOffset / 2, in: stringValue)
let to = String.Index(utf16Offset: endByteOffset / 2, in: stringValue)
let rangeToHighlight = from..<to
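If the offsets cannot be fully trusted, a more defensive variant is possible. The following is only a sketch: the helper name utf16Range(startByteOffset:endByteOffset:in:) is made up for illustration, and it returns nil for odd, negative, out-of-order, or out-of-bounds offsets instead of relying on the preconditions above. It uses the standard index(_:offsetBy:limitedBy:) method on the string's utf16 view.

```swift
/// Hypothetical helper (not part of any library): converts UTF-16 *byte* offsets
/// into a range of String indices, or returns nil if the offsets are unusable.
func utf16Range(startByteOffset: Int, endByteOffset: Int, in string: String) -> Range<String.Index>? {
    // Each UTF-16 code unit is two bytes, so valid byte offsets must be even.
    guard startByteOffset >= 0, startByteOffset <= endByteOffset,
          startByteOffset % 2 == 0, endByteOffset % 2 == 0 else { return nil }
    let utf16 = string.utf16
    // index(_:offsetBy:limitedBy:) returns nil rather than trapping when the
    // offset would move past the end of the view.
    guard let from = utf16.index(utf16.startIndex, offsetBy: startByteOffset / 2,
                                 limitedBy: utf16.endIndex),
          let to = utf16.index(from, offsetBy: (endByteOffset - startByteOffset) / 2,
                               limitedBy: utf16.endIndex)
    else { return nil }
    return from..<to
}

// Usage: "Hi 😀!" is 6 UTF-16 code units (12 bytes); the emoji occupies bytes 6..<10.
let sample = "Hi 😀!"
if let range = utf16Range(startByteOffset: 6, endByteOffset: 10, in: sample) {
    print(sample[range])  // the two code units of the emoji
}
```

This does not check whether an offset lands inside a surrogate pair; it only guards against offsets that are misaligned or outside the string.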
I do not know what happens if the range starts or ends in the middle of a surrogate pair, so that it encloses a single low or high surrogate, or how such a situation could be detected. As I understand it from [Accepted] SE-0241: Explicit Encoded Offsets for String Indices, this is considered a programmer error.
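One possible way to detect such an offset, offered as a sketch rather than a definitive answer, is to ask whether the UTF-16 index has an exact counterpart in the string's unicodeScalars view. String.Index.samePosition(in:) returns nil when it does not. The function name isScalarAligned(utf16Offset:in:) is hypothetical.

```swift
/// Hypothetical check: true if the given UTF-16 code unit offset lies on a
/// Unicode scalar boundary of `string` (i.e. not between the two halves of a
/// surrogate pair). Out-of-bounds offsets yield false.
func isScalarAligned(utf16Offset: Int, in string: String) -> Bool {
    let utf16 = string.utf16
    guard utf16Offset >= 0,
          let index = utf16.index(utf16.startIndex, offsetBy: utf16Offset,
                                  limitedBy: utf16.endIndex)
    else { return false }
    // samePosition(in:) is nil when the index has no exact equivalent in the
    // Unicode scalar view, which is the case in the middle of a surrogate pair.
    return index.samePosition(in: string.unicodeScalars) != nil
}

// "a😀b" has code units: a (0), high surrogate (1), low surrogate (2), b (3).
let text = "a😀b"
print(isScalarAligned(utf16Offset: 1, in: text))  // start of the emoji
print(isScalarAligned(utf16Offset: 2, in: text))  // between the surrogates
```

Running this check on both ends of a range before building it would let the caller reject or adjust offsets that split a surrogate pair.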