SourceKit: get 2D character indices (instead of byte indices)

The problem is getting a mapping from byte offsets to character offsets. Ideally, the highlighter shouldn’t care about the contents of the text buffer; it should just pass the buffer opaquely to SourceKit and get character-indexed tokens in return. Otherwise we’d have to use ICU to find the character boundaries within the highlighter, and then search them to map the byte offsets. I’m already doing basic text buffer preprocessing to catch newlines so the 1D indices can be converted to 2D, but the lag is about at the upper limit of what you would notice while typing (it’s currently only really usable for Swift files under 1000 LOC, though that’s more the fault of Atom and JavaScript). Redoing grapheme breaking (which I assume SourceKit is already doing internally) would probably increase the lag to unacceptable levels. Could SourceKit expose the character indices directly?
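To make the cost concrete, here is a minimal sketch of the kind of mapping the highlighter would have to do itself. It is not the actual plugin code, and it calls neither SourceKit nor Atom. The names `buildLineTable` and `byteOffsetToPoint` are hypothetical. It assumes the byte offsets are UTF-8 and that a "character" column means a UTF-16 code unit, as JavaScript strings index them; grapheme clusters would need a real segmenter such as ICU on top of this.

```typescript
// Hypothetical helpers, for illustration only.
interface Point { row: number; column: number }

interface LineTable {
  text: string;
  byteStarts: number[]; // UTF-8 byte offset at which each line begins
  unitStarts: number[]; // UTF-16 code unit index at which each line begins
}

// Size of the code point starting at UTF-16 index i,
// in UTF-8 bytes and in UTF-16 code units.
function sizeAt(text: string, i: number): { bytes: number; units: number } {
  const c = text.charCodeAt(i);
  if (c < 0x80) return { bytes: 1, units: 1 };
  if (c < 0x800) return { bytes: 2, units: 1 };
  // A high surrogate followed by a low surrogate is one 4-byte code point.
  if (c >= 0xd800 && c <= 0xdbff && i + 1 < text.length) {
    const d = text.charCodeAt(i + 1);
    if (d >= 0xdc00 && d <= 0xdfff) return { bytes: 4, units: 2 };
  }
  return { bytes: 3, units: 1 };
}

// One pass over the buffer, recording where each line starts in both encodings.
function buildLineTable(text: string): LineTable {
  const byteStarts: number[] = [0];
  const unitStarts: number[] = [0];
  let bytes = 0;
  let i = 0;
  while (i < text.length) {
    const isNewline = text.charCodeAt(i) === 0x0a;
    const size = sizeAt(text, i);
    bytes += size.bytes;
    i += size.units;
    if (isNewline) {
      byteStarts.push(bytes);
      unitStarts.push(i);
    }
  }
  return { text, byteStarts, unitStarts };
}

// Convert a 1D UTF-8 byte offset into a 2D (row, column) position.
function byteOffsetToPoint(table: LineTable, byteOffset: number): Point {
  // Binary search for the last line that starts at or before byteOffset.
  let lo = 0;
  let hi = table.byteStarts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (table.byteStarts[mid] <= byteOffset) lo = mid;
    else hi = mid - 1;
  }
  // Walk forward within that line until the byte offset is reached.
  let bytes = table.byteStarts[lo];
  let i = table.unitStarts[lo];
  while (bytes < byteOffset && i < table.text.length) {
    const size = sizeAt(table.text, i);
    bytes += size.bytes;
    i += size.units;
  }
  return { row: lo, column: i - table.unitStarts[lo] };
}
```

Even this simplified version has to scan the whole buffer once and then walk part of a line for every token, which is the work I’d rather not repeat on every keystroke.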

Here’s some data on the latency: Using Github's Atom as a Swift IDE for Linux and Mac - #49 by taylorswift

JavaScript is mostly to blame, but we really don’t have many milliseconds to spare as a result.