String O(1) access to Character

Unlike Slice (which wraps an underlying Collection and provides a window into it), creating an ArraySlice creates a new buffer. So this use of ArraySlice is effectively the same as creating an Array of the string's characters (i.e. using ArraySlice like this is pretty much the same as writing Array(source)[...]). This is how it is able to give a hard guarantee of O(1) access, unlike Slice which "passes on" the complexity of the type that it wraps.

This means that the code does provide constant-time access to individual characters, but it does it by allocating a new buffer, not sharing common storage. This buffer will have Character-size elements. A Character is 16 bytes, which means almost all characters will use far more storage than they need. Conversely, the emoji that don't fit inline within that 16 byte storage will allocate additional memory (unlike a string, where all the characters are stored contiguously in the string's buffer). So this ends up being a fairly inefficient way of doing this.

Another option if you do want to make this space tradeoff to gain random access would be to create an array of the indices i.e. Array(source.indices). This is still going to be using more memory than you need, but 8 rather than 16 bytes, and won't ever allocate additional storage. You can then access the individual elements at random via source[idxs[4]].

But the next question is, why do you need random access to graphemes? There are some algorithms you might want it for, and so are willing to pay the cost of allocating memory to ensure it, but in many cases, there are other ways to achieve what you want without random access, by writing code that moves progressively through the string from start to end, using index(_, offsetBy:) and similar indexing API. These are fiddlier to use, but does mean that what you are trading off is programming model simplicity vs space, rather than time vs space. Hopefully there are further API that can be explored that make common tasks simpler, without compromising on efficiency.

8 Likes