/// - Complexity: O(1)
These are "feel-good" annotations, not promising anything in particular, other than a very vague "it'll be fast" -- as in, "feel free to invoke it in a loop if you need to do that".
But an O(1) operation can still take a month to execute. (Or a century. A million centuries!) Just because an operation has O(1) complexity, we do not get a free pass to repeatedly invoke it on the same inputs.
You can tell these promises aren't very tangible by the way they don't specify what units they're using to measure "complexity". They don't tell you this, because doing that would admit that the units vary from type to type, and from function to function.
Complexity is measured by the count of primitive operations that a function performs; what those primitive operations are depends on the abstraction level on which the construct happens to sit.
A simple example would be Int.+, i.e., arithmetic addition of two Int values. We can (probably!) promise that it takes O(1) time to add together two 64-bit integers -- the addition can compile down to a single instruction, and executing that instruction will usually take some finite maximum number of CPU cycles. So for operations on the lowest layers of abstraction, complexity can in fact be expressed in terms of actual elapsed time.
(Note that this is heavily assuming that the integer values don't need to get loaded from / stored to a memory address, and that the instruction is located in physical memory. Estimating time can get really complicated if that's not the case, as we'll see next.)
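To make the constant-cost claim concrete, here is a minimal Swift sketch (the specific values are arbitrary, chosen only for illustration):

```swift
// Integer addition is O(1) in actual elapsed time: the work does not
// depend on the values being added.
let small = 3 + 4        // compiles down to (roughly) a single add instruction
let a: Int = 1 << 62
let wrapped = a &+ a     // + would trap on overflow; &+ wraps around instead
print(small, wrapped)    // wrapped is Int.min: 2^62 + 2^62 wraps past Int.max
```

Note that Swift's `+` also performs an overflow check and traps on overflow; `&+` is the wrapping variant. Either way, the cost is a fixed handful of instructions.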
A more complicated example is Array's subscript getter. That is a far, far more complex operation than integer addition, even though Array is (conceptually) one of the simplest possible data structures. Its indexing subscript getter measures its complexity by the number of times it needs to access or copy array elements.
This does give us a good indication of how complex the operation is -- the vague "O(1)" tells us not to be afraid to use it as often as necessary, and that is good advice.
But "O(1)" has practically nothing to do with the upper limit of how much time it can take to execute array[i]:

- The selected unit of complexity very nicely avoids having to deal with the case when the Array happens to be backed by a bridged NSArray instance. NSArray subclasses (and proxy classes pretending to be one) can do whatever weird thing they want to do, including getting blocked for an arbitrary length of time on storage access or networking events. (Array still isn't lying -- it's performing a constant number of accesses, even in this case. An "access" can simply include dynamically calling out to some random Objective-C method.)
- Copying an item that is (or contains) a strong reference includes incrementing its reference count; that operation involves an atomic compare-exchange loop that can, in theory, spin for an arbitrary number of iterations. (In practice it usually succeeds in a few tries. But it might take a million! This does make it tricky to promise any particular strict upper bound on elapsed time -- which is why people sometimes get so worked up about whether certain atomic operations are "wait-free" or not. swift_retain isn't wait-free.)
- Worse, memory is a ridiculously complicated construct in our machines. Even ignoring virtual memory, accessing a byte at a random offset inside a contiguous memory region of size n is better approximated as taking O(log(n)) time, not O(1). We just don't usually need to remember this embarrassing fact because caching works so well. (Until we encounter an algorithm where it doesn't, and suddenly everything slows to a crawl.)
- Further, the process that runs our code will sometimes randomly suspend execution in the middle of an array access; usually there is no strict upper limit on how long it will take to resume execution. (This point is easy to handwave away by only measuring time while our process is running, but if we're actually limited by some real-life deadline (e.g. because we need to fill the next buffer's worth of audio data in time before the last one finishes playing), then it can easily become a real problem.)
So while let value = array[i] will perform O(1) element accesses/copies, it can still take an arbitrary amount of time to do that. There is no constant upper bound on its time complexity.
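The bridged-NSArray case above can be sketched as follows. (SlowArray is a made-up class for illustration; this relies on Foundation's Objective-C bridging, so it is Apple-platform-flavored, and the lazy-bridging behavior is an assumption about the current implementation.)

```swift
import Foundation

// Hypothetical NSArray subclass with arbitrarily slow element access.
// Bridged to Array, subscripting still performs O(1) "accesses" --
// but each access can take any amount of wall-clock time.
final class SlowArray: NSArray {
    override var count: Int { 3 }
    override func object(at index: Int) -> Any {
        Thread.sleep(forTimeInterval: 1)  // pretend we're waiting on the network
        return index
    }
}

let bridged = SlowArray() as [Any]  // bridging keeps SlowArray as the storage
let value = bridged[1]              // one element "access" -- and one full second
```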
To sum it up: "Complexity: O(1)" does not mean what we generally take it to mean, even in the simple case of Array.subscript. There is sleight of hand trickery here, but it is there for a good reason -- explaining all of this directly in the API reference docs would not be at all helpful. (Even if we omitted the last two points as nitpicky nonsense, as they arguably are.)
String's indexing subscript getter also has "O(1) complexity". However, String's (default) character view sits on a much (and I mean much) higher abstraction level than Array, so its primitive operations are correspondingly more complicated, and they have an even weaker relation to elapsed time.
The primitive unit of operation in this case is finding the next grapheme cluster boundary after a previous one. (I.e., it involves visiting a single Character's worth of Unicode scalars.) In strings containing human text, the average number of scalars this operation needs to visit is typically some low value close to (but definitely higher than) 1. But there is no upper limit on how many scalars would need to be visited -- there is no upper bound to a Character's length.
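The unbounded size of a single Character is easy to demonstrate:

```swift
// One Character (grapheme cluster) can contain any number of Unicode scalars.
let simple: Character = "a"              // 1 scalar
let accented: Character = "e\u{0301}"    // "é" as e + combining acute: 2 scalars
let family: Character = "👨‍👩‍👧‍👦"           // 4 person emoji + 3 zero-width joiners: 7 scalars
print(simple.unicodeScalars.count,
      accented.unicodeScalars.count,
      family.unicodeScalars.count)       // 1 2 7
```

Nothing stops us from joining even more scalars into one cluster, so "visit one Character" has no fixed maximum cost.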
The "O(1)" promise simply gives us a free pass to iterate over the characters in a string without overly worrying about how much time it will take to do so. The cost of one full iteration pass (as measured in memory accesses) will be proportional to the length of the string in bytes; and this should align with what a reasonably well-informed person would expect, given String's API.
(Note that String's other views have very different complexity units.)
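For instance, the same text measures differently in each view's units (using the precomposed form of "é" so the scalar count is unambiguous):

```swift
let s = "caf\u{E9}"            // "café" with a precomposed é
print(s.count)                 // 4 Characters
print(s.unicodeScalars.count)  // 4 Unicode scalars
print(s.utf16.count)           // 4 UTF-16 code units
print(s.utf8.count)            // 5 bytes: é encodes as 2 bytes in UTF-8
```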
The last interesting example to look at is Dictionary's key-based subscript getter. Its complexity is surprisingly difficult to understand and explain. The primitive operations in this case are calculating the hash of a key, and comparing two keys for equality. (Accessing keys and values matters too, of course, but those accesses are significantly cheaper than these two operations. The cost of hashing/comparing keys also includes the cost of accessing them.)
Note that the complexity of these latter two operations is inherently linear in the "size" of the key: this is actually very similar to how String measures things in terms of variable-sized grapheme clusters. The units once again do not (necessarily) have a constant size!
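A sketch with a hypothetical composite key type (Route is made up for illustration):

```swift
// The synthesized hash(into:) and == for a composite key both visit every
// component, so one "primitive operation" is linear in the size of the key.
struct Route: Hashable {
    var components: [String]
}

var table: [Route: Int] = [:]
table[Route(components: ["users", "42", "posts"])] = 1
// This lookup hashes all three components, then confirms the match with a full ==.
let hit = table[Route(components: ["users", "42", "posts"])]
```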
Dictionary's lookup complexity is particularly difficult to explain, because it is a probabilistic data structure. Its efficiency relies on hashing, which (if done properly [very big if!]) allows it to have O(1) expected complexity. The word "expected" here has the specific meaning used by probability theory, as in "expected value" -- and as usual, it is tied to specific assumptions about the distribution of the input (which get lifted by employing randomly seeded universal hashing).
If we applied the methods we used to analyze Array/String indexing to Dictionary lookups, we'd get that the complexity of a Dictionary lookup is O(count). There can be no better upper bound, because no matter what hash function we use, we're in effect sorting a (potentially) infinite number of different keys into a finite number of buckets. There will always be an arbitrarily large number of different keys that share the same bucket, and if we construct a dictionary from some of those clashing keys, then it will always exhibit this worst-case O(count) lookup performance.
However, as long as hashing is implemented properly, these cases will be guaranteed to be incredibly rare: the expected value of key comparisons during a Dictionary lookup is O(1), no matter what set of keys we use. (If the Key type broke its hashes, then this analysis goes out the window, and we revert to O(n) complexity -- we'll be able to easily produce keys that actually exhibit worst-case performance.)
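A deliberately broken -- but perfectly legal -- Hashable conformance makes the degradation easy to reproduce (BadKey is a made-up type):

```swift
// Every BadKey hashes identically, so all keys land in the same bucket.
struct BadKey: Hashable {
    var id: Int
    func hash(into hasher: inout Hasher) {}  // hashes nothing; == is still synthesized
}

var d: [BadKey: Int] = [:]
for i in 0..<1000 { d[BadKey(id: i)] = i }
// Every lookup now compares against (on average) half the stored keys:
let found = d[BadKey(id: 500)]
```

The dictionary still works correctly; it just silently exhibits its O(count) worst case on every operation.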
Even if everything works as well as possible -- that is, even if we find the key we want on the first try -- it still takes a Key.== invocation to verify this, and that comparison will generally cost time proportional to the size of the key. If the Dictionary has variable-sized keys (e.g. String keys), then looking up one key can still cost an arbitrarily large amount of time, even if we measure time in storage accesses, like Array does. But this cost is effectively irrelevant in practice: of course we need to compare at least one pair of keys to find the one we are looking for! That's part and parcel of looking anything up in a general-purpose container.
I think it would not be unreasonable to have Dictionary.subscript document its complexity in the same infuriatingly over-simplified way as Array does:
- Complexity: Expected to be O(1), if `Key` correctly conforms to
`Hashable`. Otherwise O(`count`).
For String.subscript, perhaps it'd be useful to include a couple of words about the granularity of our units. What couple of words, though? E.g., would this be enough?
- Complexity: O(1) `Character` extractions.
(Is there a better alternative? Copying and pasting something like the multiple paragraphs of really badly explained gibberish above would only serve to mislead and confuse people even more.)