Continuing the discussion from [Pitch] 128 bit Integer Types:
I'm interested in thinking this through more.
Benefits
Separate storage for the optional bit vs the value is potentially a pretty significant win. For a start it can greatly compact the value storage (e.g. eight Int64
s per 64 byte cache line instead of only four), which saves memory and makes accessing the array faster (by virtue of less memory traffic, better cache utilisation, lower average latency, etc).
It can be an even bigger win for many operations, like 'compactMapping' nil values out - scanning the optionality bitmap is much faster than the values bitmap.
Similarly many existing collection methods can benefit from this, although it might require new variants of them in many cases, since e.g. the existing filter(by:)
expects to visit every element, nil or not. I suspect many uses would only care about the non-nil values so could benefit from new methods like (mutating) nilling(if:)
and a nonNil
"modifier" like lazy
which vends a view with the Optional
wrapper removed.
Spin-off collection types
There's also the possibility of further compacting the storage to not have empty spots for nil values. This does make look-ups more expensive since you can't just do a naive base + index * size, so it would only make sense for certain types of use / collections (i.e. not RandomAccessCollection
).
Challenges & limitations
I don't know whether it's possible modify the existing collections in this manner, but even if not it might make sense to have new variants, e.g. CompactOptionalsArray
in swift-collections or whatever.
To elaborate on that, one of the concerns raised is that Array
provides direct access to its storage, which is then ostensibly requiring cohabitated layout of the Optional
with its wrapped type. Hypothetically an intermediary view could be introduced which provides the illusion of the existing "inline" layout, but it seems impossible (or at least impractical inefficient) to do this in a way that provides truly direct memory access (UnsafeBufferPointer
et al).
Even if so, precluding the existing Array
from adopting this, that seems like functionality that's not often used (?) and so wouldn't be a significant dampener against new collection types.
Existing implementations?
I haven't encountered any implementations of collections like this, although I haven't specifically & thoroughly looked for them. Anyone happen to know of any?