[Pitch #2] Safe Access to Contiguous Storage

We need to take one step at a time. These are not baby steps -- they are huge leaps, and each of them needs focused attention.

New container patterns are certainly coming, and they will heavily build on non-escapable types. Span will play a particularly prominent role in them. We cannot pitch them without having introduced these concepts first.

I fully expect that Span will ship alongside the first wave of new container protocols, as a unit.

It is crucial to note that "Span-providing" and "non-contiguous" aren't mutually exclusive concepts. We intend Span to be the basic unit of iteration of all borrowing sequences.

At long last, Span gives us an opportunity for data structures to directly expose their contents in their actual native form: as piecewise contiguous series of storage buffers.

A very significant limitation of the classic Sequence is that it insists on iterating over elements one by one. For interfaces that take generic sequences, this can induce slowdowns of 100x or more. That gives API designers a horrible excuse to avoid generics, or to spend effort manually specializing common cases (like you describe). Sequence itself attempts to mitigate this with a ragtag assortment of one-off hooks like _copyToContiguousArray, _copyContents or _customContainsEquatableElement. These have remained stuck in the limbo of underscores because they are hyperfocused on an overly narrow set of cases (and to be honest, they don't do a particularly great job at those, either). The performance issues they patch were important enough that removing them was never an option, but they are not general enough to be worth making official.
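To make the "manually specialize common cases" workaround concrete, here is a minimal sketch of what such a fast path tends to look like today. It uses the standard library's public withContiguousStorageIfAvailable hook, which the post does not name; the function name total is made up for illustration.

```swift
// Hypothetical helper: sums any sequence of Int, with a hand-written fast path.
func total<S: Sequence>(_ values: S) -> Int where S.Element == Int {
  // Fast path: taken only when the sequence can vend one contiguous buffer.
  // The hook returns nil for sequences without contiguous storage.
  let fast = values.withContiguousStorageIfAvailable { buffer -> Int in
    var sum = 0
    for value in buffer { sum += value }
    return sum
  }
  if let fast { return fast }

  // Slow path: element-by-element iteration through the generic iterator.
  var sum = 0
  for value in values { sum += value }
  return sum
}

print(total([1, 2, 3]))                           // Array takes the fast path
print(total(stride(from: 1, through: 3, by: 1)))  // a stride takes the slow path
```

Every generic algorithm that wants the speedup has to repeat this two-path dance, and it only helps types that are contiguous as a whole.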

Span gives us a safe way to iterate over arbitrary contiguous chunks of storage instead, promising to generally resolve this critical performance bottleneck, especially in unspecialized generic code. For a taste, imagine if IteratorProtocol were defined like this:

protocol IteratorProtocol<Element> {
  associatedtype Element
  mutating func nextChunk(maximumCount: Int) -> depends(self) Span<Element>
}

This introduces a two-tiered iteration process: we iterate over native storage chunks, separately iterating over the elements of each. Crucially, the lower layer is operating over Span, a tiny, transparent type that can always be specialized -- allowing iteration on that level to proceed at full speed, entirely avoiding dynamic dispatch.
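The two tiers can be sketched in Swift that compiles today, under some loud assumptions: ArraySlice stands in for Span (so the lifetime dependency from the protocol above is not modeled), an empty chunk signals the end of iteration, and the names ChunkIterator, SegmentedList, makeChunkIterator and sum are all invented for this example.

```swift
// Hypothetical stand-in for the chunked iterator imagined above.
// ArraySlice substitutes for Span; this is an assumption, not the pitched API.
protocol ChunkIterator<Element> {
  associatedtype Element
  // Returns the next contiguous run of at most `maximumCount` elements,
  // or an empty slice once iteration is finished.
  mutating func nextChunk(maximumCount: Int) -> ArraySlice<Element>
}

// A toy piecewise contiguous container: several separately stored segments.
struct SegmentedList<Element> {
  var segments: [[Element]]

  struct Iterator: ChunkIterator {
    let segments: [[Element]]
    var segment = 0  // index of the current segment
    var offset = 0   // position within the current segment

    mutating func nextChunk(maximumCount: Int) -> ArraySlice<Element> {
      precondition(maximumCount > 0)
      // Skip segments that are exhausted (or empty to begin with).
      while segment < segments.count, offset == segments[segment].count {
        segment += 1
        offset = 0
      }
      guard segment < segments.count else { return [] }
      let end = min(offset + maximumCount, segments[segment].count)
      let chunk = segments[segment][offset ..< end]
      offset = end
      return chunk
    }
  }

  func makeChunkIterator() -> Iterator { Iterator(segments: segments) }
}

// Generic consumer: the outer loop walks chunks through the protocol,
// the inner loop runs over a concrete, contiguous slice.
func sum(_ source: some ChunkIterator<Int>) -> Int {
  var iterator = source
  var total = 0
  while true {
    let chunk = iterator.nextChunk(maximumCount: 64)
    if chunk.isEmpty { break }
    for value in chunk { total += value }
  }
  return total
}

let list = SegmentedList(segments: [[1, 2, 3], [], [4, 5]])
print(sum(list.makeChunkIterator()))  // prints 15
```

Only the outer loop goes through the protocol, once per chunk rather than once per element. The inner loop works on a concrete type, which is the role Span would play in the real design.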
