[Pitch #2] Safe Access to Contiguous Storage

fclout · June 28, 2024, 5:27pm

I wanted to add that if we intend to make load eventually safe, it probably needs to be implemented as loadUnaligned because it would be undefined behavior to use it on an unaligned pointer (or there needs to be another axis of spans to account for unaligned pointers). The eventual unsafeLoad can assume alignment and keep its unaligned counterpart.

This raises another question: how do we handle view on spans with the wrong alignment?

Alejandro · June 28, 2024, 5:31pm

Note that we are not proposing Inout as part of this pitch. That type is on an experimental branch to experiment with potential future directions.

glessard · June 28, 2024, 5:42pm

You are correct: the unsafe pointer initializers are a convenience when working with unsafe pointers, as well as the way to vend a Span from one's own collection implementation.

Correct. Any actual parsing utilities would probably be part of a targeted proposal.

Span requires an aligned address at initialization time, but there is an exception for Span where Element: BitwiseCopyable, which uses loadUnaligned() in its element getter, so doesn't require alignment.

benlings · June 28, 2024, 6:40pm

Future Directions says that the ContiguousBytes protocol is a future direction. Should this still be part of the proposal?

extension Slice: ContiguousStorage where Base: ContiguousStorage {
  var storage: Span<Base.Element> { get }
}

Naming: ‘Span’ seems like a good choice. As well as fitting with C++, it seems like it fills a similar role to Span<T> in .Net.

I agree with others’ comments about the unsafety of load() and loadUnaligned(). The documentation for the former says:

/// The memory at this pointer plus `offset` must be properly aligned for
/// accessing `T` and initialized to `T` or another type that is layout
/// compatible with `T`.

but doesn’t say what will happen if it isn’t. I think this should either define what happens (e.g. fatal error / return nil optional) or include unsafe in the name.

glessard · June 28, 2024, 6:51pm

benlings:

Future Directions says that the ContiguousBytes protocol is a future directions. Should this still be part of the proposal?
extension Slice: ContiguousStorage where Base: ContiguousStorage {
  var storage: Span<Base.Element> { get }
}

No, this was an editing error. This was removed in an update (see the swift-evolution PR).

We've noted the load-related issues, and will make changes.

Karl · July 1, 2024, 7:18am

We can check alignment quite easily at runtime, so at least that aspect of the operation can be made safe.

And if you check alignment before you enter a loop (which you should do if you're using aligned loads), then add units of MemoryLayout<T>.stride to the pointer address, the compiler should be able to prove that all addresses are aligned and eliminate those runtime checks.

Karl · July 1, 2024, 8:01am

That's very interesting, and only increases my desire to see more of this potential future model. I recall that a long time ago, @dabrahams had a sketch of how segmented collections might be retrofitted on the current Collection protocol.

I have a lot of questions about the programming model and whether this could be an opportunity to introduce generator functions in to the language, but that's probably getting a bit too far beyond the scope of this proposal. Suffice to say, if Span is really going to be such a fundamental building-block - well, that completely changes how I look at the proposal.

dabrahams · July 1, 2024, 8:49am

That was a Swift realization of the work done ~25 years ago by Matt Austern in his paper Segmented Iterators and Hierarchical Algorithms. I've always thought that approach was incredibly promising. It goes well beyond dealing with contiguity to recognize the full hierarchical nature of many data structures (it even works for imposed hierarchy, as in blocked matrix multiplication). Haven't had a chance to look at this proposal but I really hope it's not too late to incorporate Matt's ideas. Any model that force-flattens hierarchical structure is going to suffer performance loss, and the leaves of a hierarchy aren't always contiguous.

lorentey · July 1, 2024, 6:32pm

I am of course aware of Austern's paper; it has seeded my mind with Ideas ever since you linked to it on these fora a decade or so ago. But it isn't my goal to faithfully port this work to Swift; my goal is to pervert it into something that aligns with the goals of Swift's roadmap for improving performance predictability -- i.e., the work that this effort is a part of.

I am taking the idea of a hierarchical iterator, and I'm deliberately misunderstanding it to mean exactly two levels, with simple contiguous buffers at the bottom. The problem at hand is a vexing performance issue; the solution I'm exploring is to significantly reduce abstraction, not to introduce more.

Sorry to disappoint!

(The pitch at hand is not about this though -- it is about the representation of those buffers.)

ajo · July 1, 2024, 10:17pm

This is really, really great!

Was about to comment on the need to avoid the bounds checking for performance - but that was covered too (was only missing it for Cursor under Future Directions).

var storage: Span

I hope will be changed to

var span: Span.

Since we are talking performance, I must ask how this will play with the SIMD-types? SIMD3 is still loaded as a SIMD4 from memory, right?

Also thoughts on endianness support in the API? Though that can easily and performantly be solved by a wrapper, but for a future Cursor it should probably be there as some file types/network protocols are still big or either endian (e.g TIFFs).

Minor things! This is all really good and well polished. Very excited to see it move forward.

glessard · July 1, 2024, 10:51pm

Endianness support is built in to the binary integer types:

  var r = UnsafeMutableRawPointer.allocate(byteCount: 8, alignment: 8)
  r.initializeMemory(as: Int.self, to: 1.bigEndian)
  r.load(as: Int.self)
// Int = 72057594037927936
  r.load(as: Int.self).bigEndian
// Int = 1

It's already here, but a proposal centered around parsing will probably have something more obvious to reach for.

The expectation is that Spans from SIMD types will force a copy to the stack so that there is an address to use. Obviously this will not be the efficient way to use SIMD types, but it is a way to have them interoperate with a variety of other code via Span.

dabrahams · July 2, 2024, 7:15am

I'm not sure you're doing that. AFAICT, you're adding new abstractions; there's almost no proposal that can take abstraction away. And obviously some specialization for contiguous storage at the leaves of hierarchy is needed. The question is, are you actually gaining (or losing) anything by being less general about hierarchy than you would without “perversion?”

glessard · July 3, 2024, 11:15pm

I wanted to point to revisions to the proposal document including, besides copy-editing, a revision to RawSpan's byte-loading API naming. These API now include the "unsafe" particle (unsafeLoad(as:), unsafeLoadUnaligned(as:), and unsafeView(as:),) with better documentation of requirements. We added a future directions section about a new layout constraint ensuring safe loading, or for a new pattern for the validation of loaded values.

The current state of the document is here, and the changes since the original post above can be seen here

oscbyspro · July 16, 2024, 9:30am

I'm echoing the sentiment that Span would be better served by a simple indices getter than assert and validate utility methods, especially given that it uses Int indices and none of the more exotic options discussed in the proposal document. I imagine something like this, and it already exists with an underscore:

extension Span {
    @inlinable public var indices: Range<Int> {
        Range(uncheckedBounds: (0, count))
    }
}

Unchecked range expressions are verbose, so it's a nice convenience. Moreover, it composes well, and index validation translates directly to ~= and Range/contains(_:):

A) span.indices ~= offset
B) span.indices.contains(offset)
C) span.validateBounds(offset)

Additionally, if range validation is important enough to warrant dedicated Span methods, perhaps a Range/contains(_:) extension is more versatile.

A) span.indices ~= offsets        // does not yet exist
B) span.indices.contains(offsets) // Range/contains(_:) should be O(1), where does the O(n) method even come from?
C) span.validateBounds(offsets)   // this is awkwardly limited to Span :(

Maybe there's an appropriate generic constraint that can house range validation? I tried to think of one, but I just ended up wishing that RangeExpression required some basic set operations.

glessard · July 16, 2024, 4:10pm

We are not including an indices computed property because we don't want to inadvertently constrain the design space for the future non-copyable collections work. In fact, the prototype has an underscored _indices property defined exactly as above.

A proper equivalent (and also for-loop support) will come as part of a future proposal.

We do need more conveniences on Range, as well as fix its current inconsistencies. The slow Range.contains(_:) method we currently have is an inadvertent mistake from the Regex additions, where it is defined as a general Collection extension.

benrimmington · July 16, 2024, 4:54pm

The proposal (and implementation) gives the extracting(droppingFirst:) methods a default argument of k: Int = 1.

The proposal (but not the implementation) also gives the four Span.extracting convenience methods a borrowing modifier.

glessard · July 16, 2024, 8:29pm

Thanks for pointing these out. The default parameter is left over from the function's previous life as droppingFirst(_ k: Int = 0) (it was the same implementation). The borrowing modifiers are the default for functions of non-escapable types, so they don't need to be stated. ~~The fix is upcoming~~ This has been fixed in the proposal and the implementation.