Re-pitch: `ContiguousCollection`

Hi everyone,

Here's a re-pitch of a deferred part of SE-0237, recommending adding protocols that capture the ability to guarantee access to an underlying buffer.


Introduce {Mutable}ContiguousCollection protocol

  • Proposal: SE-NNNN
  • Authors: Ben Cohen
  • Review Manager: TBD
  • Status: Awaiting implementation
  • Previous Proposal: SE-0237

Introduction

This proposal introduces two new protocols: ContiguousCollection, which refines Collection, and MutableContiguousCollection, which refines MutableCollection. Both provide guaranteed access to an underlying unsafe buffer.

Motivation

SE-0237 introduced two new methods, withContiguous{Mutable}StorageIfAvailable, to allow generic algorithms to test for and use an underlying unsafe buffer when it is available. This has significant performance benefit for certain algorithms, such as sort, which can achieve big speedups when they have access to the unsafe buffer, but can still operate without that fast pathif needed.

There is another class of operation that only wants to be available when there is a fast path. A good example would be a Swift-friendly wrapper for the vDSP suite of algorithms.

For example, you might want to write a convenient wrapper for vDSP_vadd.

// note this is **not** a proposal about vDSP wrappers, this is just a
// simplified example :)
func dspAdd<A: Collection, B: Collection>(
  _ a: A, _ b: B, _ result: inout [Float]
) where A.Element == Float, B.Element == Float {
  let n = a.count
  // try accessing contiguous underlying buffers:
  let wasContiguous: ()?? =
    a.withContiguousStorageIfAvailable { abuf in
      b.withContiguousStorageIfAvailable { bbuf in
        vDSP_vadd(abuf.baseAddress!, 1, bbuf.baseAddress!, 1, &result, 1, UInt(n))
      }
  }
  // if they weren't contiguous, create two arrays try again
  if wasContiguous == nil || wasContiguous! == nil {
    dspAdd(Array(a), Array(b), &result)
  }
}

This follows a similar pattern to sort: provide a fast path when available, but fall back to a slower path when it isn't.

But in the case of functions like vDSP_vsaddi this is very much the wrong thing to do. These functions often operate on a thin (but very material) performance edge over their open-coded equivalent, and allocating and initializing two arrays purely to be able to call it would probably vastly outweigh the speed benefits gained by using the function instead of a regular loop. This encourages misuse by the caller, who might not realize they are getting worse performance than if they reorganized their code.

Trapping on non-contiguous inputs would flag the problem more clearly to the user. In the case of the "return" buffer argument, if we wanted to make that generic, trapping is the only option as we cannot just convert that type to an array and rerun. But ideally we would use Swift's type system to enforce this
instead, guiding the user to a better solution.

Proposed solution

Introduce two new protocols which guarantee access to a contiguous underlying buffer.

/// A collection that supports access to its underlying contiguous storage.
public protocol ContiguousCollection: Collection
where SubSequence: ContiguousCollection {
  /// Calls a closure with a pointer to the array's contiguous storage.
  func withUnsafeBufferPointer<R>(
    _ body: (UnsafeBufferPointer<Element>) throws -> R
  ) rethrows -> R
}

/// A collection that supports mutable access to its underlying contiguous
/// storage.
public protocol MutableContiguousCollection: ContiguousCollection, MutableCollection
where SubSequence: MutableContiguousCollection {
  /// Calls the given closure with a pointer to the array's mutable contiguous
  /// storage.
  mutating func withUnsafeMutableBufferPointer<R>(
    _ body: (inout UnsafeMutableBufferPointer<Element>) throws -> R
  ) rethrows -> R
}

Conformances will be added for the following types:

  • Array, ArraySlice and ContiguousArray will conform to MutableContiguousCollection
  • UnsafeBufferPointer will conform to ContiguousCollection
  • UnsafeMutableBufferPointer will conform to MutableContiguousCollection
  • Slice will conditionally conform:
    • to ContiguousCollection where Base: ContiguousCollection
    • to MutableContiguousCollection where Base: MutableContiguousCollection

Detailed design

The introduction of these protocols allows an appropriate constraint that would prevent a user passing a Range or Repeating collection into our dspAdd function. It also allows an easy path to a generic result buffer instead of a concrete array; this is important as often these functions are used in a tiled mode where you would want to repeatedly pass in an array slice. As a nice side-benefit, it also cleans up the function implementation:

func dspAdd<A: ContiguousCollection, B: ContiguousCollection, R: MutableContiguousCollection>(
  _ a: A, _ b: B, _ result: inout R
) where A.Element == Float, B.Element == Float, R.Element == Float {
  let n = a.count
  a.withUnsafeBufferPointer { abuf in
    b.withUnsafeBufferPointer { bbuf in
      result.withUnsafeMutableBufferPointer { rbuf in
        vDSP_vadd(abuf.baseAddress!, 1, bbuf.baseAddress!, 1, rbuf.baseAddress!, 1, UInt(n))
      }
    }
  }
}

Source compatibility

These are additive changes and do not affect source compatibility.

Effect on ABI stability

These are additive changes of new protocols and so can be introduced in an ABI-stable way. On platforms that have declared ABI stability, they will need to have availability annotations.

Effect on API resilience

N/A

Alternatives considered

This is a re-pitch of these protocols. They originally appeared in SE-0237, but were deferred pending further use cases. This proposal is motivated by such cases.

8 Likes

3 posts were split to a new topic: OS versus Swift availability

+1 to the idea.

There's also a significant issue with SE-0237 that might be worth addressing at the same time. withContiguousStorageIfAvailable only offers typed pointer access, which means Unsafe(Mutable)RawBufferPointer and Foundation.Data can't implement it. We should include a withContiguousBytesIfAvailable function on Sequence to address that. See discussion on this PR.

Similarly, is it worth making ContiguousCollection inherit from a parent ContiguousBytes protocol for URBP/Foundation.Data/SwiftNIO.ByteBufferView/etc support?

1 Like

ContiguousBytes and withContiguousBytesIfAvailable are worth considering, but in a different pitch. They have very different use cases.

If we had a ContiguousBytes protocol, wouldn't it make sense for ContiguousCollection to refine it and implement it by providing an untyped view of the buffer? Adding a parent protocol is an ABI-breaking change IIRC, so if any Apple platforms ship with this protocol we could never add the refinement.

I'm just saying that since we added withContiguousStorageIfAvailable, my experience is that many important types can't support it without significant rewriting (if at all) because they use raw pointers, and if we started using it in the standard library, those types would miss important fast-paths. We shouldn't make the same mistake of ignoring raw pointers this time, too.

Even though it’s possible to make ContiguousCollection refine ContiguousBytes, it isn’t especially compelling to. Anything providing an unsafe buffer can also trivially provide unsafe bytes at no runtime cost, so adding witness tables to dynamically dispatch to such a method isn't necessary, and would come at the cost of code size and compile times.

1 Like

Along these lines, when ContiguousBytes / RawContiguousCollection is introduced, ContiguousCollection could have a default conformance and (statically dispatched) implementation. Anything that can provides a UBP<Element> can provide a URBP. CC @Andrew_Trick in case I got this wrong.

Unless this is semantically wrong, I agree with @Ben_Cohen that it wouldn't deserve a witness table entry.

I didn't see it explicitly mentioned. Is the intention that conformers to ContiguousCollection can get a default implementation of withContiguousStorageIfAvailable, overriding the current one that returns nil?

Good point, yes. Have update the proposal to make this explicit.

ContiguousCollection could conditionally conform to ContiguousBytes where Element is trivial, providing a default implementation. That isn't the issue though. I agree with @Karl that forcing ContiguousCollections to vend typed pointers makes it incompatible with many important "contiguous collections". Data and any of the various byte buffer types built on top of a raw pointer will never be able to conform. More generally, this won't work for any contiguous collection that supports reinterpreted pointers. If I have a collection that is a "view" of Element over some foreign storage, it can't conform to ContiguousCollection<Element>.

On the other hand, vending a typed pointer is the only way to enable seemless interop with C APIs. If the sole purpose of this protocol is allow passing a collection to a C API, then it's probably the right design. (If it's supposed to serve some other purpose then I think it's the wrong long term design).

That begs the question, how do I pass a byte buffer to such a C API?

  • if the C API takes void * it just works

  • if the C API (incorrectly) takes char *, create a C wrapper over
    it that takes void *.

  • if the C API actually performs typed operations on the pointer (float *),
    then we need a wrapper type that "owns" a unique instance of
    the buffer and guarantees that no other typed pointers can be
    exposed. That wrapper type can internally
    bindMemory/assumeMemorybound, and it can conform to
    ContiguousCollection<T>.

3 Likes

The motivation in the pitch was dspAdd. Do you think this pitch's approach for dspAdd is the wrong long term design?

edit: grammar

I think this is the best way to drop down to a concrete pointer type, e.g. to call vDSP_vadd. When you need to share pointers with C, I don't see a better way.

If, on the other hand, you only called intrinsics that took Swift SIMD value types, then the memory operations could be safely abstracted, either behind an abstract Pointer or through a concrete ReinterpretedPointer that always uses load(fromByteOffset:as:). In that case, the only need for something like the ContiguousCollection protocol would be to give you guaranteed fast pointer arithmetic (even without full specialization).

My prior comment was meant as a caution not think if the proposed ContiguousCollection as a fully general collection with contiguous storage.

Oh yeah, I just remembered that Foundation already has a ContiguousBytes protocol.

The reason why I think this is on-topic is because we appear to have several, disjoint abstractions for expressing contiguous storage. How should we teach this?

  • For 3rd-party types: which protocols/customisation points should you implement?
  • For 3rd-party algorithms: if you can benefit from contiguous storage access, what should your generic/existential constraints be?

I'm not sure how to answer those questions, but I think ContiguousBytes is too fundamental to live in Foundation. It is very likely to be duplicated by the standard library at some point.

2 Likes