The ContiguousBytes protocol describes types that can provide access to their underlying bytes, such as unsafe buffer pointers, arrays, and Data. However, the current design of ContiguousBytes does not work with the Span family of types, because it assumes that the Self type is both Copyable and Escapable. This proposal generalizes ContiguousBytes to support non-copyable and non-escapable types, makes InlineArray and the various Span types conform to it, and provides a safe counterpart to the withUnsafeBytes requirement of ContiguousBytes.
I propose updating ContiguousBytes to support Span et al by changing its definition to this:
public protocol ContiguousBytes: ~Escapable, ~Copyable {
  /// Calls the given closure with the contents of underlying storage.
  ///
  /// - note: Calling `withUnsafeBytes` multiple times does not guarantee that
  ///         the same buffer pointer will be passed in every time.
  /// - warning: The buffer argument to the body should not be stored or used
  ///            outside of the lifetime of the call to the closure.
  func withUnsafeBytes<R>(_ body: (UnsafeRawBufferPointer) throws -> R) rethrows -> R

  /// Calls the given closure with the contents of underlying storage.
  ///
  /// - note: Calling `withBytes` multiple times does not guarantee that
  ///         the same span will be passed in every time.
  func withBytes<R, E>(_ body: (RawSpan) throws(E) -> R) throws(E) -> R
}
... and then making the Span, MutableSpan, RawSpan, MutableRawSpan, UTF8Span, and InlineArray types conform to it.
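For context, here is a minimal sketch of how a generic API consumes the existing withUnsafeBytes requirement today. The `byteSum` helper is hypothetical and exists only to illustrate the pattern. Note that its generic parameter `T` is implicitly Copyable and Escapable, which is the same assumption this proposal lifts on the protocol itself.

```swift
import Foundation

// Hypothetical helper (not part of the proposal): sums every byte of any
// value whose type conforms to ContiguousBytes.
// `T` is implicitly Copyable and Escapable here.
func byteSum<T: ContiguousBytes>(_ value: T) -> Int {
    value.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> Int in
        // `buffer` is only valid inside this closure; it must not be stored.
        var total = 0
        for byte in buffer {
            total += Int(byte)
        }
        return total
    }
}

// Both Array<UInt8> and Data already conform to ContiguousBytes.
let fromArray = byteSum([1, 2, 3] as [UInt8])
let fromData = byteSum(Data([10, 20]))
```

The proposed withBytes requirement would play the same role, but hand the closure a RawSpan instead of an unsafe pointer.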
Seems straightforwardly good. It occurs to me that we should probably figure out some sort of overarching strategy for this and withContiguousStorageIfAvailable in terms of which one is “preferred” (i.e. checked-for first) in situations where we’re dynamically discovering which fast paths are available.
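To make the "which one is checked first" question concrete, here is a rough sketch of dynamic fast-path discovery using only APIs that exist today. The `countZeroBytes` helper and the ordering it uses (withContiguousStorageIfAvailable first, then ContiguousBytes, then plain iteration) are assumptions for illustration, not a recommendation from the proposal.

```swift
import Foundation

// Hypothetical helper: counts zero bytes, trying the fast paths first.
func countZeroBytes<S: Sequence>(in bytes: S) -> Int where S.Element == UInt8 {
    // 1. Standard-library fast path: typed contiguous storage, if available.
    let fast = bytes.withContiguousStorageIfAvailable {
        (buffer: UnsafeBufferPointer<UInt8>) -> Int in
        var zeros = 0
        for byte in buffer where byte == 0 { zeros += 1 }
        return zeros
    }
    if let fast {
        return fast
    }

    // 2. Foundation fast path: dynamic cast to the ContiguousBytes existential.
    if let contiguous = bytes as? any ContiguousBytes {
        return contiguous.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int in
            var zeros = 0
            for byte in raw where byte == 0 { zeros += 1 }
            return zeros
        }
    }

    // 3. Slow path: element-by-element iteration.
    var zeros = 0
    for byte in bytes where byte == 0 { zeros += 1 }
    return zeros
}
```

An Array or Data argument takes the first branch, while something like a lazy map over a range falls through to the last one. Whatever order we settle on, it would be good to document it so that fast paths behave consistently across libraries.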
Thanks for thinking through this! Overall, this looks like a good direction to me for making the existing ContiguousBytes protocol better in a world where we’re moving towards span types. I think there’s still a discussion to be had, one that the span proposals alluded to, about what new APIs should use to accept bytes as input (since ContiguousBytes has its issues as well and we may want to move to a world where APIs just use RawSpan directly, or where there is some automatic conversion to RawSpan). That being said, I don’t think that precludes us from improving ContiguousBytes via your proposal here to help in cases where ContiguousBytes is already used today. Overall, just a few general questions:
Availability
Will these new declarations (either the requirement or the default implementation) come with availability? I assume we can define the default implementation as @_alwaysEmitIntoClient, so it wouldn’t come with availability, but what about the protocol requirement? Because the default implementation is always available, will the requirement not have availability either, or are there availability constraints on Darwin platforms to keep in mind for adoption here?
withUnsafeBytes Typed Throws Adoption
Your implementation PR also updates the ContiguousBytes.withUnsafeBytes requirement to adopt typed throws, but I don’t think I see that mentioned in the proposal. Just to clarify, is that also a proposed change here, and are there any source or ABI compatibility impacts when doing so that we should be aware of?
OutputSpan
Should we also be adding a conformance to Output(Raw)Span? It’s simple enough to take an output span and call .span to get a Span which would conform, but the same argument can be made for types like InlineArray and UTF8Span, which are gaining a conformance here, so would it be consistent to include OutputSpan?
I have a minor nit with the generic signature for withBytes() especially if it is to be codified in ABI: we should make sure that the order of the parameters in the angle brackets is the same as the order in the declaration. This has prevented us from updating the syntax of some stdlib functions to use anonymous generic types (some Protocol) in the past. I have no idea what syntax updates could occur here, but let’s not let this mistake occur again!
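To illustrate the concern as I understand it, here is a small sketch with made-up functions (`pairCount` and `pairCountOpaque` are hypothetical). With opaque parameter types, the implicit generic parameters follow the order of the parameter list, so an explicit generic signature can only be rewritten that way without changing the signature if its angle-bracket order already matches.

```swift
// Explicit generic parameters: the angle-bracket order is <T, U>, but the
// parameter list mentions U first and T second.
func pairCount<T: Sequence, U: Collection>(_ first: U, _ second: T) -> Int {
    first.count + second.underestimatedCount
}

// Opaque parameter types: the implicit generic parameters are introduced in
// parameter-list order, so this corresponds to <U, T> rather than <T, U>.
func pairCountOpaque(_ first: some Collection, _ second: some Sequence) -> Int {
    first.count + second.underestimatedCount
}
```

Both functions behave identically at the source level; the difference is only in the order of the generic parameters. In `withBytes<R, E>`, the declaration mentions `E` (in `throws(E)`) before `R`, which is the ordering mismatch in question.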
I believe we should actually take this opportunity to move the protocol to the standard library. It is fundamentally valuable to be able to tell in advance if a particular sequence (or other value) has contiguous storage, which enables various fast paths. Having the protocol up in Foundation forces a dependency on a rather large library.
The Span, MutableSpan, and InlineArray types will conditionally conform to ContiguousBytes when the element type is UInt8, just like Array and Unsafe(Mutable)BufferPointer already do:
Should the new (and existing) conformances use Element: BitwiseCopyable instead? (Trivial is mentioned in some FIXME comments.)
The protocol requirement needs to have availability constraints for the Swift 6.3 runtime (or wherever this lands), as do the new conformances of Span et al to ContiguousBytes. Everything else follows the availability of the types, i.e., Swift 6.2-aligned for InlineArray and Swift 5.1-aligned for the back-deployment targets of Span et al.
The pull request only does this for Embedded Swift, where we need to use typed throws. I didn't think we needed to cover that in the proposal.
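For anyone following along, here is a rough sketch of what the typed-throws shape looks like from the caller's side, assuming a Swift 6 toolchain. `withBytesTyped`, `firstByte`, and `ParseError` are hypothetical names for illustration only; this is not the code in the pull request.

```swift
enum ParseError: Error {
    case empty
}

// Hypothetical free function mirroring the typed-throws shape
// `(…) throws(E) -> R`, implemented on top of Array.withUnsafeBytes.
func withBytesTyped<R, E: Error>(
    of bytes: [UInt8],
    _ body: (UnsafeRawBufferPointer) throws(E) -> R
) throws(E) -> R {
    // Capture the outcome in a Result so the error keeps its concrete type
    // even if the underlying call only offers `rethrows`.
    let result = bytes.withUnsafeBytes {
        (buffer: UnsafeRawBufferPointer) -> Result<R, E> in
        do throws(E) {
            return .success(try body(buffer))
        } catch {
            return .failure(error)
        }
    }
    switch result {
    case .success(let value): return value
    case .failure(let error): throw error
    }
}

// The caller can declare a concrete error type; no `any Error` existential
// is involved anywhere in this call chain.
func firstByte(of bytes: [UInt8]) throws(ParseError) -> UInt8 {
    try withBytesTyped(of: bytes) {
        (buffer: UnsafeRawBufferPointer) throws(ParseError) -> UInt8 in
        guard let byte = buffer.first else { throw ParseError.empty }
        return byte
    }
}
```

Avoiding the `any Error` existential is what makes this shape usable in Embedded Swift.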
I had forgotten that OutputSpan had a span property for the already-initialized values. I'll add it.
I don't think this applies to any of the generic parameters I've added; they all need to be referenced in multiple places and none of the protocols here have primary associated types.
It's also that some types that conform to ContiguousBytes couldn't provide a borrowing bytes property even if we could express it now. See the first section of Alternatives Considered for more.
I agree that having to pull in Foundation(Essentials) to get this protocol and its conformances is a fairly heavy dependency. However, I think this protocol might be an evolutionary dead-end, and that more APIs should take RawSpan directly rather than a ContiguousBytes-conforming type.
They can't, because BitwiseCopyable is a marker protocol and one cannot define a conditional conformance whose requirements involve a marker protocol (because we could not match them at runtime).
I suppose that the concrete withBytes I've been adding to all of these types could use the more general BitwiseCopyable constraint, although I don't know if that buys us all that much when the span property is right there.
(On that note, is there a generic way to express “this type has a bytes: RawSpan or a span: Span<Element> property”? Perhaps that's the protocol we need here?)
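To make the marker-protocol restriction concrete, here is a small sketch with stand-in types, assuming a Swift 6 toolchain for BitwiseCopyable. `ByteVending`, `Box`, and `rawByteCount` are hypothetical names and not part of the proposal.

```swift
// Hypothetical stand-in for a protocol like ContiguousBytes.
protocol ByteVending {
    func byteCount() -> Int
}

// Hypothetical stand-in for a generic container like Span or InlineArray.
struct Box<Element> {
    var items: [Element]
}

// Not expressible, per the restriction described above: a conditional
// conformance whose requirement involves a marker protocol.
// extension Box: ByteVending where Element: BitwiseCopyable { ... }

// Expressible: a same-type requirement, which can be matched at runtime.
extension Box: ByteVending where Element == UInt8 {
    func byteCount() -> Int { items.count }
}

// Also expressible: a generic function (not a conformance) constrained on
// BitwiseCopyable, which is the shape a concrete withBytes could take.
func rawByteCount<T: BitwiseCopyable>(of items: [T]) -> Int {
    items.count * MemoryLayout<T>.stride
}
```

So the generalization is available to concrete methods, just not to the protocol conformances themselves.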