[Pitch] Safe Access to Contiguous Storage

still better than the full name of, say, UMRBP.

Very nice!

Minor notes follow below.

  • In the ContiguousStorage protocol, I recommend dropping the View suffix from the name of the storageView property:

    public protocol ContiguousStorage<Element>: ~Escapable {
      associatedtype Element: ~Copyable & ~Escapable
    
      var storage: borrow(self) StorageView<Element> { _read }
    }
    

    This would be a better fit with established practice in the stdlib for such properties (e.g., String.utf8, String.unicodeScalars etc.), and it helps reduce unnecessary verbosity at the point of use.

  • I suggest moving StorageViewIndex to a nested type, StorageView.Index. Given that the index is also generic over the same Element, it seems preferable to have only one way to spell it. Nesting the type under StorageView will also help keep the top-level namespace clean of auxiliary constructs.

  • Similarly, StorageViewIterator ought to be defined directly as StorageView.Iterator. (But see below.)

  • The attributes of borrowing iterators are not defined yet; however, I think it is unlikely we'd want to reuse the name Iterator for borrowing iteration, as it would lead to clashes. (The existing Iterator name in Sequence is semantically a consuming iterator, which cannot be implemented by this construct.) Additionally, I suspect we may not want the iterator to be (implicitly) copyable.

  • The MyResilientType example uses a withStorageView method that does not exist. More importantly, it may be a good idea to recommend making the core interface public (as opposed to @usableFromInline):

    extension MyResilientType {
      // public API
      @inlinable
      public func essentialFunction(_ a: some ContiguousStorage<Element>) -> Int {
        self.essentialFunction(a.storage)
      }
    
      // ABI boundary
      public func essentialFunction(_ a: StorageView<Element>) -> Int {
        ...
      }
    }
    

    When applied to HypotheticalBase64Decoder, this leads to the following:

    extension HypotheticalBase64Decoder {
      @inlinable
      public func decode(bytes: some ContiguousStorage<UInt8>) -> [UInt8] {
        decode(bytes: bytes.storage)
      }
    
      public func decode(bytes: StorageView<UInt8>) -> [UInt8] {
        ...
      }
    }
    

    This does increase the API surface (so it may not be appropriate in all situations), but it is generally a good idea to expose the underlying primitives as directly as possible. That way the API docs highlight the direct path, and clients who just want to pass a StorageView instance are not forced through layers of unnecessary abstraction.

  • The withUnsafeBufferPointer escape hatch, which gives scoped access to a UBP, has far more utility than simply enabling C interop -- I think it should be fully embraced as primary API. (It goes hand in hand with the unsafe initializers.)

  • We'll probably want StorageView to implement this missing method from the existing Collection protocol:

    public func index(_ i: Index, offsetBy distance: Int, limitedBy limit: Index) -> Index?
    

    This method has some design problems, and it isn't at all critical for types with strideable indices, but if we provide the rest of the index navigation methods, it seems weird to omit this one.
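
    To make this concrete, here is a minimal sketch, assuming a hypothetical stand-in type. DemoView is not the pitched StorageView: it is backed by an Array copy purely so that the snippet runs today. It shows a nested Index type and one plausible way to write the limitedBy variant, following the usual Collection semantics where the limit only applies in the direction of travel:

```swift
// Hypothetical stand-in for the pitched StorageView (an assumption for
// illustration only; the real type would borrow storage, not copy it).
struct DemoView<Element> {
  // Nested index type, spelled DemoView.Index rather than DemoViewIndex.
  struct Index: Comparable {
    var offset: Int
    static func < (lhs: Index, rhs: Index) -> Bool { lhs.offset < rhs.offset }
  }

  private let elements: [Element]
  init(_ elements: [Element]) { self.elements = elements }

  var startIndex: Index { Index(offset: 0) }
  var endIndex: Index { Index(offset: elements.count) }

  subscript(i: Index) -> Element {
    precondition(i >= startIndex && i < endIndex, "index out of bounds")
    return elements[i.offset]
  }

  func distance(from start: Index, to end: Index) -> Int {
    end.offset - start.offset
  }

  func index(_ i: Index, offsetBy distance: Int) -> Index {
    Index(offset: i.offset + distance)
  }

  // Returns nil when moving `distance` steps from `i` would pass `limit`.
  // A limit lying on the opposite side of `i` has no effect.
  func index(_ i: Index, offsetBy distance: Int, limitedBy limit: Index) -> Index? {
    let l = self.distance(from: i, to: limit)
    if distance > 0 ? (l >= 0 && l < distance) : (l <= 0 && distance < l) {
      return nil
    }
    return index(i, offsetBy: distance)
  }
}
```

    With strideable indices the method is little more than a range check, which is why it seems cheap to include alongside the other navigation methods.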

5 Likes

The reason I suggested "Array" was actually to avoid linking those types :sweat_smile:

I don't terribly mind it, but we should consider carefully whether we want the names to suggest a relationship to unsafe buffers, or whether we would rather encourage a conceptual relationship to Array.

This is effectively the safe replacement for (most uses of) UnsafeBufferPointer, so the name seems appropriate to me.

4 Likes

Yeah I'm not suggesting that it's inappropriate, I'm suggesting that there are multiple suitable candidates.

For instance, what is an Array in Swift? It's a safe, owned buffer with value semantics. This would be a safe, borrowed buffer (and I believe the intention is that we use exclusivity to suppress the spooky-modifications-at-a-distance which are the hallmark of reference semantics). So BorrowedArray also wouldn't be inappropriate IMO.
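
To illustrate the contrast using only API that exists today, here is a small sketch (the function name aliasingDemo is made up for this example). An Array copy is independent of the original, whereas a copy of an unsafe buffer pointer aliases the same memory, which is exactly the action-at-a-distance a borrowed type would rule out:

```swift
// Illustrative helper; the name is hypothetical.
func aliasingDemo() -> (arrayOriginal: Int, bufferOriginal: Int) {
  // Array: value semantics. Mutating the copy leaves the original alone.
  let a = [1, 2, 3]
  var b = a
  b[0] = 99

  // Unsafe buffer: copying the pointer does not copy the memory.
  let storage = UnsafeMutableBufferPointer<Int>.allocate(capacity: 3)
  defer { storage.deallocate() }
  storage.initialize(repeating: 1)
  let alias = storage
  alias[0] = 99   // writes through to the memory `storage` also points at

  return (arrayOriginal: a[0], bufferOriginal: storage[0])
}
```

Here the array's first element is unchanged after the copy is mutated, while the write through the alias is visible through the original buffer pointer.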

The thread about enabling bounds-checking in UBP has shown that some people habitually drop the "unsafe" prefix when discussing that type (and just call it a "buffer pointer"), leading to severe misunderstandings about how that type behaves. That's why I'm apprehensive about calling anything else a "buffer".

2 Likes

For me, using Array in the type name would hit the ear wrong for a couple of reasons:

  • If I'm getting it from something that isn't explicitly an Array, then invoking the term "array" feels misleading. Today, there are places in Swift where—because something takes or returns an Array specifically—you end up making copies of things to fit it into a certain API. If I go from a String.UTF8View to a BorrowedArray, a reader might think "why is an 'array' involved here; did it somewhere make a copy of the UTF8View into an Array and I'm borrowing that?" On the other hand, BorrowedBuffer/StorageView/BufferView all feel more clearly like they're providing a view over that underlying storage directly.

  • If I'm working specifically with Array types, then BorrowedArray feels sort of too similar to ArraySlice, which I think could also lead to confusion about which one to use in each circumstance.
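
As a rough illustration of the copy-versus-view distinction with API available today (the function names below are made up for this example), compare reading a string's UTF-8 bytes in place against first copying them into an Array. This is only a sketch: withContiguousStorageIfAvailable may return nil for strings that are not stored as contiguous UTF-8, hence the fallback.

```swift
// Reads the bytes in place when possible; no Array is created.
func byteSum(_ text: String) -> Int {
  let fast = text.utf8.withContiguousStorageIfAvailable { buffer in
    buffer.reduce(0) { $0 + Int($1) }
  }
  // Fallback for strings without contiguous UTF-8 storage.
  return fast ?? text.utf8.reduce(0) { $0 + Int($1) }
}

// Copies the bytes into an Array just to satisfy an Array-shaped API.
func byteSumViaArrayCopy(_ text: String) -> Int {
  let copy = Array(text.utf8)   // allocates and copies every byte
  return copy.reduce(0) { $0 + Int($1) }
}
```

Both functions compute the same result; the difference is whether an intermediate Array is materialized, which is the impression a name like BorrowedArray could wrongly give.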

10 Likes

+1, that leads my mind in the right direction conceptually too.

A separate reflection: perhaps The Swift Programming Language book will need appendices over time that cover concepts such as memory ownership in more depth. There seems to be some tension now between staying approachable to new users and covering “everything” (memory ownership, concurrency details, macros, and more). I often end up hunting across various sources (pitch threads, the actual proposals, discussions about the proposals, etc.), as critical nuggets are often sprinkled around. Looking forward to when the LLMs catch up with the latest discussions... :slight_smile:

3 Likes

Related to this point, it's worth considering whether we might want "borrowed" versions of other standard library types.

For instance, a recent pitch suggested that we might add a type called Unicode.UTF8.ValidBufferView for text processing over non-owned storage. What if we instead called that a BorrowedString?

There would be some nice symmetry to having both BorrowedArray and BorrowedString, and I think having that as a more established pattern could minimise some of your concerns.

Anyway, I don't want to labour the point too much because my actual opinion is that BorrowedBuffer is also fine. But it's worth thinking about these things.

3 Likes

    protocol ContiguousStorage<Element>: ~Escapable

So a noncopyable type could not conform to ContiguousStorage? Is this a necessary restriction?

Also, this pitch and its sister proposal seem to imply a lifting of restrictions around generics and suppressed constraints.

1 Like

Yes, there's a forthcoming proposal that covers that.

3 Likes

No, this should also allow noncopyable types. I forgot to include the annotation in the pitch. Thanks for pointing it out!

3 Likes