[Pitch] Noncopyable Standard Library Primitives

Reading buffer.extracting(5..<10) to mean "copying or moving of (potentially noncopyable!) elements at positions 5, 6, 7, 8, 9 into some new storage, conjured into existence in some vague, unspecified way" is certainly one possible way to interpret this call.

However, I think this reading is close enough to the actual meaning that, to me, it confirms this is a good name for the operation, and one that will not cause any real confusion.

I don't think anyone will be surprised when they find out what extracting actually means. (When I encounter a new API, I do this "finding out" part by quickly looking up its API docs.)

To be honest, I don't think it is at all sensible to expect that API names will eliminate all room for misinterpretation. Absolutely every name can be and will be misunderstood by somebody.

For example, the name "slice" is similarly misleading, as we could've applied it equally well to a myriad different constructs. But I don't think that matters one bit -- Slice is nevertheless an eminently wonderful name for our specific, copyable "view into a subsequence of elements of another collection" construct.

Of course, an important goal of these pitches is to explore alternatives to the proposed names, and to argue about the semantics of each operation, including whether or not we need to provide it at all.

But can we please not fight against having more names like Slice?

Even longer, poorly edited rant on this subject

The whole idea that names need to precisely explain the concepts that they're labeling feels absurd to me. It feels like it's a twisted misreading of Swift's API Design Guidelines; I think it's a menace and I believe we need to stop pretending it's an actual requirement.

All it achieves is that it encourages labeling things with definitional/explanatory phrases, which (in my view) is an extremely poor naming scheme for concepts that we will need to routinely use and mention.

I'm wearing a pair of things we call shoe; we aren't calling them coveringForTheFootWithASturdySoleNotReachingAboveTheAnkle.

Good names are easy to remember! Good names roll off the tongue! Good names do not make any attempt to define what they are labeling -- they merely vaguely gesticulate towards the general direction of the thing. (I consider String and Thread to be top-notch names, but they provide precious little (if any) actual indication of the constructs they're symbols for. The name "byte" is even better; it is brave enough to get weird, to great success.)

There is this thing called "learning" that humans are apparently pretty good at. If I encounter the word "gigue" in an article, the context may give me some vague clue about what it means. If I want to be sure, I simply look it up in a dictionary. Afterwards, there is a good chance I'll remember it, and I won't need to look it up every time I see it.

</rant>

With that off my chest, concrete suggestions for alternatives are of course very welcome. It is also fair to suggest that some of the operations aren't necessary.

✻ ✻ ✻

For what it's worth, I chose "extract" as the root of the name here simply because it was the most obvious name I could think of. This is the label I've always used for operations that build a new container out of parts of another, especially another of the same type. (Random examples: _NativeSet.extractSubset, Rope.extract.) Using the word "extract" in this sense is very familiar to me. Do y'all have a more obvious label for this operation? (Apologies, but I don't think sliceAndRebase works.)

All too often, I find myself having to construct an Unsafe[Mutable]BufferPointer from a range of items inside a larger buffer, and I always felt that UnsafeMutableBufferPointer(rebasing: foo[i..<j]) was way too clumsy a spelling for it.
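To make the clumsiness concrete, here is a minimal sketch that compares the existing slice-then-rebase spelling with a direct member function. The helper name subBuffer is hypothetical and exists only for this illustration (it is not the name or the implementation proposed in the pitch); it simply wraps the init(rebasing:) initializer that already exists today.

```swift
extension UnsafeMutableBufferPointer {
  /// Hypothetical helper (the name `subBuffer` is made up for this sketch).
  /// Returns a zero-based buffer pointer over `bounds` of this buffer.
  func subBuffer(_ bounds: Range<Int>) -> UnsafeMutableBufferPointer<Element> {
    // Slicing gives a Slice that shares indices with `self`;
    // `init(rebasing:)` turns it back into a standalone buffer pointer.
    UnsafeMutableBufferPointer(rebasing: self[bounds])
  }
}

func rebasingDemo() -> (verbose: [Int], concise: [Int]) {
  let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 10)
  defer { buffer.deallocate() }  // Int is trivial, so no deinitialization is needed
  _ = buffer.initialize(from: 0 ..< 10)

  // The existing spelling: slice first, then rebase.
  let verbose = UnsafeMutableBufferPointer(rebasing: buffer[5 ..< 10])
  // The same sub-buffer through the hypothetical helper.
  let concise = buffer.subBuffer(5 ..< 10)

  // Copy the elements out before the `defer` frees the memory.
  return (verbose: Array(verbose), concise: Array(concise))
}
```

Both spellings refer to the same five elements. Note that this sketch still goes through Slice internally, so it only works for copyable elements; it shows the shape of the call site, not a way to support noncopyable ones.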

Given that I do not expect that it will be possible to generalize U[M]BP's existing slicing subscript for noncopyable elements*, I jumped on the opportunity to provide a direct spelling for this very common operation. Unlike slicing, this operation does make sense for buffers of noncopyables, so given its universal usefulness, it seems worthy of a good name.

Footnote on the impossibility of slicing containers of noncopyables

The concept of a "collection slice", as embodied in the standard Slice type and (in particular) the elegant slicing notation foo[i..<j], relies heavily on copyability: the slice physically owns a full copy of the collection, and it uses this copy to implement both read-only and mutating accesses, all within a single type.

var array = Array(0 ..< 20)
print(array[0 ..< 10]) // borrowing-style read-only access
array[0 ..< 10].sort() // ostensibly in-place mutation (actually not!)
array[0 ..< 10] = array[5 ..< 15] // overlapping range assignment

This universal form of slicing will not carry over to containers of noncopyable elements, at least not without a complete overhaul and/or reduction of its expressive power.
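As a small illustration of the "slice owns a full copy" point, the sketch below builds a Slice by hand with its public init(base:bounds:) initializer and then mutates the original array. The function name and the returned tuple are just scaffolding for the example.

```swift
func sliceOwnsACopy() -> (sliceValue: Int, arrayValue: Int, baseCount: Int) {
  var array = Array(0 ..< 20)

  // The slice stores its own copy of the whole base collection,
  // plus the bounds it is restricted to.
  let slice = Slice(base: array, bounds: 5 ..< 10)

  // Mutating the original afterwards does not affect the slice.
  array[5] = -1

  return (
    sliceValue: slice[5],        // still the old value, 5
    arrayValue: array[5],        // the new value, -1
    baseCount: slice.base.count  // 20: the slice holds the entire collection
  )
}
```

This only works because the base collection can be copied into the slice; with noncopyable elements there is nothing equivalent for the slice to store.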

For noncopyable types, Slice does not seem to be a viable abstraction. (To be precise, the Slice type could in theory be generalized into a consuming noncopyable slicing construct. I strongly suspect we will not do that -- if we decide that the idea of a consuming slice has merit, it would be a better idea to build it from scratch, free from nightmarish ABI/source compatibility issues. We will know better soon, once we become ready to talk about noncopyable container abstractions.)

We aren't ready to talk about which aspects of slicing will merit figuring out a noncopyable generic solution. FWIW, I suspect it would be desirable to at least implement a reusable/generic borrowing slice construct.

</footnote>

Omitting this operation altogether is an idea worth considering, too, despite its usefulness.

Over the years, UnsafeBufferPointer has crept into a bunch of public API, from Collection.withContiguousStorageIfAvailable to Array(unsafeUninitializedCapacity:initializingWith:), and this has effectively turned it into a sort of universal interface type that it has no right to be.
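For readers who haven't used these two entry points, here is a short sketch of how UnsafeBufferPointer shows up in their signatures. The function names sumViaContiguousStorage and makeSquares are invented for the example; the standard library calls themselves are the existing ones.

```swift
// Reading: the closure receives an UnsafeBufferPointer over the
// collection's contiguous storage, if it has any.
func sumViaContiguousStorage(_ values: [Int]) -> Int? {
  values.withContiguousStorageIfAvailable { buffer in
    buffer.reduce(0, +)
  }
}

// Writing: the closure receives an UnsafeMutableBufferPointer over
// uninitialized memory and reports how many elements it initialized.
func makeSquares(count: Int) -> [Int] {
  Array(unsafeUninitializedCapacity: count) { buffer, initializedCount in
    for i in 0 ..< count {
      (buffer.baseAddress! + i).initialize(to: i * i)
    }
    initializedCount = count
  }
}
```

In both cases the unsafe buffer pointer is the interface type that ordinary, high-level code has to touch, which is the creep described above.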

In the foreseeable future, we are planning to introduce a small family of non-escapable Span (née StorageView) constructs that we expect will gradually evict UnsafeBufferPointer into the darkest, deepest layers of data structure implementations and similar low-level contexts, where it belongs. I think it would be somewhat rude to deprive these layers of a standard way of extracting sub-buffers -- but it would not be the end of the world.

The copyable case is not a deprecated afterthought -- it remains (and I expect it will indefinitely remain) the primary programming model in Swift, at least in the higher-level layers.

Therefore, I don't think using the foo[i ..< j] notation for the extracting operation would be a viable option. That spelling is reserved for the existing slicing subscript; in Unsafe[Mutable]BufferPointer, this subscript must continue to return a copyable Slice. We cannot mess with that. Existing code is sacred.

But a labeled subscript like foo[extracting: i..<j] would perhaps be a viable option. However, subscript syntax subtly hints at capabilities that this operation doesn't really provide, which is why I suggested a member function for it instead.

Well, that attribute has a tendency to turn previously working code into a pile of "the compiler is unable to type-check this expression in reasonable time" errors. It is also not a principled solution: it is more of an emergency patch, and it remains underscored for a reason.

But, assuming it would technically work in this case, I suppose the idea is that we'd prefer the existing copyable subscript, and mark the new subscript implementation with this attribute.

What type would this new subscript operation return?

Insisting on preserving the foo[5 ..< 10] syntax would mean that we'd need to implement index sharing in the noncopyable case, too: the first item of the resulting slice would need to be positioned at index 5. (This is not very negotiable: we cannot have the same notation on the same type mean completely different indexing behavior based on whether or not Element is statically known to be copyable. Element may even be a conditionally copyable type, like Optional or Result!)
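The index sharing requirement is easy to see with today's copyable buffers. In this sketch (the function name is just scaffolding), the slice keeps the indices of the original buffer, while the rebased buffer pointer starts over at zero:

```swift
func indexSharing() -> (sliceStart: Int, rebasedStart: Int, sameElement: Bool) {
  let storage = Array(0 ..< 20)
  return storage.withUnsafeBufferPointer { buffer in
    // A Slice shares indices with its base: the first element is at index 5.
    let slice = buffer[5 ..< 10]
    // Rebasing produces a fresh buffer pointer whose indices start at 0.
    let rebased = UnsafeBufferPointer(rebasing: slice)
    return (
      sliceStart: slice.startIndex,
      rebasedStart: rebased.startIndex,
      sameElement: slice[5] == rebased[0]
    )
  }
}
```

A noncopyable variant of foo[5 ..< 10] would have to behave like the slice here, not like the rebased buffer, and no existing type provides that.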

There is no existing type we could use to implement such indexing; consequently, we'd need to introduce a pair of new types specifically dedicated to representing slices of a noncopyable buffer pointer.

We'd also end up having to duplicate all the slice APIs from SE-0370. That's an immense number of new APIs, only to get back to the original, clumsy UnsafeBufferPointer(rebasing: foo[5..<10]) spelling!

I don't think this one operation is worth all that, however important it is.

Beware, unsafelyUnwrapped is not the standard force-unwrapping operation -- this is the obscure, unsafe variant of it that does not check that the optional actually contains a value. (If the value isn't there, it triggers undefined behavior rather than a guaranteed trap.)

The actual (safe) force-unwrapping operation is the special form x!, which is hardwired into the language. As of today, it always consumes the optional, but that is a (hopefully temporary) bug -- it is intended to either consume or borrow the optional, depending on usage context.
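A minimal side-by-side of the two spellings, for a copyable payload. The function name unwrapBothWays is made up for the example, and it must only ever be called with a non-nil value.

```swift
func unwrapBothWays(_ value: Int?) -> (checked: Int, unchecked: Int) {
  // `!` checks for nil and traps if the value is missing, in every build mode.
  let checked = value!
  // `unsafelyUnwrapped` skips that check in optimized builds, so calling it
  // on nil there is undefined behavior. Only use it when nil is impossible.
  let unchecked = value.unsafelyUnwrapped
  return (checked: checked, unchecked: unchecked)
}
```

When the value is present, both produce the same result; they differ only in what happens when it is absent. (As far as I know, debug builds still check unsafelyUnwrapped, so the difference only shows up in optimized builds.)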

There is a possibility that we will be able to define properties that similarly support consuming and borrowing access in the future, thereby allowing us to implement this same adaptive behavior using just the existing unsafelyUnwrapped property.

Accordingly, I have removed the addition of an Optional.unsafeUnwrap() function from the pitch, and I added a passage in the Future Work section about this todo item. Thanks for pressing on this!
