There hasn’t been a vision document about the container protocols. There is a draft in the “future” branch of the swift-collections repository.
I prefer storage to elements, because storage is more specific about what you're getting: a handle to the underlying storage. The less-specific elements could just mean "give me a collection of all of the elements".
I love this proposal as-is
Doug
Regarding the adoption on Data: the type originally had a typed memory binding to UInt8, since its provenance was that of NSData. However, since the evolution of Swift continued after the introduction of Data, that behavior was later changed to favor an untyped bag of bytes (not bytes bound to UInt8 per se). This means that we ideally should only offer the raw accessors to promote Data's unbound nature.
So +1 for the raw var bytes: RawSpan { get } but -1 for the var storage: Span<UInt8>, if I understand the difference correctly.
Makes sense. I will make that change.
I presume that if we are incorrect in that case we could offer it later with little effort. Right?
Yeah. Even if we don't, it will remain really easy to go to a Span<UInt8> from the bytes, by virtue of UInt8 being BitwiseCopyable. It would be one of the types that would be BitwiseLoadable if we make that layout constraint happen.
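To make the "easy to go from bytes to UInt8" point concrete, here is a minimal sketch that uses today's unsafe buffer types as stand-ins. It assumes UnsafeRawBufferPointer plays the role of RawSpan; it is not the proposed Span API, and the helper name sumOfBytes is hypothetical.

```swift
// Sketch only: UnsafeRawBufferPointer stands in for RawSpan.
// `sumOfBytes` is a hypothetical helper, not part of any proposal.
func sumOfBytes(_ raw: UnsafeRawBufferPointer) -> Int {
    var total = 0
    for offset in 0..<raw.count {
        // UInt8 is bitwise-copyable with alignment 1, so it can be loaded
        // from any byte offset of untyped memory without rebinding it.
        total += Int(raw.load(fromByteOffset: offset, as: UInt8.self))
    }
    return total
}

let payload: [UInt8] = [1, 2, 3, 4]
let total = payload.withUnsafeBytes { sumOfBytes($0) }
print(total)
```

The raw view never needs a typed binding here; each byte is read as a UInt8 on demand, which is why offering only the raw accessor loses little.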
Perhaps some combination of the words? storedElements?
I think we should take a step back and think about what the protocols look like and what semantics they have, as @lukasa suggested. Even if we don't introduce them just yet because of current compiler limitations, it is still useful to do the thought exercise and see if the semantics and names are a good fit.
I think this is why it shouldn't be called storage: you don't get a handle to the full underlying storage, but instead a potential subset of the underlying storage.
This is the case for Array, as it only gives you access to its initialized storage.
Data is a potential slice of a larger Data and would only give access to a subsequence of the full storage. In fact, this is the case for most SubSequences, e.g. ArraySlice and the various types wrapped in Slice<>.
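A small sketch of the "subset of the storage" point, using only existing standard library API (the variable names are mine, for illustration): an Array can have spare capacity that is never exposed, and an ArraySlice exposes only its own window.

```swift
var numbers = [10, 20, 30]
numbers.reserveCapacity(16)

// The allocation has room for at least 16 elements, but only the
// 3 initialized elements are visible through the buffer view.
let exposedByArray = numbers.withUnsafeBufferPointer { $0.count }

// A slice shares the array's storage but exposes only its own range.
let tail = numbers[1...]
let exposedBySlice = tail.withUnsafeBufferPointer { $0.count }

print(numbers.capacity >= 16, exposedByArray, exposedBySlice)
```

In both cases the view is narrower than the allocation behind it, which is the mismatch with the name storage.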
The term storage is often used for an implementation detail that is usually not exposed, and it is not a 1:1 mapping to what is proposed here. I think elements is more fitting and better than storage, but I don't love elements either just yet.
Can we not just call this .span?
Yeah, but that’s just repeating the type signature; it doesn’t say what it is, just how it’s presented.
It says exactly what you're getting from the API. It may be repeating the type signature, but at the use site you're rarely going to be typing Span out explicitly anyway:
let s = arr.span
foo(arr.span)
bar(s)
print(s[0])
// And in the not so distant future...
let ms = arr.mutableSpan
ms[0] = 123
ms.shuffle()
This is an Objective-C-style design (NSMutableString, NSMutableArray...).
What is the reason to do it this way when Swift has mutating methods?
This is simply about the law of exclusivity. Span borrows read-only memory and, as such, supports multiple simultaneous read-only accesses. A "mutable borrow" is a writeable access that must be exclusive. The exclusivity requirement can be modeled nicely in the type system with non-copyability. From this, it follows that a mutable borrow must be modeled with its own type, MutableSpan.
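As a rough illustration of modeling exclusivity with non-copyability, here is a sketch with two hypothetical stand-in types, ReadView and WriteView, built on unsafe pointers. They are assumptions for illustration and are not the proposed Span or MutableSpan; the sketch assumes a Swift 5.9 or later toolchain for ~Copyable.

```swift
// Hypothetical stand-ins; not the proposed Span / MutableSpan.
struct ReadView {
    let base: UnsafeBufferPointer<Int>
    func value(at index: Int) -> Int { base[index] }
}

// Non-copyable: at most one live handle can exist, mirroring the
// exclusivity a mutable borrow needs.
struct WriteView: ~Copyable {
    let base: UnsafeMutableBufferPointer<Int>
    mutating func set(_ index: Int, to newValue: Int) { base[index] = newValue }
}

var values = [1, 2, 3]

values.withUnsafeMutableBufferPointer { buffer in
    var writer = WriteView(base: buffer)
    writer.set(0, to: 123)
    // let other = writer   // would move `writer`, never duplicate it
}

let pair = values.withUnsafeBufferPointer { buffer -> (Int, Int) in
    let first = ReadView(base: buffer)
    let second = first      // read-only views can be copied freely
    return (first.value(at: 0), second.value(at: 2))
}
print(pair)
```

The read-only view is an ordinary copyable struct, while the writable one can only be moved, so the two capabilities end up as two distinct types.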
I’m not weighing in with an opinion here, but I want to point out that “can be modeled” does not imply “must be modeled”.
One could imagine a world where the compiler recognizes the difference between var x: Span and let x: Span.
I didn't say "must be modeled", but it definitely "can be modeled". The current compiler cannot do it another way as far as I can tell. The world where a non-owning reference type imposes different borrowing requirements depending on whether its binding is let or var is neither current nor near-future Swift.
As no doubt you know, we have done some reckoning about what it "means" to use let versus var in the context of Swift atomics, so there is some precedent for fiddling with these.
Since we've already contemplated concessions to mark a type as "must-never-be-var" in that context, extending that design so that a paired type can be marked as "must-never-be-let", and having those known to the compiler as duals of each other, might not be as far-fetched as it seems at first blush.
...if it is wise to do so.
This is just not how these types work. Span and MutableSpan are reference types; they are not value types in the general sense (yes, they themselves are values, but they are closer to UnsafeBufferPointer and UnsafeMutableBufferPointer than anything else). A type vending a var span: Span { get } getter is providing read-only access to some contiguous piece of memory which may or may not be mutable. Consider a type which gives you a span over some constant memory in the binary; this memory is not mutable whatsoever, and if we treated these things like value types (i.e. provided mutable accessors on Span) then it would be fundamentally incorrect:
// s points to some __TEXT,__const memory
// e.g. [0x0, 0x1, 0x2]
var s = myType.span
s[0] = 123 // NOT OK
Another reason why we need two separate types is that getting a Span from some type is a read-only access, i.e. it requires a regular { get }. If Span provided mutable accessors then this would require a { mutating get } (because we need to signal to the compiler that mutations on MutableSpan will directly mutate whatever type/container/parent vended you the MutableSpan), which is obviously not always available if you don't have a mutable reference to the type vending you the span:
func something(
    with arr: borrowing SomeNoncopyableArray<Atomic<Int>>
) {
    var s = arr.span
    s[0] = Atomic(123) // NOT OK

    // what we really want:
    let ms = arr.mutableSpan // error: cannot mutate 'arr'
    var s2 = arr.span
    s2[0] = Atomic(123) // error: Span.subscript is not mutable
}
Here we have a borrowing read access of some hypothetical noncopyable array containing some atomic integers. We do not have exclusive access to this array and thus we cannot mutate it; otherwise we would run into undefined behavior. Therefore, we need to distinguish Span from MutableSpan, because we can provide read-only getters (get) that don't require exclusive access (i.e. we don't need a var reference or an inout reference to some type to access some var span: Span { get }), which prevents mutation when it is either 1. unwelcome or 2. disallowed completely.
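The difference between a plain { get } and a { mutating get } can be seen with ordinary Swift today. This is a minimal sketch with a hypothetical Box type (not part of the proposal): the plain getter works through an immutable binding, while the mutating getter demands exclusive access.

```swift
// `Box` is a hypothetical type used only to contrast the two accessors.
struct Box {
    var items = [1, 2, 3]

    // Plain getter: usable through `let` bindings and immutable parameters.
    var firstReadOnly: Int { items[0] }

    // Mutating getter: the caller must hold exclusive (mutable) access.
    var firstExclusive: Int {
        mutating get { items[0] }
    }
}

func inspect(_ box: Box) -> Int {
    // box.firstExclusive   // error: cannot use mutating getter on immutable value
    return box.firstReadOnly
}

var owned = Box()
let viaParameter = inspect(owned)
let viaVar = owned.firstExclusive // allowed only because `owned` is a `var`
print(viaParameter, viaVar)
```

A span accessor that needed the second form would be unusable from any read-only context, which is the case the borrowing example above is describing.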
Yeah, this is a known performance problem with UnicodeScalar.UTF8View: it encodes the entire scalar again each time you read a byte. Unfortunately it's also @frozen and entirely @inlinable, so we can't easily change that. However, it's trivial to implement a span view by just encoding the scalar once into a fixed-width integer/stack buffer and yielding a span over it.
It's still a constant-time operation, and doesn't allocate any heap memory, so I don't see any problem at all if we were to encode in the .span accessor; there are only huge benefits for users of the type.
I believe the idea is that eventually these spans will form the backbone of a replacement for the Collection protocol hierarchy. Given that, I think we should take this opportunity to address the flaws in these Collection conformances.
The UnicodeScalar.UTF16View is slightly less problematic as the encoding is simpler, but I think it should also get a .span, implemented in the same way as for the UTF8View.
EDIT: Godbolt comparison of a for loop over UTF8View vs. eagerly encoding using withUTF8CodeUnits. The latter generates significantly less code. This is what I'm suggesting the .span view provide.
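For readers without the Godbolt link handy, the two approaches being compared look roughly like this. It is a sketch of the call sites only and says nothing about the generated code; the variable names are mine.

```swift
let scalar: Unicode.Scalar = "\u{E9}" // U+00E9, two UTF-8 code units

// Per-byte access through the UTF8View collection.
var lazyBytes: [UInt8] = []
for byte in scalar.utf8 {
    lazyBytes.append(byte)
}

// Encode once into a temporary buffer, then read from that buffer.
let eagerBytes = scalar.withUTF8CodeUnits { buffer in
    Array(buffer)
}

print(lazyBytes == eagerBytes)
```

Both produce the same bytes; the suggestion is that a .span view would behave like the second form, encoding once and exposing the result.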
Adding a storage property to UnicodeScalar.UTF8View would require the storage property to be a coroutine. We would prefer these accessors to be borrowing, and not allow them to allocate temporary storage.