I can't say I love the Rigid
prefix, but putting that aside. Is it expected that Vector
would form the backing inline storage for these types? Wouldn't that necessitate a static func uninitialized() unsafe -> Self
-esk factory function on Vector
?
Is it expected that
Vector
would form the backing inline storage for these types?
No. Rather their inline storage would be built with the same building blocks used for Vector. The fully initialized state is fundamental to Vector
.
Genuinely excited for this proposal to come to Swift. Fully agree with every single point raised in the proposal. It's one of those things that seems so obvious in retrospect, and yet are extremely hard to pull off.
For obvious reasons, we cannot name this type
Swift.Array
to match the
"term of art" that other languages like C, C++, and Rust are using for this
exact type. However, while this name is the de facto for other languages, it
actually mischaracterizes the properties and behaviors of this type considering existing terminology in mathematics.
I like Vector
as a name choice. I opened this thread prepared to say well, this isn't actually a vector... but I ended up being convinced by the arguments in the proposal.
First, this makes "true" mathematical vectors trivial to write: one just needs to define a group operation (typically addition) and scalar multiplication. This can be easily done by extending the type. I think this alone already is a strong basis for naming this data structure Vector
. Other programming languages already use vector as a name for some stuff that is even more detached from the concept of a mathematical vector (C/C++/Rust).
It's true that for certain types it wouldn't be possible to define a group operation and a field over which Vector<N,T>
becomes a true mathematical vector, but I think that's largely irrelevant. When T
is not numeric, the mathematical definition of vector is likely not a consideration at all and developers will understand that Vector<N,T>
is something shaped like the analogous (mathematical) vectors but with none of the properties.
In mathematical usage, a 'vector' is just a thing with certain algebraic properties (namely, addition and scalar multiplication). It has nothing to do with being a fixed-length list of elements.
While vectors of any length can fulfill the mathematical definition of a vector, that's not very useful in a vacuum. Addition and scalar multiplication define a vector space, and this vector space is a set of vectors of a fixed size. One rarely operates with vectors from different vector spaces at the same time.
So Vector<N,T>
would be an element from the vector space defined by vectors of length N
and type T
.
If T
is numerical, then regardless how the operations are defined, all the potential generated vector spaces are isomorphic, and equal to the set of all allowed values of Vector<N,T>
(for a fixed N
). So one could think of Vector<N,T>
as a vector from a vector space for which we don't know the defining operations.
I wonder if this would be great for embedded? I noticed that Array was a heap allocating type so it couldn't be used with -no-allocations. So would Vector then be a non-heap allocating type and maybe live somewhere on the stack or get pulled out of .text?
I'm pretty excited about this pitch too! Like a lot of people here, the general concept is so much needed and welcome.
The name Vector
for the proposed type is very unfortunate and IMO needs to be revisited. I favour names like Buffer
or even FixedSizeArray
. Just because other languages have made the mistake doesn't mean we have to go down that road.
IMO, Swift got the name Array
right for a dynamic container! An array is an ordered container for any type. There are no restrictions on the type as long as the container has the same type of elements (It doesn't have any invariants pertaining to mathematical meaning). As far as I know, the proposed type is a fixed-size array and has nothing to do with the mathematical term vector (more on this below). The proposed type can hold any non-copyable type, so it is not intended to be used only for numeric types.
Indeed, the
std::vector
type goes against the definition of vector by being a
growable container, having a non-fixed magnitude.
We fully acknowledge that the Swift types,Swift.Array
andSwift.Vector
, are
complete opposites of the C++ ones,std::vector
andstd::array
. While it may
be confusing at first, ultimately we feel that our names are more in line with
the mathematical term of art.
If there was any type we could add to the standard library whose name could be
Vector
, it must be this one.
No, this is not the mathematical term. A vector is an object that may be added together and multiply by scalars. It is used extensively in computer graphics and machine learning, to name but a few, and to point out that it is not an unrelated term in programming. Sure, the proposed type can be used as a primitive to support those operations, but that is not what this proposal is about, nor the intention for the general use of this container. The proposed name doesn't help to disambiguate meanings, and this indirectly affects productivity.
I am concerned that naming this type Vector
will start us off on the wrong foot when we try to introduce a LinearAlgebra library into the ecosystem. This is what happened in C++, and it took them so many years to come up with a proposal for this (I hope it doesn't take Swift a decade to get this into the standard library). In fact, I'd like to share the reference for the C++26 proposal of the addition of a Linear Algebra library, which may help to understand that Vector
is just not the right name in the long term:
many C++ programmers incorrectly use the term vector as a synonym for
âdynamically re-sizable array.
â This bad habit is reinforced by the unfortunate naming of
std::vector.
No, this is not the mathematical term. A vector is an object that may be added together and multiply by scalars. It is used extensively in computer graphics and machine learning, to name but a few, and to point out that it is not an unrelated term in programming. Sure, the proposed type can be used as a primitive to support those operations, but that is not what this proposal is about, nor the intention for the general use of this container. The proposed name doesn't help to disambiguate meanings, and this indirectly affects productivity.
I'm surprised by the machine learning example as I'd have argued that the term "vector" in machine learning is actually more aligned with this proposal's definition of vector than with the actual mathematical term. A vector does not only require addition and scalar multiplication (technically a group operation and a field), but also that the results of those operations remain in the same vector space. Yet there are multiple examples in machine learning where this is not true.
For example the input of a machine learning model, sometimes called the feature vector, does not fulfill the mathematical definition of a vector, because the feature space is not a vector space. Adding two feature vectors together (or multiplying one of them by an scalar) does not necessarily result in a valid feature vector (the result may not be contained in the feature space), which would be a must have requirement to properly consider it a vector. So it's more like a box of numbers that supports element-wise addition and multiplication.
Linear algebra studies vector spaces as you have defined them, and there is basic linear algebra in ML, but of course it is not just linear algebra. It's certainly more than that. Another common misconception is to confuse vectors with "points". In some scenarios, there exist a mapping that can take those "points" into a vector space (think of a plane that does not cross the origin). AFAIK this happens in so-called "Kernel Methods", and the feature space is not just a vector space, it has even an inner product.
I do not want to hijack the thread as this is not a mathematical discussion, so I will just leave it with this comment. Besides, I am far from being an expert in machine learning.
I am happy about the prospect of the addition of this new type to the standard library!
Would you consider providing a way to create a
Vector
parallel to theArray(unsafeUnitializedCapacity:initializingWith:)
initializer?
Yes, but we'd like to do it with a safer, non-escapable type such as OutputSpan
(prototyped in swift-foundation .) While this doesn't address the out-of-order initialization need, OutputSpan
and other types in the Span
family will themselves provide unsafe access to Vector
's underlying storage. Hopefully, we won't have to add an withUnsafeBufferPointer()
methods to types such as Vector
that can vend a Span
.
Can the Count
and/or Element
be inferred?
let cArray: Vector<2, _> = [x, y]
Should string literals also be supported?
let cString: Vector = "\r\n"
// cString.count == 3
// cString[0] == UInt8(ascii: "\r")
// cString[1] == UInt8(ascii: "\n")
// cString[2] == UInt8(ascii: "\0")
Can the
Count
and/orElement
be inferred?let cArray: Vector<2, _> = [x, y]
Yes! Both of these are able to be inferred either by using the placeholder _
or by omitting the generics entirely, Vector
.
Should string literals also be supported?
let cString: Vector = "\r\n" // cString.count == 3 // cString[0] == UInt8(ascii: "\r") // cString[1] == UInt8(ascii: "\n") // cString[2] == UInt8(ascii: "\0")
Maybe with some more compiler hackery it could, but generally speaking Vector
could not conform to ExpressibleByStringLiteral
because the supplied string instance may have more or less bytes/scalars/characters than the initialized Vector
's Count
.
Would it be appropriate for the Swift compiler to implicitly convert from
Vector
toUnsafe{Mutable}Pointer<Element>
, but only when calling into C? I think the alternative â using existing APIs â iswithUnsafe{Mutable}Bytes(of:_:)
andwithMemoryRebound(to:_:)
for each vector argument.
It could, but if I remember correctly @glessard has been looking into reducing this implicit conversion. I think we want to go into a direction of being more explicit around these APIs because it's very easy to see a vector being passed a regular parameter but it's actually being converted to a pointer and passed to a C function which may or may not hold onto that pointer longer than it should.
Should vectors have explicit conversions to and from tuples? There might be users that need vectors and tuples, when migrating to the upcoming feature (without breaking ABI or source compatibility).
Yes, this should be possible sometime in the future. It would be pretty trivial with a couple more language features constraining parameter pack types:
extension Vector where Element: ~Copyable {
public init<each T>(
_ tuple: (repeat each T)
) where count(each T) == Count, T == Element {
...
}
}
Going to a tuple seems a bit more trickier
If the standard library were to define a type named
Vector
, then that would cause a source break in any module that uses another moduleâsVector
type.
This is untrue! I described in the source compatibility section that the standard library types are already shadowed from user modules. Any modules currently defining a name Vector
will continue to work as normal as well as clients importing said modules!
I think we should also consider having a heap-allocated fixed-size array type as well.
Yes! However, we'll save that for a potential future proposal
Surely, both
swapAt
s needs to be a mutating method, not a borrowing one.
Oops, yes you are correct. I will correct the proposal.
I'm confused about how the
consume
method is meant to work. The documentation says it "will call the closureCount
times", but only returns oneResult
, which doesn't make any sense.
You are correct, this doesn't make any sense. I will remove the result value.
The proposal includes a
swapAt
function that updates an element and returns the old one. This method does not exist onArray
and instead we have a globalswap
function to do it. Is this best thought of as a "swap" operation? I hate to be bringing up new naming concerns, but since it doesn't exchange the contents of two objects, I was wondering if there are alternatives considered.replace(at:with:)
? There has to be more.
The
swapAt(_:with:)
variant that takes one index doesn't do anything that the globalswap
andexchange
functions can't do, and doesn't need to exist.
Right. I included the swapAt
that takes two indices because that appears on MutableCollection
and thought there should be an exchange
equivalent, but if we can not include that and just go through the global exchange
that'd be even better.
I didn't see it mentioned explicitly, but can you confirm whether global/static data will then also be represented as constant data in the binary if the values themselves are literals and that no other initialization cost would occur when accessing it?
Yes, that is a major motivation for this type as well! I expect we will need to teach the compiler to lower these things into constant memory, but the result should be the same as if you were using a tuple. There wouldn't be any initialization necessary unless the element type isn't static/constant.
What's the feature flag this is behind? and is a nightly build the correct thing to try it with?
There's currently no feature flag yet for this as this hasn't landed yet (at least experimentally).
As far as I can tell, this proposal never even mentions copy-on-write, let alone discusses it as a difference between
Array
andVector
. That's not great because it's an enormous difference with major performance impact when you useVector
with a copyable type.
Yeah, this is a pretty big performance difference, but the behavior of the types should feel the same to Swift developers because they are both value types. Mutating an Array
only mutates your copy of it, mutating a Vector
will only mutate your unique instance/copy. It is probably worth calling out in the proposal however about the performance differences however because you're right, copying a vector and assigning it to a new variable, etc. will incur an immediate full copy.
I'd like to see a use site example for each API you propose. It doesn't have to be in the same placeâin fact, it might be best to put all of these examples in "Proposed solution" and leave the definitions until "Detailed design"âbut if we're to stay true to our ABI guidelines, we need to be able to evaluate clarity at the point of use.
Fair! I will update the proposal with usages of all the proposed API on it (excluding the generalized Collection APIs because those should feel very familiar).
This is pretty weirdâweird enough to make me wonder whether we should support the array literal syntax for this type. Are there other viable options, like overloading the N-tuple syntax or using a parameter-pack-taking
Vector.init(_:)
?
I feel it'd be weirder to not support the array literal syntax for this type no? I mentioned in a previous comment that we could write a parameter pack taking vector init, but that would require some new constraints on parameter pack generics.
Is there a wayâeven a failable wayâto convert an
Array
to aVector
other than by using the closure-based initializer?
There could be a Sequence
taking initializer that throws an error when the count of the sequence is less than the count of the vector. What do we do with the leftover elements? I assume we just ignore those and initialize the first Count
number of elements within a sequence to the vector?
I'm imagining a call site for this initializer and it seemsâŠstrange.
let evens = Vector<8, Int> { i in i * 2 }
Would it be possible for this to have a "size-pinning parameter" in the same way that many type conversion APIs have a "type-pinning parameter"? If it is possible, do we think that would be a good idea?
// Feels a little more comfy if we can pull it off let evens = Vector(count: 8) { i in i * 2 }
Not currently with integer generics, no. The generic parameter Count
must either be an integer literal, inferred, or some other integer generic. This functionality may included when we get existential integer generics however
ErâŠ
body
is calledCount
times, right? So which of theCount
Result
s is returned by this method?
Yep, this result will be removed, thanks!
Experience has shown that until we provide these conformances, people will conform the type retroactively, and that will create a bit of a mess once we release them.
- What features are needed to be able to provide these conformances? Are they close enough that this doesn't really matter?
- What is our plan for types like
Optional
that currently support non-copyability but already conform to these protocols? Could we do the same thing forVector
, or would the engineering complexity be too high?
I'm unsure if we're lacking features, more so just that we need to generalize these protocols. Equatable
and Hashable
should be relatively easy to do. CustomStringConvertible
should also be really easy, but I know @lorentey might've had some thoughts about whether the shape of this protocol is really the right thing for noncopyable types.
I think the current plan is to generalize Optional
to support Equatable
and Hashable
for noncopyable wrapped types as soon as those are generalized as well. We could pretty easily do the same for Vector
I believe.
I think you're confusing yourself by trying to design something that would support both
Array
andVector
.Array
and other variable-sized types are already well-served byExpressibleByArrayLiteral
; the difference between that protocol andExpressibleByVectorLiteral
is precisely that the latter design permits the type to control the size.
Array
is well served with ExpressibleByArrayLiteral
, but there may be other types who do want a variable-sized initialization that doesn't allocate an array but instead uses a stack allocated local to do so. There is a desire to both allow types to control the size of the array literal being passed while also not allocating an array for the variable sized case as well.
copying a vector and assigning it to a new variable, etc. will incur an immediate full copy.
Will incur, or will the compiler try to elide at least clearly unnecessary copies in optimization passes?
I'd imagine this has been considered, but if this is worth calling out, has there been thought to this or a parallel type being unconditionally noncopyable, so users have to explicitly invoke an API in order to copy all or a subset of elements?
Will incur, or will the compiler try to elide at least clearly unnecessary copies in optimization passes?
The latter. The compiler will treat this type as if it were any other copyable type.
I'd imagine this has been considered, but if this is worth calling out, has there been thought to this or a parallel type being unconditionally noncopyable, so users have to explicitly invoke an API in order to copy all or a subset of elements?
We initially wanted this type to be unconditionally noncopyable, but it was brought up that this might be a bit overly ceremonial for vectors of trivial types. It would require enclosing types to be noncopyable and parameters to declare ownership (which is not a bad thing! just maybe a bit too much for a couple of integers) among other things. The desire for an explicit copy is there, but I wonder if instead of a separate type we were to be explicit about all the ways the compiler will implicitly copy your type as well as ways it can be explicitly copied through source
It would require enclosing types to be noncopyable and parameters to declare ownership (which is not a bad thing! just maybe a bit too much for a couple of integers) among other things. The desire for an explicit copy is there, but I wonder if instead of a separate type we were to be explicit about all the ways the compiler will implicitly copy your type as well as ways it can be explicitly copied through source
RIght, part of what I'm wondering is if there are sufficient differences between, say, Vector<1024, URL>
and "a couple of integers" that it's worth making the latter something like SmallVector<2, Int>
; where smallness is in the control of the user and means "small enough that you want to copy it implicitly without the benefit of CoW." (This isn't new of course, even to Swift: we already have "smol"-ness as an internal concept for String
.)
Naively, I feel that a user who is advanced enough to opt into the 'less flexible' Vector
over the currency type Array
will many times actively care about being explicit about whether their aggregate is something that's implicitly copied around or not.
Perhaps âVector
â should be as pitched, and âFixedSizeArray
â should be a COW wrapper around itâŠ
I wonder if this would be great for embedded?
Yes absolutely, one of the factors for pitching this was the exploration we did for embedded. There are many places where a non-heap allocating fixed sized collection is quite useful for building firmwares or kernels etc.
Per your question about where it lives: I believe the behavior will end up being that if the value is global and a let it will be able to be emitted directly to the read-only memory section. When writing boot loaders specifically it has a number of uses that can help avoid writing assembly by hand (specifically the interrupt/exception vector... and that nod of the name as prior art imho lends to the naming chosen quite nicely).
"Vector" is an extremely well-established name for an ordered sequence of precisely n items, all of the same type. There is a particularly strong expectation that a type called "vector" would provide an array-like subscript operation. There is an equally remarkable lack of any expectations of resizability: when I see a type called vector, I would not expect it to come with any
insert
,append
orremove
operations. The name also rightly discourages operations like concatenation.
I agree that these are well-established and true. But to me, the problem arises with any binary operation on this type: There is obviously an equally well-established expectation that vectors conform toâwellâvector arithmetic.
The pitched type would be super useful to represent all kinds of vectors from mathematical and physical domains, such as euclidean vectors, complex numbers, matrices, polynomials, (polar) coordinates, and more.
And for each of these, you'd expect them to be non-growable and indexable by subscript, as you suggest. And, they are all well-known as vectors!
But, crucially, you'd also expect +
to not be concatenation, but the appropriate vector space addition operation.
I realize that there are also a huge number of people who do expect the type described in this pitch to support concatenation, but the fact that these expectations are as incompatible are they are polarizing, suggests that the name probably isn't that great after all.
Do we need a fixed-size buffer type? Yes, I think so, especially for C interop. Should they be named Vector
? Sorry, but that seems to me to be the wrong choice:
- There is existing usage of the term to mean resizable buffers in other high-profile related languages; this existing usage is sufficient to cause confusion for developers moving to Swift from those languages.
- There is existing usage of the term for types that represent mathematical vectors with vector arithmetic support (
vec3
et al.) and these types have widespread adoption.
FixedArray
, FixedSizeArray
, SafeBuffer
, or even just Buffer
might be better names for this type.
(I don't think anything I've said here is novel and other comments have covered all of itâjust adding my two cents.)
It sounds like what you are actually defining is a tuple, which is a superset of a vector.
The intended use cases might be to use this as a vector, but it could also be used as a quaternion, or a matrix, or even some other multi-dimensional list of things which are not numbers (which makes this definitely not a true vector).
From a programmer's point of view, I might be thinking of a Tuple<4> as a quaternion, while at the same time I have a Tuple<4> which refers to a 3d homogenous coordinate, and have yet another Tuple<4> which defines the coordinates of an axis aligned bounding box. Only the 3d coordinate is conceivably vector in the technical sense, providing direction and magnitude.
Perhaps it would be sensible to extend existing Tuples where all of the elements are of the same type, or provide a new homogenous tuple which provides the proposed capabilities. The fact that C interop already supports tuples hints that what we need is an extended tuple instead of a more limited array?