Vector, a fixed-size array

Maybe in the broadest sense, but when talking about mathematical vectors people usually mean something more along the lines of the Swift Standard Library's SIMD Vector Types, which I would have personally liked to have been named something like Vector<Width, Scalar> instead of SIMD2/3/4/8/16/32/64<Scalar>.

https://developer.apple.com/documentation/swift/simd-vector-types

I'm not a member of the core team, but we already have these SIMD Vector Types in the Swift Standard Library, though no matrices or quaternions as of yet.

https://github.com/swiftlang/swift-evolution/blob/main/proposals/0229-simd.md

https://github.com/swiftlang/swift-evolution/blob/main/proposals/0251-simd-additions.md

A vector itself is just a piece of information, formatted in a certain way. Operations on vectors are not part of the vector itself. So a vector space is a set (of vectors) along with some operations (addition, scalar multiplication etc).

So it makes total mathematical sense to define a generic vector type that can’t always be used in a vector space. For example, one might conditionally make it Addable when its scalar type is.
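
To make the conditional-conformance idea concrete, here is a minimal sketch. `Triple` is a hypothetical three-element stand-in for a fixed-size vector (not a standard library type), and the standard `AdditiveArithmetic` protocol plays the role of "Addable"; both choices are assumptions for illustration only.

```swift
// Hypothetical stand-in for a fixed-size vector; `Triple` is not a standard library type.
struct Triple<Scalar> {
    var x: Scalar
    var y: Scalar
    var z: Scalar
}

// Equality is available only when the scalar supports it.
extension Triple: Equatable where Scalar: Equatable {}

// Addition is available only when the scalar supports it.
extension Triple: AdditiveArithmetic where Scalar: AdditiveArithmetic {
    static var zero: Triple { Triple(x: .zero, y: .zero, z: .zero) }

    static func + (a: Triple, b: Triple) -> Triple {
        Triple(x: a.x + b.x, y: a.y + b.y, z: a.z + b.z)
    }

    static func - (a: Triple, b: Triple) -> Triple {
        Triple(x: a.x - b.x, y: a.y - b.y, z: a.z - b.z)
    }
}

// Int is AdditiveArithmetic, so `+` is available here.
let sum = Triple(x: 1, y: 2, z: 3) + Triple(x: 10, y: 20, z: 30)

// String is not AdditiveArithmetic, so this is plain storage with no `+`.
let labels = Triple(x: "a", y: "b", z: "c")
```

The same type works as pure data for any element and only picks up arithmetic when the element type allows it.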

So I can’t see any mathematical objections at all. As to how other languages chose to name it in the 70s, I don’t think that should be taken into consideration at all. I am sure that C++ developers will manage, they are smart people!

C++ itself doesn’t date back to the 70s, and the STL is even younger. And as @Andropov pointed out, Rust uses Vec for its growable container too.

The mathematical arguments have already been thoroughly debated. We don’t need to rehash them.

OK, but still, using vec or vector for something that can change length is very dubious, and I don’t think we should perpetuate that mistake. The whole idea with Swift was to have a fresh start, wasn’t it?

I don't think anyone has argued in favour of that...

Nobody is suggesting to rename Array (Swift’s growable container) to Vec. Swift does, however, have native interop with C++, which uses std::vector frequently at API boundaries.

Chiming in on this, I've been experimentally adopting «this new type» in an embedded Swift code base to avoid Arrays. My conclusion is that the name probably won't matter much, because every single one of these has been wrapped in another struct, e.g.:

struct StrongType: ... {
  typealias Storage = Vector<54, UInt8>
  var storage: Storage

  ...
}

I am almost exclusively referring to StrongType or StrongType.Storage and only generic code is operating on Vector.

In this example I think both Vector and Inline are totally fine names.

I think the fact that some API will need to traffic in Vector itself makes it a better name. Specifically I think func x(_: Vector<...>) -> Vector<...> reads better than func x(_: Inline<...>) -> Inline<...>.

Sorry, I meant indirectly, by not naming this clear vector type "vector" on the grounds that some other languages misuse that name.

My counterpoint is that it shouldn’t read better because you probably shouldn’t be doing this without a very good reason. Returning a Vector<N, T> almost certainly involves spilling nearly N instances of T to the stack. If your program or library cares enough to adopt Vector, it will almost always be a better idea for such a function to take an uninitialized UnsafeMutableBufferPointer<T> (or Span<T> in the future?) and avoid the temporary copy to the stack.
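
As a rough illustration of the "fill a caller-provided buffer" shape, here is a minimal sketch. `fillSquares(into:)` is a hypothetical function, and `Array` is used only as a convenient owner of uninitialized storage via its real `init(unsafeUninitializedCapacity:initializingWith:)` initializer; it is not meant as a recommendation for what a `Vector`-based API should look like.

```swift
// Hypothetical producer: writes `buffer.count` squares into caller-provided,
// uninitialized memory instead of returning a fixed-size value.
func fillSquares(into buffer: UnsafeMutableBufferPointer<Int>) {
    guard let base = buffer.baseAddress else { return }
    for i in 0..<buffer.count {
        (base + i).initialize(to: i * i)
    }
}

// The closure receives the array's uninitialized storage; the producer writes
// straight into it, so no temporary result value is built and copied.
let squares = [Int](unsafeUninitializedCapacity: 8) { buffer, initializedCount in
    fillSquares(into: buffer)
    initializedCount = buffer.count
}
```

The caller decides where the storage lives, and the callee never materializes its result as a separate value.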

SIMD arithmetic has all sorts of operations that do not make sense on general vector spaces (e.g. elementwise multiplication, or shifts, or division, or horizontal reductions, or saturating rounding doubling multiply-add). We very deliberately did not name the SIMD types Vector for exactly this reason.

(If we had had integer generic parameters when we were designing SIMD, I would have used them, so we would have SIMD<2, Int32>, but we would very definitely not have called them Vector.)
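
For readers less familiar with the SIMD types, a few of the operations mentioned above look like this with the existing standard library API. The variable names are just for illustration.

```swift
let a = SIMD4<Float>(1, 2, 3, 4)
let b = SIMD4<Float>(10, 20, 30, 40)

// Elementwise multiplication and division: not vector-space operations.
let product = a * b
let quotient = b / a

// Horizontal reduction: adds the four lanes together.
let total = a.sum()

// Elementwise shift on an integer vector.
let i = SIMD4<Int32>(1, 2, 3, 4)
let shifted = i &<< SIMD4<Int32>(repeating: 1)
```

None of these has a meaning in an abstract vector space, which only guarantees addition and multiplication by a scalar.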

Vectors would be modeled by a VectorSpace protocol, rather than a concrete or generic type, if we wanted to make them "conceptually correct" in the stdlib. A vector space is a set together with an addition and multiplication that satisfy certain axioms, not a specific representation, something like:

protocol VectorSpace {
  associatedtype Scalar: ScalarField
  static var zero: Self { get }
  static func +(a: Self, b: Self) -> Self
  static func *(a: Scalar, b: Self) -> Self
  // everything else can be defaulted in terms of the above
}

As such, I do not believe that any of the names proposed here create a problem for progress in such a direction.
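
To show how such a protocol could sit alongside any storage type, here is a minimal compilable sketch. It assumes `FloatingPoint` as a stand-in for the hypothetical `ScalarField` constraint, and `Vec2` and `lerp` are illustrative names, not proposed API.

```swift
// Simplified variant of the protocol above. `FloatingPoint` stands in for the
// hypothetical `ScalarField` constraint, which is not a standard library protocol.
protocol VectorSpace {
    associatedtype Scalar: FloatingPoint
    static var zero: Self { get }
    static func + (a: Self, b: Self) -> Self
    static func * (a: Scalar, b: Self) -> Self
}

// Operations defaulted in terms of the requirements.
extension VectorSpace {
    static prefix func - (a: Self) -> Self {
        let minusOne: Scalar = -1
        return minusOne * a
    }
    static func - (a: Self, b: Self) -> Self { a + (-b) }
}

// Hypothetical concrete model: a pair of Doubles.
struct Vec2: VectorSpace, Equatable {
    var x: Double
    var y: Double
    static var zero: Vec2 { Vec2(x: 0, y: 0) }
    static func + (a: Vec2, b: Vec2) -> Vec2 { Vec2(x: a.x + b.x, y: a.y + b.y) }
    static func * (a: Double, b: Vec2) -> Vec2 { Vec2(x: a * b.x, y: a * b.y) }
}

// Generic code needs only the protocol, not any particular storage type.
func lerp<V: VectorSpace>(_ a: V, _ b: V, _ t: V.Scalar) -> V {
    a + t * (b - a)
}

let midpoint = lerp(Vec2(x: 0, y: 0), Vec2(x: 2, y: 4), 0.5)
```

Nothing here depends on what the fixed-size array type is called; a conforming type could store its components however it likes.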

Ok, well thank you for addressing my question. It sounds like you're mostly certain as opposed to entirely certain that the name Vector wouldn't be needed in such code. To me, the fact that it does sound like it is a real possibility that we will eventually model such things in the standard library makes me continue to feel uneasy about using Vector here, but now that I know that my concern has been registered I feel at least a little better since I do trust you to be prudent about these things. I'll rest my case :pray:

Although you did get quite close to doing so:

Old name            New name
Vector2<Scalar>     SIMD2<Scalar>
Vector3<Scalar>     SIMD3<Scalar>
Vector4<Scalar>     SIMD4<Scalar>
Vector8<Scalar>     SIMD8<Scalar>
Vector16<Scalar>    SIMD16<Scalar>
Vector32<Scalar>    SIMD32<Scalar>
Vector64<Scalar>    SIMD64<Scalar>

There has been an important concern raised about Vector's conditional Sequence and Collection conformances: its iterator and subsequence types will need to make a full copy of the vector every time they are created:

let items: Vector = [1, 2, 3, ..., 1_000_000]

func foo() {
  for item in items { // Implicit copy of a million integers
    ...
  }
}

func bar() {
  var slice = items[...] // copy of a million integers
  while let value = slice.first {
    ...
    slice = slice.dropFirst() // copy of a million integers
  }
}

For large vectors, this has the potential to become a major performance footgun.
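
To see where the copy comes from, here is a small sketch using a hypothetical four-element type with inline storage. `Fixed4` is an illustrative stand-in, not the proposed `Vector`; the point is only that an iterator for a value type with inline storage has nothing to reference, so it ends up holding its own copy of the elements.

```swift
// Hypothetical stand-in for a fixed-size, inline-storage container.
struct Fixed4<Element> {
    var storage: (Element, Element, Element, Element)

    subscript(index: Int) -> Element {
        switch index {
        case 0: return storage.0
        case 1: return storage.1
        case 2: return storage.2
        case 3: return storage.3
        default: preconditionFailure("index out of range")
        }
    }
}

extension Fixed4: Sequence {
    struct Iterator: IteratorProtocol {
        var base: Fixed4   // a full copy of all inline elements
        var index = 0

        mutating func next() -> Element? {
            guard index < 4 else { return nil }
            defer { index += 1 }
            return base[index]
        }
    }

    // Creating the iterator copies the whole container into it.
    func makeIterator() -> Iterator { Iterator(base: self) }
}

let values = Fixed4(storage: (1, 2, 3, 4))
let collected = Array(values)

// The iterator is at least as large as the container it walks.
let containerSize = MemoryLayout<Fixed4<Int>>.size
let iteratorSize = MemoryLayout<Fixed4<Int>.Iterator>.size
```

With four elements this is harmless; with a million, every `for`-`in` loop pays for a million-element copy before the first iteration.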

Collection requires its slicing subscript to have O(1) complexity; Vector's implementation does technically satisfy this -- it copies a constant number of items! -- however, this observation merely highlights the limits of big-O notation; it will provide precious little solace to folks who have loops like the one in bar above.

If large vectors are going to be pervasive, then the only remedy I can see is to remove Vector's conditional Sequence and Collection conformances, losing our ability to pass vectors to functions generic over these protocols, and the ability to use classic for-in loops to iterate over them.

Borrowable container protocols will eventually give us borrowing for-in loops, but we aren't ready to propose those at this time.

I'm a bit disappointed and frustrated by this, as I expected Vector to become the ideal type for storing bulk data in static storage, like the static let in the example above -- but its copyability makes that use case a clear performance trap. The only way out would be to relegate Vector to the tiny cases, giving up on its use for bulk storage.

I kind of assumed that Span was going to be the recommended way to pass references to a Vector, but it’s been difficult to understand that since the Span and Vector proposals were split.

This is a bit of a knee-jerk reaction, but perhaps this is evidence that a type like Vector is not the right solution to the kinds of problems it’s trying to solve? Perhaps what we really want is language support for tail allocations and a way to vend Spans of those.

Alternatively, once we have borrow variables, you should be able to pass a borrowed vector around. But you'd still have to be vigilant; it would be easy to accidentally make a copy.

let UnicodeDB: Vector<18_549, UnicodeData> = [...]

func useDB(scalar: Unicode.Scalar) {

  let offset = lookupOffset(scalar)
  let db     = UnicodeDB // oops, makes a copy!
  let data   = db[offset]

  let db     = borrow UnicodeDB // ok - no copy :)
  let data   = db[offset]
}

func dbUtility<let N: Int, Data>(_ db: borrowing Vector<N, Data>) {
  // Can't accidentally copy db

  let db2 = db // Error - implicit copies disabled, use the 'copy' operator.
}

Strictly speaking, per @Joe_Groff, “borrowing is orthogonal to whether the value is physically copied or passed by reference at the machine calling convention level.” Which would be bad for a Vector<n, Lock>! We might also need extension Vector: RawLayout where Element : RawLayout.

As the comment you linked said:

Values of types that either do have a significant address (such as C++ types, or weak references), or are larger than a given threshold (four pointers's size), or are of unknown size (such as unspecialized generics) are passed by address.

which should cover both huge vectors and vectors containing address-significant types like Lock.

How does the compiler know that a Vector<n, SomeRawLayoutType> is itself @_rawLayout?

The baseline assumption for a generic type we don't know anything more specific about is that it is potentially raw-layout, and so passed by address.