[Second review] SE-0453: Vector, a fixed size array

I do not favor Vector for a number of reasons.

"Array" and "Vector" are swapped in many languages we care about

This has been discussed to death in the context of C++ interop, so I won't belabor it too much. I will simply add that it's not only C++ that has this problem—it's also Rust (which, I think it's fair to say, is a language we're jockeying for space with) and Java (which also has an active interop project—though with the caveat that, like many of Java's early collection classes, Vector is underused because of its synchronization overhead).

While there have certainly been languages that used Vector for a general-purpose (i.e. non-SIMD) fixed-size array type, I don't think any of them are particularly relevant to Swift. All the ones which are most relevant either don't use the name "vector" for a general-purpose array type, or use it for a variable-sized one.

"Vector" sounds like geometry to most people

Much has been made of the similarities between the Vector type and mathematical vectors, but that cuts both ways. In the math education systems I'm familiar with, vectors are first introduced in the form of Euclidean vectors when studying geometry, and only during the study of linear algebra are they generalized into something resembling the Vector type being proposed here. To anyone who has studied the first concept but not the second, the name Vector is not helpful—it is actively confusing.

Now, most people who have a CS, math, or other STEM degree will probably study linear algebra, and thus at least be exposed to this more general concept of a vector. But most people period study the kind of basic geometry that involves Euclidean vectors. In the US, 92% of high school graduates have completed a geometry class. Given that only 61% of US high school graduates even enroll in a college—let alone complete enough high-level math courses to get to linear algebra—it's likely that most people will only have been exposed to the geometric version of a vector.

(And even those who have been exposed to the concept have not necessarily internalized it. I have a CS degree and studied some linear algebra in the process of getting it, but a decade of UI programming with Apple frameworks hammered in the geometric meaning of "vector" enough that the use of Vector proposed here seems distinctly odd. In conversation with my father—a retired programmer with a CS degree and a career spent in database-heavy business programming—he also said that "vector" felt like a geometric term to him; he was familiar with languages that used it for an array type but felt it confusing there.)

I'm not comfortable privileging highly-educated specialists over generalists in the naming of a currency type. We must be careful to consider not only the needs of stereotypical tech company engineers and STEM students, but also those of secondary students, non-STEM tertiary students, adult learners, engineers with nontraditional backgrounds, etc. Swift aspires to be an accessible language—let's pick a name that's more broadly understandable.

The "Array means resizing, therefore this isn't an array" argument begs the question

I've seen statements like this a few times:

These make it sound like there is some pre-existing concept of an "array" that excludes vectors. But unless I'm mistaken, this definition of a Swift "array" as requiring resizability has never been presented or established before this proposal. And when I look outside of Swift at programming languages generally, the criteria of an "array" seem to be much broader, requiring only¹:

  • Indexing by a continuous range of integers
  • Arbitrary positioning of elements (unlike e.g. a set that controls the position of each element)
  • Iteration in position order
  • O(1) access and replacement of elements in the common case
  • Homogeneous element type, to the extent the language's type system supports the concept

¹ There are languages where arrays don't necessarily always meet these criteria—usually dynamic languages which are trying to implement arrays using a dictionary-like object and which therefore struggle to keep indices contiguous or to iterate in position order—but these deviations seem to usually be regarded as quirks, pitfalls, or even bugs, not desirable features.

In particular, there are many languages—most notably C and many of its derivatives, but also including other influential languages like Fortran and Pascal—where the built-in "array" type is fixed-size.
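To make the criteria above concrete, here is a minimal Swift sketch of a type that meets all five without being resizable. The name `FixedCountCollection` is hypothetical and is not the API proposed in SE-0453; a plain `Array` stands in for the storage purely for illustration.

```swift
// Hypothetical type for illustration only; not the type proposed in SE-0453.
// The element count is chosen at initialization and never changes afterwards.
struct FixedCountCollection<Element>: RandomAccessCollection, MutableCollection {
    private var storage: [Element]  // backing store; its count is never mutated

    init(repeating value: Element, count: Int) {
        storage = Array(repeating: value, count: count)
    }

    // Indexing by a contiguous range of integers.
    var startIndex: Int { 0 }
    var endIndex: Int { storage.count }

    // O(1) access and replacement at any position.
    subscript(position: Int) -> Element {
        get { storage[position] }
        set { storage[position] = newValue }
    }
}

var slots = FixedCountCollection(repeating: 0, count: 4)
slots[2] = 5                    // arbitrary positioning of an element
let total = slots.reduce(0, +)  // iteration in position order
// slots.append(1)              // would not compile: no RangeReplaceableCollection conformance
```

In standard library terms, the sketch conforms to `MutableCollection` but not `RangeReplaceableCollection`, which is roughly the line the two sides of this thread are drawing in different places.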

So when you say that arrays are variable-sized by definition, you are using a definition that has no precedent in either Swift or programming languages generally. It's something new that is being established during this proposal.

That is not to say that this definition is bad or wrong—indeed, I can see how it might be useful to establish a rule that everything called an "array" is variable-sized—but that arguing as though this is a limitation we must conform to is overstating the case. The definition of Swift array types as resizable is itself under review here.

So where does that leave me?

Although Vector is not so bad that I'm willing to die on the hill of stopping it, I would prefer a different name. I am not married to a specific alternative; here are my thoughts on a number of them:

  • FixedArray or another short compound name with Array: If we want to reject the proposed array definition, then a short compound name with Array at the end seems ideal.
    • Arguments along the lines of "that means regular arrays are broken!" feel a bit like sophistry to me—pithy names are always ambiguous; they just need to be sufficiently evocative and memorable.
    • Note that the integer in the generic parameter list will also help clarify what exactly is "fixed" about a fixed array.
  • FixedSizeArray or other long compound names: These are getting too wordy. A complete description is not a name.
  • Other single-word names for groups: None of these will be immediately obvious since they're new coinages, but they're all learnable and many read well enough once you've learned them.
    • Of the ones mentioned earlier, my favorite is Slab—it's a word already associated with memory allocations and the connotations of rigidity and flatness are appropriate for a fixed-size one-dimensional array.
  • Standalone adjectives/verbs (Multiple, Repeat) or compounds with Of: I don't think we want to establish these as styles we would use for currency types.

I also have two (I believe) new suggestions:

  • Series: This is meant in the colloquial sense of "TV series", "championship series", or the "series" of a chart (several related things kept in order), rather than the mathematical sense of an infinite sum.
  • SpanStorage or similar: Conceptually similar to FixedArray, but using Span rather than Array as the point of comparison.
29 Likes

“What's in a name? That which we call a rose by any other name would smell just as sweet.”

William Shakespeare

4 Likes

Although I generally think Vector sounds right, I think that point deserves some serious consideration:
If Vector is the wrong choice for vector math, the name would be quite unfortunate, wouldn't it?
We should not repeat errors of the past (honestly, how many people in the future will learn C++ before learning Swift?), but we should care for the future, and I hope some day we will have Matrix as well.

So if we could add some special treatment for Vector<n, Float> to make it suitable for calculations and interoperability between libraries, that would be great — but otherwise, I'd prefer a different name.

2 Likes

I think it'd also be unusual (though not unprecedented) to have both the growable and the non-growable types be named after the same word. Most languages either have an array and a vector type, or don't support a growable type at all (typically using array then). To me, that's a sign that these are generally considered two separate abstractions across CS.

It may not have been established explicitly that in Swift an array is a growable collection type, but that's been the de facto meaning for the last 10 years. Since most languages make a distinction between types that have fixed vs dynamic sizes, it's reasonable to say that, to date, arrays in Swift are growable collection types.

(Perhaps worth noting: Python burned not one but two different names (list and array) on dynamically-sized types.)

My biggest concern with this is, precisely, about how unapproachable the concept of "arrays" would become for newcomers if the requirement of supporting dynamic length is lifted. A lot of things can be arrays under that definition, so many that I think it may not be as useful a concept to have (is simd_float3 an array?).

I moved into Swift knowing just a bit of C and Python, and I remember that my mental model of Swift's arrays at the time wasn't "oh, these are like heap-allocated, growable C arrays", not even "these are like growable numpy arrays". To me, the mental model for Swift's Array was "ah, this is like a Python list!".

My experience may not be universal, but I honestly think that for a newcomer that is just starting to understand the basic data structures used in programming, whether or not a structure can be resized is a fundamental, defining property of that data structure. Just as much as ordered vs unordered, homogeneous vs heterogeneous or mutable vs immutable. Even more so for a structure as basic as Array.

Because of the above, my biggest hope for this proposal is that the final name ends up being anything other than something with "array" in the name. I think it would needlessly muddy the waters after a decade of having a clear model of what an array is.

For what it's worth, I also liked Slab the most of the non-Vector names. I don't think it's as intuitive as Vector, and I think fewer people will have an intuition of what it does compared to Vector, but at least it's its own separate thing.


My second biggest hope is that it doesn't end up being a three-word name. I think there have been enough people in this thread alone who believe this will be a currency type to say confidently that, whether or not it becomes a currency type for libraries, it will for sure be an "everyday" type for a lot of people. Crippling those use cases with a long, unwieldy name feels a bit hostile.

7 Likes

Swift has had a currency type named by a standalone adjective since version 1.0: Optional.

2 Likes

As a counterexample, JavaScript has Array which lines up well with Swift’s array (leaving aside weird JS quirks). Then there are the various TypedArray subtypes like Uint8Array which are like a cross between Span<UInt8> and Vector<n, UInt8>. They cannot themselves be resized (although, again, weird JS quirks).

True, but Optional is a very unusual type in many respects—particularly in how thoroughly it's special-cased and how many of a user's interactions with it are through language features rather than API surface. Even conceptually similar types like Result do not use the same naming convention, let alone the other collection-like containers.

1 Like

I might be more amenable to this argument if this thread hadn't already suggested a whole family of Array types that include non-resizable versions. Even if they weren't, the argument for resizability as the single, fundamental property of the thing we call 'array' hasn't been made. In fact, outside of Swift, that's generally not the case. The properties Becca outlines are (more) universal, as much as something can be across all programming languages.

Even if none of this were true, I think this is a rather strange notion of 'newcomers'. Newcomers don't usually start by trying to distill the general, undocumented properties of the types they use. They try to find the things they need, use them, then maybe remember what they were for next time. Anyone trying to form some sort of generalized notion of the properties of collection types in a language is likely already flexible enough to understand non-resizable types with 'array' in the name.

Regardless of what is decided at the end of this review, or a future review after some filtering, I'm grateful for the many insightful takes and concerns on API construction and the etymological "discoveries" within the language and field. This feature/structure is long-standing in CS and Swift is just barely getting it, so we have a lot to consider. Swift's naming scheme has evolved over the decade+ with succinct names like Int, adjectives like Optional, compound names like ThrowingTaskGroup, and more, where each has its own justification, either technical or purely subjective.

With that, I believe that whatever name is chosen will be a nuisance to some and hopefully not that controversial (at least within these forums). Whatever name is chosen will be considered "baggage", "wrong/right", or "makes the most/least sense within CS and across multiple fields and this specific language". There have been counterpoints against counterpoints that have even more counterpoints. I have been personally persuaded starting from Vector, to FixedArray, to InlineArray, and back to Vector again all with their own justifications.


After 300+ posts on naming alone, my opinion has been reduced to the following:

1 - I want Vector due to reasons, but I'll live.
2 - Please just pick a name; the language feature is so important that it should have been implemented long ago.
3 - Make everybody [un]happy and typealias everything suggested ;)

5 Likes

Both “multiple” and “repeat” are nouns as well as adjectives, and while the latter isn’t really viable given that it already clashes with various other uses—regex builders, I believe, and also the type of repeatElement(_:count:)—I have a hard time finding something to criticize about Multiple if we can’t agree on Vector.

Not only is it difficult to say something bad about it, but it also has the salutary (in my view) property of sharing a Latin root with “tuple” but not being straight-up twinsies, and it also evokes the possible sugar syntax that we’re pondering—imagine the diagnostic text: “[4 * Int] aka Multiple<4, Int>”. It is a term everyone has heard of regardless of their proficiency in computer science or math, not actively misleading in salient confusable ways, and—mercifully—not just plain odd (to my eyes anyway). Ah, and its use has some precedent in the “M” in SIMD.

4 Likes

It's likely the right choice for some sorts of vector math (at least as a backing type), but it's not the right choice for LAPACK/BLAS API.

That's OK. std::vector also isn't right for them. Neither is Rust's Vec, nor what pretty much any other language calls a "vector". The LAPACK/BLAS API patterns are pretty specialized, and are not a good match for basic "vector" types in most languages.

7 Likes

Why can't [Int x T] or its variants [Int, T], [Int T] do the job?

After all, both C and C++ have something similar:

const long code [4] = {2, 3, 5, 7};
const long identity [3][3] = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};

It can be called or talked about as a C-like array.

Come on, let's do this. :slight_smile:

The specific sense of "multiple" we would like to invoke here ("having or involving several parts, elements, or members") is only an adjective, not a noun.

Arguably "repeat" should not have been lumped into that bullet point. On closer consideration, I think it simply has the wrong meaning—it typically refers to identical things. That describes repeatElement(_:count:)'s result very well, and this new type rather poorly.

2 Likes

Ah yes. Adjective it is then!

Feels appropriate in being of-a-kind with Optional in a way that other types are not: in the case of Optional, we’re representing either one of something or not, and here we’re representing exactly n of something.

I love @davedelong's observations on the challenge of this type name. I would add that using this name would immediately create a mess for plenty of people who have used the word Vector in their Swift types:

That’s a whopping 102k matches. The months’ worth of debates and tech support over how to deal with this is in nobody’s interest.

Swift doesn’t need something else to be made fun of, when a perfectly viable name like FixedArray (830 GitHub matches) or other variations exists.

And now, I go back to my cave.

12 Likes

I think the use cases of this type will lead to the right name. I’ve seen a few different use cases:

A “Math” Vector
Using this type like a math vector means that it is viewed as a set of components. In this use case, the Element is typically some sort of value that can be added and scaled. A math vector may conditionally define a component-wise addition operation or (at an extreme) may require something like Element: Scalar.

Within math vectors, I see two subsets of use cases.

  • First off is the use case of small, constant sized sets of the same type. An example of this is a Color or a CGVector. I don’t think the proposed type will be particularly useful for this.
  • Second is the use case for things like machine learning where you need to deal with variously sized vectors and matrices. I think this is a reasonable use case and think this should use types like Vector<count> and Matrix<rows, cols>. Neither of these is the proposed type, but I think it makes sense to implement those types using the proposed type and that the Vector name is better reserved for this higher type. To implement these types I would want something like a Buffer.
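As a rough sketch of the "math vector layered on top of plain storage" idea, the snippet below defines component-wise addition only when the element type supports it. `MathVector` is a hypothetical name, and an ordinary `Array` stands in for the proposed storage type, so the dimension check happens at run time here rather than in the type system.

```swift
// Hypothetical wrapper for illustration; a plain Array stands in for the proposed storage type.
struct MathVector<Scalar> {
    var components: [Scalar]
}

// Component-wise addition is available only when the element type supports `+`.
extension MathVector where Scalar: AdditiveArithmetic {
    static func + (lhs: MathVector, rhs: MathVector) -> MathVector {
        // With a count carried in the type, this check could move to compile time.
        precondition(lhs.components.count == rhs.components.count, "dimension mismatch")
        return MathVector(components: zip(lhs.components, rhs.components).map { $0 + $1 })
    }
}

let a = MathVector(components: [1.0, 2.0, 3.0])
let b = MathVector(components: [0.5, 0.5, 0.5])
let sum = a + b  // adds matching components, not a concatenation
```

A `MathVector<String>` would still compile as a container but would have no `+`, which is the "conditionally define" behavior described above.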

A Constant Sized “Collection”
Using the proposed type like an ordered collection seems to be much more in line with my reading of the proposal. With this use case the type is viewed as a set of elements (rather than components) and if there were to be an addition operator defined it would be concatenation of the two collections (although it should not be defined). This view is also where iteration through the elements is emphasized. The distinctive aspect of the specific proposed type is the (arbitrary) constant size, something Swift cannot yet enforce.

This interpretation lends itself to the suggestions for names like FixedArray, although it seems like there are plans in the works for names ending in Array where Array means something like “variable count O(1) Int indexable collection”, which this type is not. That does seem like one possible definition, but there are others that may include the proposed type. If not anything -Array, Buffer seems like a reasonable name.

The collection view also leads to the tuple related name suggestions, which I find less convincing. Something like MonoTuple<3, Int> is either the same as or different from (Int, Int, Int), and I think either way is confusing to the user.

Interop with C Arrays
This use case is using the proposed type instead of tuples like (Int, Int, Int). I believe that a named type is better than a syntax like (1024 x Int) and continuing to use tuples because these tuples do not have (and likely will not have soon) the same abilities a named type has (like methods). I think that Vector, FixedArray, Buffer, and others are all acceptable names for this use case.
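For context on why tuples are awkward here, this is a small sketch of reading one element of a homogeneous tuple at a runtime index. The helper `element(at:of:)` is hypothetical, and the sketch assumes that a homogeneous tuple's elements are laid out contiguously in declaration order, as they are for imported C arrays.

```swift
// A homogeneous tuple, the shape Swift uses today for an imported C array such as `long code[4]`.
let code: (Int, Int, Int, Int) = (2, 3, 5, 7)

// Tuples have no subscript that takes a runtime index, so reading element `i`
// means going through the tuple's raw bytes.
func element(at i: Int, of tuple: (Int, Int, Int, Int)) -> Int {
    precondition((0..<4).contains(i), "index out of range")
    return withUnsafeBytes(of: tuple) { raw in
        raw.load(fromByteOffset: i * MemoryLayout<Int>.stride, as: Int.self)
    }
}

let third = element(at: 2, of: code)
```

A named type could offer an ordinary subscript, iteration, and methods for the same storage, whatever it ends up being called.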

My Vote - Buffer
Overall, my suggestion would be Buffer or something Buffer related. If I recall correctly, Swift sometimes uses buffer (e.g. UnsafeBufferPointer) to mean a similar thing except elements may be uninitialized. This inspires InitializedBuffer, which I think is a bit wordy but overall good. I would rather Buffer mean initialized and UnsafeBuffer mean potentially uninitialized.

There being an UnsafeBuffer and Buffer seemed weird to me at first, but I think they really do represent different things. This proposal is for the safe Buffer and I think that’s a good starting point. An UnsafeBuffer seems like it would be useful for a potential InlineArray as described above.

There’s the followup question of if a C int[100] is an UnsafeBuffer or a Buffer, which I think should be resolved with an annotation on the C value, defaulting to one of them.

I also like the SpanStorage suggestion or similar since we’ve already settled on Span being a new fundamental term in Swift.

1 Like

We have not called Span an array type; we have not called UnsafeBufferPointer an array type. Not in name, nor in documentation. We only ever called something an “array” if it provides range-replaceable operations.

The proposed new type belongs in this type family. It is not like an Array — it is much less than that.

My point is that we’re on the verge of proposing a number of new array types, all providing the same core operations. I think it’s important to have a word that identifies this family; and “array” is the obvious choice for it.

Randomly applying the same suffix on a type that does not belong in this same family would be incredibly confusing.

I would be deeply disappointed if y’all ended up calling this FixedArray, or any variant of that.

If you aren’t swayed by the arguments for the name Vector, that is fine! Choose any of the alternative names; or invent a new one! I offered Pack. Series would be a fine choice as well.

7 Likes

I might be more amenable to this argument if this thread hadn't already suggested a whole family of Array types that include non-resizable versions.

I have good news! All of the new array types that we’re drafting come with range-replacing operations like append or remove. They are all resizable: their count can change at runtime.

Some of them have a specific maximum storage capacity. (Some of them include this as part of the type.) But all of them can be initialized as an empty array, with elements added or removed as needed.
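A minimal sketch of that distinction, under the assumption that the drafted types behave roughly like this: the capacity is fixed, but the count still changes at run time. `CappedArray` and its API are hypothetical and are not taken from the drafts mentioned above.

```swift
// Hypothetical sketch; the name and API are assumptions, not the drafted types.
struct CappedArray<Element> {
    let capacity: Int                          // fixed upper bound on storage
    private(set) var elements: [Element] = []  // count varies between 0 and capacity

    init(capacity: Int) {
        self.capacity = capacity
    }

    var count: Int { elements.count }

    // Returns false instead of growing past the fixed capacity.
    @discardableResult
    mutating func append(_ element: Element) -> Bool {
        guard elements.count < capacity else { return false }
        elements.append(element)
        return true
    }

    // Returns nil when there is nothing left to remove.
    mutating func removeLast() -> Element? {
        elements.popLast()
    }
}

var recent = CappedArray<Int>(capacity: 2)  // starts empty
recent.append(1)
recent.append(2)
let accepted = recent.append(3)             // rejected: already at capacity
let removed = recent.removeLast()
```

The type under review differs in that its count is fixed as well: it can never be empty unless its count is zero, and it has no append or remove operations at all.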

3 Likes

It only seems confusing if you solely consider that the only source of names that people draw from when understanding things is the Swift documentation and types. And not only that, but that they fully internalise an unstated definition by the absence of behaviours of items in that set.

It hardly seems "random" when viewed in the light of other languages and common usage, as @beccadax pointed out. It's clear that for many in this thread (myself included) FixedArray is a very good name, which indicates to me that the meaning some are ascribing to Array isn't as widespread or obvious as they might think. It's attached to this thread title to clarify what the type is, even.

5 Likes

What is an "array"?

What is an "Array"?

1 Like