[Second review] SE-0453: Vector, a fixed size array

But that's actually the issue.

Vector<3, Int> could be a lot of things. Could be a fixed-length index path in a fixed-depth tree structure. Vector<4, UInt8> could be a "four char code" or an IPv4 address. Vector<4, Float> could be a color.

This "vector" type should be a storage type, not a mathematical vector. And you build the mathematical vector, with its own set of operations, on top of it for use cases where you need mathematical vector operations. And similarly with the IP address, the four-char-code, the fixed-length index path, or the color: they should each have their own type built on top of a fixed-size inline array storage type and define their own domain-specific operations.
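A minimal sketch of that layering, assuming a homogeneous tuple as a stand-in for the fixed-size storage type so that it compiles on current toolchains. The names `Storage4`, `IPv4Address`, and `RGBAColor` are hypothetical and not part of the proposal:

```swift
// Stand-in for the proposed fixed-size inline storage type (assumption:
// a homogeneous tuple plays that role here).
typealias Storage4<T> = (T, T, T, T)

// An IPv4 address built on the storage, with its own domain operations.
struct IPv4Address {
    var octets: Storage4<UInt8>

    var isLoopback: Bool { octets.0 == 127 }
    var dotted: String { "\(octets.0).\(octets.1).\(octets.2).\(octets.3)" }
}

// A color built on the same storage shape, with unrelated operations.
struct RGBAColor {
    var channels: Storage4<Float>

    // Scales the color channels and leaves alpha untouched.
    func scaled(by factor: Float) -> RGBAColor {
        RGBAColor(channels: (channels.0 * factor,
                             channels.1 * factor,
                             channels.2 * factor,
                             channels.3))
    }
}

let localhost = IPv4Address(octets: (127, 0, 0, 1))
let dimmed = RGBAColor(channels: (0.5, 0.25, 1, 1)).scaled(by: 2)
```

Both types share the same storage shape, yet neither exposes the other's operations.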

If the standard library wants to add a vector type with vector math operations, that's fine by me. But if someone insists the base contiguous storage type should be used directly for vector math, I would argue there's already one vector type, dynamically sized, called Array and invite them to write vector math extensions for Array and see how much sense it makes semantically.
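As a rough illustration of that thought experiment, this is what element-wise addition on Array might look like. The method name `adding(_:)` is made up for this sketch:

```swift
extension Array where Element == Double {
    // The dimension is not part of the type, so a mismatch can only be
    // detected at run time; this sketch reports it by returning nil.
    func adding(_ other: [Double]) -> [Double]? {
        guard count == other.count else { return nil }
        return zip(self, other).map { $0 + $1 }
    }
}

let sum = [1.0, 2.0].adding([3.0, 4.0])
let mismatch = [1.0].adding([1.0, 2.0])
```

Every array of doubles now carries this method, whether or not it represents a mathematical vector.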

9 Likes

Computing has an extremely long tradition of using "vector" in a generalized way, meaning finite sequences of a fixed length. Stepanov did not pull the name std::vector out of thin air -- by the 90s, this word had been thoroughly generalized to cover more than just sequences of "numbers".

(What sort of numbers, by the way? If we insist on the classic vector space definition, there can be no such thing as a vector of integers.)

As is well-documented, Stepanov considers the name of std::vector to be a mistake, but not because it can contain things other than scalars -- it's a mistake because std::vector is dynamically resizing, and as such, it should've been called std::array. That name was later used to label their construct that models C arrays, i.e., actual vectors.

Here is a random collection of quotes I found on my bookshelf, of the term "vector" being used to name the idea of "a sequence of elements of a specific size". They're mostly from editions published in the 90s or later, but the convention goes waaay back. (Apologies, I'm much too young to have 1st editions of these.)

"To model computer memory, we use a new kind of data structure called a vector . Abstractly, a vector is a compound data object whose individual elements can be accessed by means of an integer index in an amount of time that is independent of the index." -- H. Abelson, G. J. Sussman, Structure and Interpretation of Computer Programs (MIT Press, 2nd ed, 1996)

"Vectors, see Linear lists." -- D. E. Knuth, The Art of Computing Programming, vol 1. (Addison–Wesley, 3rd ed, 1997)

"The standard example is the type of lists of specified length, traditionally called vectors. We fix a parameter type A, and define a type family Vecn(A), for n : ℕ, generated by the following constructors:

  • a vector nil : Vec0(A) of length zero,
  • a function cons : ∏(n:ℕ)A → Vecn(A) → Vecsucc(n)(A).

In contrast to lists, vectors (with elements from a fixed type A) form a family of types indexed by their length."
-- The Univalent Foundations Program, Homotopy Type Theory: Univalent Foundations of Mathematics (2013)

"To write a total function first, we must use a more specific type constructor than List. This more specific type constructor is called Vec, which is short for "vector," but it is really just a list with a length.

An expression (Vec E k) is a type when E is a type and k is a Nat. The Nat gives the length of the list."
-- D. P. Friedman, D. T. Christiansen, The Little Typer (MIT Press, 2018)

My current library is particularly lacking in classic works on type theory or formal semantics, which is why my type theory quotes are so recent. For what it's worth, my own formal training was based on classic Floyd-Hoare logic, and I can remember plenty of use of "vector" that goes well beyond tuples-of-"numbers". I can find examples of Floyd using it in this sense even as early as the sixties.

"Boldface letters will designate vectors formed in the natural way from the entities designated by the corresponding nonboldface letters: for example, P represents (P1 P2, •• Pk).
-- Floyd, R. W., “Assigning meanings to programs,” Proceedings of Symposia in Applied Mathematics Vol. 19 (1967), pp. 19-32.

We've had generations of programmers growing up on (at least some of) these; I am very much one of them.

The use of the name Vector in the sense we're proposing is not a novel invention. It is not in any way a radical new abstraction. We are not breaking new ground. We are merely following a very much well-established tradition.

At the same time, it can also be true that words mean different things to different people. After all, no two people speak the same language; communication is lossy at best.

There are plenty of examples of people using "vector" to mean heterogeneous lists (as was often the case when people talk about "state vectors"), people talking about infinite vectors, people insisting that "vectors" are dynamically resizing arrays, that "vector" is a specialized SIMD construct, that "vector" is a list of numerical data, that "vector" is a direction and a length.

Names aren't context-free. They do not have universal meanings, just vibes. As humans, we have the capacity to make abstractions, to recognize patterns, to generalize preexisting concepts. We routinely make use of this in everything we do, and we are plenty capable of dealing with context and nuance.

Our job is to come up with a term for this new type. Above all else, the new name must not mess up the existing nomenclature of Swift -- words do have concrete meanings in this context, and we must not dilute that.

The proposed type will generally be used as an alternative for homogeneous tuples. But using the word Tuple in the type name would be confusing; that name is already taken to mean a specific construct in Swift, and while this new type is similar to that, it is not the same. We need to be able to clearly distinguish between the two, in writing as well as in oral communication.

Similarly, the new type is like a fixed-count array, but is certainly not an array type in the sense that we use that term in Swift. Our arrays come with resizing operations like append or remove; they can be concatenated to form new arrays, with the result having the same type as the input. This new type has no such operations. We have never used the word Array to name a type like that in Swift, and we should not start now. We're on the verge of proposing multiple new array variants; sticking to our established terminology is going to be crucial to avoid confusion.
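A small sketch of the distinction, using only types that exist today. Array operations keep the result in the same type, whereas anything whose count is part of its type (a homogeneous tuple serves as the stand-in here) changes type when an element is added:

```swift
// Array: resizing and concatenation stay within [Int].
var numbers: [Int] = [1, 2]
numbers.append(3)
let joined: [Int] = numbers + [4, 5]

// Fixed count in the type: "adding" an element produces a different type,
// so there is no append that could mutate the value in place.
let pair: (Int, Int) = (1, 2)
let triple: (Int, Int, Int) = (pair.0, pair.1, 3)
```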

Therefore, I am in vehement opposition to the idea of calling the new type a Tuple or an Array. In the context of Swift, this is not a tuple type. In the context of Swift, this is not an array.

We must use a name that's distinct from those.

We chose to propose the name Vector, as it was by far the most obvious choice. "Vector" is already in widespread use in a meaning that precisely matches our construct -- it usually denotes a finite sequence of items, with a fixed length that is part of the type. Vector obviously has the correct vibes, and there is ample precedent of prior usage in our precursors. Even the ill-chosen name std::vector in C++ puts people in the right ballpark -- it names a linear container with integer indices. I therefore find the worry about "confusion" a bit of a stretch. We can also mitigate it by choosing to truncate Vector to Vect or Vec, like we've done with Int, Bool. (Yes, I'm well aware Rust calls its array type Vec. I am trying really hard to find a reason to care, to no avail.)

Some folks evidently believe that a type called "vector" must be a "mathematical" vector (whatever that means; definitions vary wildly). On the other hand, some folks appear to believe that "vector" must mean a dynamically resizing array type, or legions of C++ survivors will not feel adequately welcome to Swift. I do find these arguments tiresome -- it is well within our power to define what Vector should mean in Swift, and from where I'm standing, our proposed definition is clearly close enough to both of these meanings.

But as long as we do not try to mislabel this type as an "array" or "tuple", we still have plenty of options to name it, even if we discard "vector" -- we just need to pull something out of thin air that has the right vibes.

Collective nouns or terms for containers are great options. Our universally established computing terms for "array", "stack", "heap", "collection", "set" have already borrowed from this same space, with great results.

We've seen many suggestions for such names in this thread; I've curated a collection of highlights below:

@clayellis suggests Cluster, Batch, Bunch, Grouping, Group, Lot, Clump
@jberry suggests Slab
@Sajjon suggests Row, Rack, Cabinet, Cubby, Hive, PeaPod
@JanWillemBrands offers Bunch

I recommend Run, Spread, or Pack myself.

I do think that collective nouns generally work great for this purpose. Flock, Fleet, Pile, Wad, Bouquet would also all be lovely choices. I find expressions like "a bouquet of four integers" or a "pod of six booleans" or a "flock of doubles" rather charming, and it would tickle me to name this type like that.

In my everyday conversations, the word "pack" has recently gained some notable traction as a naming alternative. Pack<4, Int> looks and sounds great; however, it does have the drawback that SE-0393 has established the terms "parameter pack", "type pack", "pack expansion" etc. to mean related-but-different concepts. The argument I'm using to rule out Tuple therefore also applies to Pack -- however, I feel the association may be less strong in this case, and so there may be a chance we can use it anyway.

If we consider Pack to be already taken, then I offer Pod, from @Sajjon's "PeaPod":

struct Pod<let count: Int, Element: ~Copyable> { ... }
extension Pod: Copyable /* where Element: Copyable */ {}

var x: Pod = [1, 2, 3]
func foo(_ items: Pod<3, Float>) -> Int

Note that unlike Vector, none of these terms have any established use in this context -- we're conjuring these from thin air, and whatever name we invent will be equally foreign to all programmers, no matter their background. This makes them a lesser choice in my eyes, but if Vector is off the table, then these seem the least harmful choices. They are memorable, they roll off the tongue, their everyday meaning is a good fit with our proposed use, and they do not mess up Swift's preexisting terminology.

23 Likes

Slab, please no!

slab | slab |
noun
a large, thick, flat piece of stone, concrete, or wood, typically rectangular: paving slabs | she settled on a slab of rock.
• a large, thick slice or piece of cake, bread, chocolate, etc.: a slab of bread and cheese.
• Climbing a large, smooth, steep body of rock.
• an outer piece of timber sawn from a log.
• a table used for laying a body on in a morgue.

1 Like

FWIW, as someone initially skeptical of Vector, I must admit that @lorentey has IMO done an excellent job of convincing us that it's actually the correct name. So consider me convinced; +1 for Vector!

25 Likes

I find this whole thread fascinating.

I understand that ‘bike shedding’ must be frustrating for those whose day-to-day job is to shepherd the evolution of Swift. It must feel like sand in the cogs of language implementation.

Don’t forget just how long these naming decisions will last for your users.

Don’t “over-index” (I hate that phrase) on C++. Isn’t the whole point of Swift and Rust to replace that kind of language? (Excuse the pun.)

The thing that I take from this thread is:

Please don’t name this type after ‘Vector’, ‘Array’ or a variant thereof. It is neither of those things.

‘Cartridge’ or ‘Magazine’ or one of the suggestions that @lorentey has mentioned would be good.

Something that suggests a batch of the same type that cannot be added to/removed from.

As he said already ‘Pack’ would have been good if not already used. ‘Bar’ as in chocolate. Something like that.

Let’s move away from the array/vector dichotomy.

I find it very healthy that this whole discussion can happen out in the open. Well done Core Team.

(From an idiot who doesn’t have a CS degree but always found the Cocoa nomenclature very approachable and understandable.)

@clayellis ‘Bunch’ seems very good.

It fits the definition of “a set of things that are all of the same type but don’t conform to the shape of a/an collection/set/array etc. Where also the containing type is immutable.”

Without going into the theory of Vector Spaces, consider Vector as a finite sequence of things of all same type. What other name can describe it so succinctly?

See: MIT OCW - Linear Algebra

4 Likes

Apologies, perhaps I should have said that it is “not either of those things unambiguously”.

I am not a mathematician above A-Level (== sub undergraduate level).

I have come across the term ‘vector’ in several fields (maths, physics, aviation, meteorology, biology, etc…)

Swift is not used in a purely mathematical context. Reducing ambiguity is surely a goal in naming?

(Edit). Isn’t there something in the API guidelines about valuing clarity over succinctness?

I'm looking forward to indexing homogenous tuples without unsafe pointer tricks.
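For context, the kind of pointer trick in question tends to look like the sketch below. The helper `element(at:of:)` is hypothetical, and it relies on the assumption that the elements of a homogeneous tuple are laid out contiguously in declaration order:

```swift
// Reads the element at a runtime index out of a four-element tuple by
// viewing the tuple's bytes and loading at the matching offset.
func element(at index: Int, of tuple: (Int32, Int32, Int32, Int32)) -> Int32 {
    precondition((0..<4).contains(index), "index out of range")
    return withUnsafeBytes(of: tuple) { raw in
        raw.load(fromByteOffset: index * MemoryLayout<Int32>.stride,
                 as: Int32.self)
    }
}

let third = element(at: 2, of: (10, 20, 30, 40))
```

A type with a real subscript would make both the helper and the layout assumption unnecessary.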

Tangentially, "vector" means "carrier" in Latin, and it is the root word for "vehicle" (± grammar), so you can reasonably call any non-void type an information vector/carrier. I would say force vectors make excellent use of the word, however, since carrying a physical object is an act of force. In any case, I first think of homogenous tuples for math and programming reasons. I don't think the lack of vector space operations is disqualifying because you can model vector spaces in many ways, and only some require operators defined on an exclusive element type. You could, for example, write a vector space class or struct with appropriate transformations instead. In that case, a vector is just a payload.

struct AlternativeVectorSpace {
    typealias Element = Vector<2, Double>
    func  adding(_ a: Element, to b: Element) -> Element
    func scaling(_ a: Element, by b: Double ) -> Element
}
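A compilable variant of the same idea, assuming a two-element tuple as a stand-in for Vector<2, Double> (the type name `PlaneVectorSpace` is made up for this sketch):

```swift
struct PlaneVectorSpace {
    // Stand-in payload; the proposed type would take this place.
    typealias Element = (Double, Double)

    // Component-wise addition of two payloads.
    func adding(_ a: Element, to b: Element) -> Element {
        (a.0 + b.0, a.1 + b.1)
    }

    // Multiplies every component by a scalar.
    func scaling(_ a: Element, by k: Double) -> Element {
        (a.0 * k, a.1 * k)
    }
}

let space = PlaneVectorSpace()
let moved = space.adding((1.0, 2.0), to: (3.0, 4.0))
let doubled = space.scaling((1.5, -2.0), by: 2.0)
```

The operations live on the space, so the payload type itself needs no arithmetic.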
2 Likes

It is very common that we use words differently in computer programming than mathematicians use them. "Array" comes from mathematics, but mathematicians hardly use it except to talk about the visual grid that's commonly used for matrices. Meanwhile, it's a core concept in computer science.[1]

In math, "vector" is pretty much exclusively used for a member of a linear space. All of our uses in CS are inspired by that, but they end up giving pretty different spins on the idea. And the underlying mathematical idea is not constrained — polynomials act as an infinite-dimensional vector space.

So in CS, we have:

  • computer graphics, which uses "vector" properly as a mathematical difference between points, but which almost exclusively uses vectors of exactly 2 or 3 dimensions, and which commonly needs to distinguish vectors from points;
  • computer architecture, which uses "vector" generally for SIMD, both vector architectures (which typically operate on dynamically-sized data sets) and SIMD extensions to scalar architectures (which have special layout requirements as data types and are typically not indexed dynamically);
  • machine learning, which I am not an expert in, but which appears to use "vector" extremely loosely and often for dynamically-sized vectors; and
  • programming languages, which use "vector" for two main purposes:
    • SIMD types, including every shader or hybrid-compute language I can find (GLSL, HLSL, OpenCL, etc.), C#[2], and a lot of C extensions and libraries built with those extensions; and
    • dynamic array types, including Java[3], C++[4], Rust, and Common LISP.

Swift's Vector type would almost certainly just offer a collection-like API in the standard library. This is an excellent and perfectly satisfactory choice for cases such as the importing of array types from C. However, it makes it a poor match for graphics and (perhaps) ML. Those libraries could add the operations they want with extensions, but that would conflict awkwardly with the other use cases: we would not want an imported char [MAX_PATH] field to suddenly offer vector math when used from code that also imports a graphics library. Wrapping is really the only choice, but then it would become very annoying that the standard library already offers a type called Vector.
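The conflict can be sketched with a hypothetical stand-in type. `Storage3` and `Vec3` below are illustrative names only; `Storage3` plays the role of the shared standard library storage type:

```swift
// Hypothetical stand-in for the shared fixed-size storage type.
struct Storage3<Element> {
    var a: Element, b: Element, c: Element
}

// A graphics library adding math directly to the shared type...
extension Storage3 where Element: Numeric {
    static func + (lhs: Self, rhs: Self) -> Self {
        Self(a: lhs.a + rhs.a, b: lhs.b + rhs.b, c: lhs.c + rhs.c)
    }
}

// ...also hands that math to unrelated data, such as a byte buffer.
let header = Storage3<UInt8>(a: 0x50, b: 0x4E, c: 0x47)
let nonsense = header + header   // compiles, but means nothing

// Wrapping keeps the math on a dedicated type instead.
struct Vec3 {
    var storage: Storage3<Double>

    static func + (lhs: Vec3, rhs: Vec3) -> Vec3 {
        Vec3(storage: Storage3(a: lhs.storage.a + rhs.storage.a,
                               b: lhs.storage.b + rhs.storage.b,
                               c: lhs.storage.c + rhs.storage.c))
    }
}

let total = Vec3(storage: Storage3(a: 1, b: 2, c: 3))
          + Vec3(storage: Storage3(a: 1, b: 1, c: 1))
```

With the wrapper approach the extension on `Storage3` would simply not exist, and only `Vec3` would offer `+`.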

I do not think a 2 or 3 word name of 10-20 characters (such as FixedSizeArray, although I understand the argument for avoiding Array as a suffix) poses a significant usability burden for this type. We often find ourselves using names of that size when we aren't specifically looking to introduce a new fundamental term of art into the lexicon, as we did with Task and Span. I do not see a compelling motivation to introduce a term of art for this type, other than that it is the proposal in front of us right now, which always makes a feature seem more important. We do not expect this to be the root name of a family of types, as Array and Span are. It is not expected to be an especially prominent type; it has a special role in language import, but so do @convention(c) function types. We are already talking about future directions like InlineArray as likely being more interesting for embedded programming in the long run, and that type does not appear to require a term of art. And we expect that this type will frequently conflict with the Vector types that library authors will need to offer on their own.

I don't know what the name should be, but I feel strongly that it should not be Vector.


  1. I would dispute that Swift's Array actually matches this concept; there is a subtle but important difference. The classic understanding of an array is a contiguous region of memory that's meant to be interpreted homogeneously. Swift's Array implements the dynamic array abstract data structure, which owns an array but can replace it if it needs to. However, it presents the abstraction of an array: you can use it as if it were an array that can magically grow and shrink. That makes it a fine name, but we shouldn't talk about it as if it controls the basic concept of what an array actually is in all of computer science. ↩︎

  2. C#'s Vector type is a SIMD type of target-dependent size which is not meant to be used outside of the implementation of SIMD algorithms. ↩︎

  3. Java has deprecated java.util.Vector, but for reasons completely unrelated to the name, and I would guess that many Java programmers are still familiar with the type. ↩︎

  4. Stepanov says that he regrets the name std::vector. However, the existence of primitive C arrays would have made std::array a questionable name for it, especially in 1993. Languages with dynamic arrays follow a clear split in naming. C++, Java, C#, Smalltalk, and Rust all offer fixed-size arrays, and in each case that's what "array" means, forcing the use of a different term for their dynamic array type (if they have one). Almost every language that calls their dynamic array type Array is a dynamic language that doesn't offer a fixed-size array; until now, that included Swift, although Swift likely would have followed Objective-C's lead here even if it had had a fixed-size array type in 1.0. ↩︎

20 Likes

Yes, I wrote Pod at first, but changed it to PeaPod to make it clear that it is not a cocoa pod: that fruit has a higher entropy in its arrangement than its rosid cousin, the pea pod.

So obviously I would be happy with Pod (and again, I'm happy with Vector)!

If the repeated argument that Vector would be confusing "wins", then we might as well go for something which is not "loaded" and which conjures a pedagogic visualisation - "memorable" as you put it @lorentey !

Objective-C was the first language I used professionally. I had coded a bit of Java at university and had used HashMaps, but that type name did nothing for my memory and mental model. Perhaps that's just how my weird brain works... but when I started coding Objective-C a lot and encountered NSDictionary, that name just "clicked" for me. I believe Objective-C was the first to call maps "Dictionary" (3 years before Python).

15 years later - I still visualize looking up a word in a dictionary when I work with maps... Dictionary, such a memorable and great name - HashMap not so much (once again, maybe just how my weird brain works. I'm a visual learner.)

2 Likes

This thread has (predictably) gotten tied down in naming arguments, so I wanted to resurface some considerations on the overall approach we are taking instead:

Naming aside, I haven't seen much in the way of an argument as to why this functionality couldn't be built on top of existing tuples in the language by modifying the language in a few key ways. To me, this feels like the natural representation that is simply lacking a) an interface for extensibility and b) sugar for brevity. Even if all we do is start with the homogenous representation of tuples for this new functionality, it keeps the door open to generalize this to non-homogenous and named tuples over time.

Introducing a new type works in a pinch, but I imagine the language becomes much more expressive if we enhance it to support working with tuples in ways that we are used to working with every other type.

1 Like

Hi, Dmitri. The overall approach was accepted after the first review; Freddy laid that out in the announcement. This review really is just focused on the name, although feedback on the other revisions mentioned in the announcement is also on-topic.

4 Likes

I like Bunch, but I like Vector more.

Actually, Bunch might have fit well into a language called ABC, Python's greatest inspiration from the early '80's. It was meant to replace Basic as a teaching language.

Bunch is cute, but Swift is going hardcore.

I would take the aforementioned Florglequat over Bunch. Imagine telling C++ developers their Array replacement is called Bunch in Swift because there is a bunch of values inside and we couldn’t agree on a better name.

On this note, what will happen when the review period ends? The introduction of such a type has been accepted, but I’ve yet to see any sign of even the slightest agreement on the name. Will it be an executive decision by the language steering group? I know the final decision always lies there, but normally there is at least a broader conclusion in the thread.

I’m still more in favor of ConstVector or even ConstArray, as I already mentioned further up.

1 Like

There are also interrupt vectors, which are basically function pointers, so "vector" in that sense means pointer. Although it's not a container, it's not unreasonable that somebody might want to use the name Vector in that way.

Although I like the names Tuple or Vector better because they are shorter, I like these too: HomogenousTuple or FiniteSequence or OrderedFiniteStore.

1 Like

We also never used the word Array for a type storing its elements inline, but there's apparently an InlineArray in the pipeline.

An array where the count is a compile time constant doesn't seem that different from Array. I really don't understand why you can't see it being part of the same type family just because count can't change at runtime. Looks like an arbitrary line to me to say "the language did not support this kind of array before, so we can't call it an array".

Meanwhile I'll half-jokingly suggest ArrayTuple. To me this type belongs either to the array family, the tuple family, or both.

4 Likes

Thanks for such comprehensive explanations and thoughts.

In the end, the only argument I have is that the name Vector will, I suppose, be used rather often in domain-specific projects.
It will be very annoying if someone wants to use Vector as a project-specific term but the name is already taken by the standard library. That will force narrowly focused projects to invent other, less convenient names.

Some examples of such reserved names:

  • description in CustomStringConvertible. In one of my projects, description was frequently used as a domain term. It was used in the project documentation, the backend, and the Android app. But the Swift development team had to invent something else, like 'descrInfo', 'info', 'detailedInfo'... Everywhere except the Swift codebase, description was used, and only in Swift did we have a myriad of different names, because the CustomStringConvertible.description property is not supposed to be called directly, and plenty of our data structures conformed to it.
  • the id term problem in Obj-C, which was luckily solved in Swift. A lot of structs / classes need to have an id property.
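A small sketch of the first clash, with a made-up `Product` type. Once the type conforms to CustomStringConvertible, the protocol claims the name `description`, so the domain field has to be called something else (here `descrInfo`, one of the names mentioned above):

```swift
struct Product: CustomStringConvertible {
    var name: String
    // The domain's own "description" field, renamed to avoid the clash.
    var descrInfo: String

    // Required by CustomStringConvertible; used by print and interpolation.
    var description: String { "Product(\(name))" }
}

let lamp = Product(name: "Lamp", descrInfo: "A desk lamp with a dimmer")
let printed = String(describing: lamp)
```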

I don't want such problems to be repeated with Vector.

What I see in this thread is that some people agree that Vector is not the best name and are ok with some other name, including me.

The problem for now is that people say "I'm OK with another name" without offering anything as an alternative.

Maybe we should run something like a poll with all the alternative names, excluding Vector itself.

2 Likes

But that could conjure up ([...], ..., [...]) or [(...), ..., (...)]