[Second review] SE-0453: Vector, a fixed size array

The existing functionality could belong to a variadic generic type with the form Tuple<repeat let count: Int, repeat let label: String?, repeat T>.

(At one point during the discussion of variadic generics, constraints involving the length of variadics were also mooted. If such a feature is added, the type signature could be refined: Tuple<repeat let count: Int, repeat let label: String?, repeat T> where length(repeat each count) == length(repeat each T).)

It may or may not be practical to introduce a Tuple<repeat T> that evolves this functionality over multiple language versions.

One thing I hadn’t considered is that this would constrain the tuple representation on all platforms. Right now, only Darwin is committed to retaining the layout equivalence of tuples and C arrays due to ABI stability.
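As a rough illustration of the layout point, the sketch below checks whether a homogeneous tuple occupies the same bytes as the same number of contiguous elements. It only observes what the toolchain in use happens to do. Nothing here is a language guarantee, which is the concern raised above. The names `triple`, `tupleSize`, `arrayLikeSize`, and `elements` are made up for this example.

```swift
// Sketch: observe (not guarantee) how a homogeneous tuple is laid out.
let triple: (Int32, Int32, Int32) = (10, 20, 30)

// Size of the tuple versus three contiguous Int32 elements.
let tupleSize = MemoryLayout<(Int32, Int32, Int32)>.size
let arrayLikeSize = 3 * MemoryLayout<Int32>.stride

// Read the tuple's storage element by element, as if it were a C array.
// This relies on the layout equivalence discussed above (an assumption).
let elements: [Int32] = withUnsafeBytes(of: triple) { raw in
    (0..<3).map { index in
        raw.load(fromByteOffset: index * MemoryLayout<Int32>.stride, as: Int32.self)
    }
}

print(tupleSize == arrayLikeSize, elements)
```

If tuples and the new type shared one representation, code like this would be relying on that layout on every platform, not only on Darwin.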

Would the following then be allowed? If not, I wouldn't like to have to ask "Do you mean tuple or Tuple?" when discussing implementations.

func foo(x: (Int, Int)) {}
let bar = Tuple<2, Int>(1, 2)
foo(x: bar)

Yes, that is indeed the end state I am putting forward.

4 Likes

That's not the worst thing in the world, I guess. However, I think what should have been proposed first is Tuple<repeat T>, with Tuple<2, Int> being just sugar for the homogeneous case, and we would somehow still have the same APIs as currently proposed.

1 Like

Yes, practicalities like that might make my proposal infeasible even if the LSG agrees with it in theory. At which point I’d fall back to my original suggestion of InlineBuffer.

(To further justify my original suggestion: I like having the word Inline in the type to emphasize that an InlineBuffer<200, Int128> stored property is going to make your type fairly expensive to copy.)

2 Likes

Given the overly-verbose HomogeneousTuple, I'd like to submit Huple as an alternative.

          ┌──────────────┐
          │    Huple     │
          └──────────────┘
                  │
        ┌─────────┴─────────┐
        ▼                   ▼
┌──────────────┐    ┌──────────────┐
│    Tuple     │    │ InlineArray  │
└──────────────┘    └──────────────┘
                            │
                            ▼
                    ┌───────────────┐
                    │     Array     │
                    └───────────────┘
9 Likes

Allowing subscripting via .0 etc doesn’t seem like it would be impossible for this type (at least in non-generic contexts where the count parameter is a static value)?

The future direction of ExpressibleByVectorLiteral suggests it might be possible to use parameter packs - which have a link to tuples, so maybe using a tuple literal syntax for this type isn’t actually a bad idea?

Tuples are homogeneous traditionally... (hmm, are they?) If we want it badly we could always refer to the current swift tuples as "heterogeneous tuples" freeing up the word "tuple" to mean homogeneous by default.

Yeah, you could imagine a world with an opaque byte type that would solve that specific example. It does generalize, though; I would say that most fixed-size array types in C are either implementation details of abstractions that you typically use opaquely (like a UUID) or act as buffers rather than being something you use as a whole. This is reinforced by the language in a way: C does not allow you to directly use fixed-size arrays as aggregate values because of array-to-pointer decay, so if you’re trying to do something UUID-ish, you must wrap it in a struct, in a rare example of C accidentally encouraging fine-grained typing.

3 Likes

This, to me, is the clincher that this type should not be called Vector. This type, while useful, doesn't feel like one that would be exposed as a part of public API for a package; you would want to wrap it in a package-specific type to better convey what your intention would be.

For example:

  • the names of Mach-O segments and sections are Vector<16, UInt8>
  • a UUID is a Vector<16, UInt8>
  • vImage uses Vector<16, UInt8> for certain kinds of pixel conversion functions
  • an FSCatalogInfo exposes Finder info values via Vector<16, UInt8>
  • some networking frameworks expose IPv6 source and destination addresses as Vector<16, UInt8>

For none of these would you want to actually expose the values directly as a Vector<16, UInt8>; you would want to wrap them into MachO.Segment.Name structs or UUID or PixelMap or FinderInfo or IPv6Address structs that are backed by a Vector<16, UInt8>, so you could expose type-specific functionality for the domain.
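A minimal sketch of that wrapping pattern, under stated assumptions: the 16-element tuple stands in for the proposed Vector<16, UInt8> so the example compiles without the new type, and `IPv6Address`, `bytes`, and `isLoopback` are hypothetical names, not an existing API.

```swift
// Hypothetical domain wrapper. The 16-element tuple is only a stand-in
// for the proposed Vector<16, UInt8> storage.
struct IPv6Address {
    typealias Storage = (UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8,
                         UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8)

    // The raw fixed-size storage stays an implementation detail.
    private var storage: Storage = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

    // Fails unless exactly 16 bytes are supplied.
    init?(bytes: [UInt8]) {
        guard bytes.count == 16 else { return nil }
        withUnsafeMutableBytes(of: &storage) { $0.copyBytes(from: bytes) }
    }

    // Copies the stored bytes out as an ordinary array.
    var bytes: [UInt8] {
        withUnsafeBytes(of: storage) { Array($0) }
    }

    // Domain-specific functionality: ::1 is the IPv6 loopback address.
    var isLoopback: Bool {
        let all = bytes
        return all.dropLast().allSatisfy { $0 == 0 } && all.last == 1
    }
}

let loopback = IPv6Address(bytes: [UInt8](repeating: 0, count: 15) + [1])
```

Clients of such a type see `isLoopback` and similar domain operations and never have to name the underlying fixed-size storage.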

I have a hard time imagining a scenario where a Vector<n, Type> would be the right thing to use as a bare public interface, as opposed to an implementation detail (that perhaps might be exposed as a property for use-case-specific situations, like as a rawValue).

Because of that, and because of past history where we have regretted name-squatting things (CountedSet, OrderedSet, CharacterSet, etc immediately spring to mind), I think we cannot use the name "Vector" to name this type. It's already been shown many times in this thread that there are many valid use cases for the name "Vector" that are domain-specific that this type cannot encompass in its public API.

ConstantSizeArray, FixedCountStorage, NonResizingArray, etc all seem to be far superior names in order to provide a foundational type for package developers to rely on to build their own public interfaces.

11 Likes

Why use an "xArray" name like two of the options above, if every use case you mention is an "InlineBuffer"? Also, echoing @lorentey, this type is not an array.

2 Likes

Like I've said before, I'm unswayed by the argument that the core proposition of an "Array" is the ability to make it a different size. A const char[16] is an array, even though it cannot be resized. Every other operation we use with arrays we would use with this one: maintaining order of elements, reading elements in O(1), replacing elements in O(1), etc.
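To make the list of operations concrete, here is a small sketch of a fixed-count container that keeps element order and supports constant-time reads and replacements, with no way to grow or shrink. `FourInts` is a hypothetical stand-in with the count hard-coded to four, not the proposed type.

```swift
// Hypothetical fixed-count container: ordered, O(1) read and replace,
// and no API that changes the number of elements.
struct FourInts {
    private var storage: (Int, Int, Int, Int)

    init(_ a: Int, _ b: Int, _ c: Int, _ d: Int) {
        storage = (a, b, c, d)
    }

    var count: Int { 4 }

    subscript(index: Int) -> Int {
        get {
            switch index {
            case 0: return storage.0
            case 1: return storage.1
            case 2: return storage.2
            case 3: return storage.3
            default: preconditionFailure("index out of range")
            }
        }
        set {
            switch index {
            case 0: storage.0 = newValue
            case 1: storage.1 = newValue
            case 2: storage.2 = newValue
            case 3: storage.3 = newValue
            default: preconditionFailure("index out of range")
            }
        }
    }
}

var values = FourInts(1, 2, 3, 4)
values[2] = 30  // replace in place; the count never changes
```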

I don't know if that matches up with the "mathematical" definition of Array or whatever, but we're not computers who only speak math. We're building an API to be used by people, the vast majority of whom do not have advanced degrees in nor a deep understanding of mathematics.

Regardless, "Buffer", "Storage", etc are acceptable suffixes to use for this as well IMO; anything except "Vector".

9 Likes

The implied premise here, I would take it, is that only types named in public APIs need to have "nice" names. I would disagree with that premise.

As you show here, the proposed type has many implementation-level uses, and even if (arguendo) it never appears as part of a public API, having an easily spelled and easily pronounced name would vastly improve the developer experience.

IMO, half of the pain when working with UnsafeMutableRawBufferPointer APIs comes from constantly naming the type or its relatives, which, given how it's unsafe, can be stomached somewhat. This, however, is supposed to be a safe, performant alternative (for a subset of use cases), and I would not want it to be pessimized the same way.

If there are many domain-specific (and hence mutually incompatible) uses that can validly lay claim to the name "Vector," then I would say that for clarity none of them should be using the unadorned name: they ought all to be UIVector or MLVector or some such; it's by no means absurd that someone would want to write a machine learning app with a user interface, which also happens to interop with C/C++/Rust code and does some SIMD computing.

Indeed, I would then argue that having the standard library lay claim to bare "Vector" so that users are discouraged from declaring mutually incompatible Vectors would be a plus and not a minus.

19 Likes

The concern isn’t about different helper libraries trying to define their own currency type like String. The problem is that using the name Vector for a currency type clashes with one of the core types that a graphics or geometry library exists to define. And it just so happens that many programs which would use such a library would also like to be direct clients of Swift.Vector.

I do agree with this. Swift feels like it is actively questioning me every time I want to declare a pointer, which seems undesirable for a systems programming language. Whatever name is chosen, it should be succinct.

4 Likes

I have another name to consider: Fix. Why I like it:

  • It (hopefully) is easy to reconcile the name with a type having a static size and requiring all slots to have valid elements.
  • I find it appealing that the name makes some sense just from the “fasten” meaning of “fix”, as in “five ints fixed (fastened) together” for Fix<5, Int>.
  • It doesn’t have an existing meaning as a data structure in programming languages, as far as I know[1]
  • It doesn’t imply any functionality other than holding its contents

[1]: There are functions named fix existing in some programming languages, but my brief search turned up multiple seemingly unrelated functions with the same name, suggesting no single accepted meaning even as a function.

Just to give another reason to support the awesome name Vector:

In Qt 6.0, QVector is an alias for QList, so the meaning of Vector will not conflict with any Qt datatype once Qt 6.0 gets adopted in the coming years.

https://www.qt.io/blog/qlist-changes-in-qt-6

Just wanted to suggest possible names "Invarray" and "Procrustean".

1 Like

Given this backdrop, are there any blockers in pursuing @dimi's ideas in some other way? I think I like it, or at least I'd like to explore that idea further.

Could we envision some future where however we end up spelling it, for all practical intents and purposes the type-at-hand and tuples become more or less synonymous?

Could we e.g. some time in the future say that Vector<3, Int> is (Int, Int, Int)?

Or make the distinction unimportant by adding stuff like ExpressibleByTupleLiteral to the type, toll-free bridging, adding new tuple literal shorthands such as (3 * Int), making tuples extensible, and so on?

1 Like

I think @John_McCall's point is that it has already been litigated and the ship sailed for that discussion with the first review. It is very hard to bring anything to a complete review if we reopen discussions and reset the overall approach at late(r) stages. (Not commenting on the merits of that idea specifically.)

1 Like

I understood that, and that's why I'm asking about other paths to a similar future that do not involve altering the semantics of the type-at-hand.

1 Like