[Second review] SE-0453: Vector, a fixed size array

xwu · December 6, 2024, 8:24pm

Hmm, that we have CGVector is (to me) a nice example where we're already using the term in Swift APIs to refer to a type of fixed count.

In that case, as a Core Graphics vector (since the library is explicitly oriented towards 2D rendering), it makes sense that it's a vector of two values, and that the values are of floating-point type.

I think it fits very nicely that a generic Vector type in the standard library would generalize both of those constraints.

benrimmington · December 6, 2024, 8:50pm

Should the with argument label on the following initializer be removed or renamed? The alternatives would be init(next:) or init(_:).

extension Vector where Element: ~Copyable {
  public init<E: Error>(with next: (Index) throws(E) -> Element) throws(E)
}
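To make the labeling question concrete, here is a minimal sketch of how such an initializer reads at the call site. Triple is a hypothetical three-element stand-in, not the proposed Vector, and it uses Int where the proposal uses Index. The typed-throws syntax assumes a Swift 6.0 or later toolchain.

```swift
// Hypothetical stand-in for a fixed-size container of three elements.
struct Triple<Element> {
    var a: Element
    var b: Element
    var c: Element

    // Mirrors the shape of the proposed initializer: the closure is called
    // once per index, and any error it throws propagates with its type.
    init<E: Error>(with next: (Int) throws(E) -> Element) throws(E) {
        a = try next(0)
        b = try next(1)
        c = try next(2)
    }
}

// With the label spelled out at the call site:
let labeled = Triple<Int>(with: { index in index * 10 })

// With trailing-closure syntax the label disappears entirely:
let trailing = Triple<Int> { index in index * 10 }
```

Because most callers would likely use the trailing-closure form, the label mainly affects documentation and the rarer parenthesized call.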

Should the additional Span subscripts also be proposed for Vector?

extension Vector where Element: ~Copyable {
  public subscript(_ position: Index) -> Element { _read _modify }
  public subscript(unchecked position: Index) -> Element { _read _modify }
}

extension Vector where Element: BitwiseCopyable {
  public subscript(_ position: Index) -> Element { get set }
  public subscript(unchecked position: Index) -> Element { get set }
}

Should the additional Vector APIs also be proposed for Span?

extension Span where Element: ~Copyable {
  public typealias Element = Element

  public var startIndex: Index { get }
  public var endIndex: Index { get }

  public borrowing func index(after: Index) -> Index
  public borrowing func index(before: Index) -> Index
}

John_McCall · December 6, 2024, 8:56pm

It's notable, however, that CoreGraphics does exactly what I talked about in the pitch thread as a minimum for a well-designed graphics API: it provides distinct types for vectors (CGVector) and coordinates (CGPoint) rather than conflating them just because they both happen to be two-dimensional.

This sort of thing is exactly why I believe that the type described here is primarily going to be used as an internal implementation detail of other first-class abstractions rather than as something that will see much direct use.
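As a rough illustration of that pattern (a sketch only: Position2D and Offset2D are hypothetical names, and a tuple stands in for the fixed-size storage type under review), the domain types keep their storage private and expose only the operations that make sense for them:

```swift
// Hypothetical domain type for a displacement between two positions.
struct Offset2D {
    // Private fixed-size storage; a tuple stands in for the proposed type.
    private var storage: (Double, Double)

    init(dx: Double, dy: Double) { storage = (dx, dy) }

    var dx: Double { storage.0 }
    var dy: Double { storage.1 }
}

// Hypothetical domain type for a location.
struct Position2D {
    private var storage: (Double, Double)

    init(x: Double, y: Double) { storage = (x, y) }

    var x: Double { storage.0 }
    var y: Double { storage.1 }

    // Moving a position by an offset yields a position.
    static func + (p: Position2D, o: Offset2D) -> Position2D {
        Position2D(x: p.x + o.dx, y: p.y + o.dy)
    }

    // The difference of two positions is an offset, not a position.
    static func - (a: Position2D, b: Position2D) -> Offset2D {
        Offset2D(dx: a.x - b.x, dy: a.y - b.y)
    }
}

let start = Position2D(x: 1, y: 2)
let end = start + Offset2D(dx: 3, dy: 4)
let delta = end - start
// `start + end` does not type-check: two positions cannot be added.
```

Both types have the same two-element layout, but the type system keeps them apart, and the storage type's name never appears in the public interface.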

Alejandro · December 6, 2024, 9:17pm

Is there anything wrong with that…? Atomic and Mutex are also designed as implementation detail types that help build safer abstractions that you generally wouldn’t ever see vended from a public API.

John_McCall · December 6, 2024, 9:24pm

There's nothing wrong with it, but it certainly informs our discussion of the name. If this type name is going to be written exclusively as the type of a single private stored property in a handful of library types, it could be fifty characters long without making the slightest difference. We absolutely should not give it a name like Vector that would invite confusion with the actual currency types that programmers are meant to be using.

I'm frankly blanking on what other uses there are for this. It seems like most of the no-allocation use cases, like Embedded Swift, really need a FixedCapacityArray (or perhaps a RingBuffer), which we're all in agreement is a completely different type that cannot be implemented in terms of this one. And even then, my experience is that you really want to hardcode the capacity in as few places as possible — when you're passing around that data structure, you always want to use some Span-like type that makes the capacity dynamic.

clayellis · December 6, 2024, 9:44pm

Could we draw inspiration from the existing CollectionOfOne<Element> type and move the "size" suffix into the integer generic and name it CollectionOf<N, Element>?

"A coordinate is represented internally by a collection of two Ints."

xwu · December 6, 2024, 10:10pm

Since the type wouldn’t conform to Collection but rather its noncopyable successor, we wouldn’t be able to name this type until such time as we’ve settled the replacement protocol hierarchy with such an approach.

cbarrett · December 7, 2024, 3:31am

First off, thanks for writing this proposal. I'm excited to see it and I hope it's just the start of dependent types making it into Swift! I've been studying dependent types for a number of years now and I think they're (a big part of) the future. Writing down your specification in the language of types, which then all-but-forces you to pick an implementation, is a really satisfying workflow and the result is quite readable.

What is your evaluation of the proposal?

+0, as I have a few questions.

Are the only methods that will be present on this type the ones that are presented in this proposal, particularly the ones under "Generalized Sequence and Collection APIs"? I'm writing my review assuming the answer to this question is "yes".

What level of static computation is available? Can one write append for vectors of length N and M? How about repeat(k) which gives a vector of length k*N?

Is the problem being addressed significant enough to warrant a change to Swift?

Absolutely, yes. This is one of the most basic dependent types and if Swift is going to support complex and subtle static requirements such as lifetimes, having dependently typed containers will likely be crucial. Plus, there are a million other reasons to support dependent types, I won't go through them all here.

Does this proposal fit well with the feel and direction of Swift?

Given the lack of conformances, I think the answer is unfortunately no.

While it's possible to write most collection-manipulation code using the handful of methods provided, it's not going to be as pretty and certainly won't be idiomatic Swift. Porting code that manually loops over indices or re-implements library functions using swap, especially if your code has to compile against more than one version of Swift (for example in a third party library), will be pretty painful.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I've used a number of dependently typed languages, most notably Agda. I like that the name Vector is the same as you'll find in the literature on dependent types.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Read through the proposal.

If there is some need to put this type in place now in its nerfed state, I understand, but that hasn't been articulated well. I'm not against this type going in per se, but I think it will be received better and adopted faster if it's introduced in a more complete state.

If my assumption (at the top of this message) is wrong, and a more complete set of methods will be available on the type despite it not conforming, then I'm probably in the +1 camp. Likewise, if there's some other feature that needs this, I'm not against this going in.

scanon · December 7, 2024, 4:11am

Absolutely none, but this is really a question about the related "integer generic parameters" proposal (SE-0452). Some forms of what you call static computation are discussed as future directions in that proposal, but the initial feature (on which this proposal is based) does not support it at all, and there are some significant limitations on how type-checking would interact with such things in Swift (discussed in the proposal as well).

It is unfortunately not possible to write any of the protocols that this type and other "collections" supporting non-copyable elements would conform to in the language we have today. Writing those protocols requires development of the lifetime rules and of new accessors. Both of these are in the works; see, e.g. future directions on SE-0446 as well as the modify and read accessors pitch and the draft vision document linked from that thread, but these are subtle projects and will take some time and care to get right.

Writing those protocols also requires having some concrete types to use as building blocks and examples, and those include Span and Vector. They are also sorely needed for some of the more constrained environments that Swift targets, where Array is not available. For these reasons, Vector and Span bring considerable value, even with a "nerfed" API. So it makes sense to do what we can for some of these concrete types, then build those protocols in terms of them, and then turn our attention back to rounding out the types.

ksluder · December 7, 2024, 4:36am

I wish I had more to offer than just name bikeshedding, but I’ll just re-suggest Inline or perhaps InlineBuffer.

ibex · December 7, 2024, 6:25am

A collection in general does not impose any ordering, but a tuple does. So, maybe the name Tuple could be a better candidate.

cbarrett · December 7, 2024, 6:43am

I see, this is very helpful to know. Thank you! Definitely pushes me closer to the +1 camp.

I understand, that was clear in the proposal. Would it be possible at least to have a few more methods implemented, some of the most common ones that people would want? Or if implementing those is too complicated, due to language restrictions, maybe that could be documented somewhere, in this proposal or in the type itself. I'm sure there will be people eager to experiment with these types, and first impressions matter.

Thanks for the pointer, looking forward to reading that.

hassila · December 7, 2024, 12:14pm

We could still draw inspiration, e.g. StorageOf<N, T>

I think @John_McCall has a good argument why this name should not be the attractive type to reach for as a currency type in general.

I also like Inline as mentioned earlier.

scanon · December 7, 2024, 3:09pm

It's not really that implementing them is too complicated; it's that any bad implementation/naming choices we make will constrain what we can do in designing the protocols.¹ There's a chicken-and-egg problem here, and the viable options are basically:

1. Wait to land any of it until it's entirely finished. No one ever has to deal with a "nerfed" type, but this also means no one gets any pieces of these things for a long time yet, even the pieces that would be useful on their own. It also means that the folks implementing the features don't get as much incremental feedback on their design, so they're somewhat more likely to wander down blind alleys.
2. Iteratively land what pieces we can. Chip away at the concrete types, then the language features, then the protocols, and repeat. Gradually unlock various use cases and get incremental feedback.

We're trying to pursue option 2 as far as we can, because these types are genuinely useful even without the protocols and the rest of the API surface. At some point we may get stuck and have to do a big chunk of stuff all at once, but so far chipping away at things is working OK.

¹ It's tempting to say that we can "obviously" make sense of the most straightforward API on these; surely we can define reduce on Vector! But no, not really: in our ~Copyable world, we will want to have both a consuming and a borrowing reduce, and the protocols need to let you specify which one you're talking about. It might make the most sense for one of them to be the default with the blessed familiar name. It might make sense to have a view that selects the one you mean. These are the sorts of questions that need to be considered more holistically in the context of protocol designs. Various folks have already spent a lot of time working on these details, but we cannot actually test them out fully and move proposals forward until some of the necessary language features are available.
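To illustrate the distinction in the footnote, here is a minimal sketch using ownership features that already exist (Swift 5.9 or later). Ticket and TicketPair are hypothetical types, and a plain sum stands in for a general reduce; this is not a proposed API.

```swift
// Hypothetical noncopyable element type.
struct Ticket: ~Copyable {
    let value: Int
}

// Hypothetical fixed-size container of two noncopyable elements.
struct TicketPair: ~Copyable {
    let first: Ticket
    let second: Ticket

    // Borrowing flavor: only reads the elements, so the container
    // remains usable after the call.
    borrowing func borrowingTotal() -> Int {
        first.value + second.value
    }

    // Consuming flavor: takes ownership of the container, which is
    // destroyed when the call returns.
    consuming func consumingTotal() -> Int {
        first.value + second.value
    }
}

func demo() -> (Int, Int) {
    let pair = TicketPair(first: Ticket(value: 1), second: Ticket(value: 2))
    let borrowed = pair.borrowingTotal()   // pair is still valid here
    let consumed = pair.consumingTotal()   // pair is consumed here
    // Calling pair.borrowingTotal() again at this point would be a compile error.
    return (borrowed, consumed)
}
```

Both methods compute the same result; they differ only in what the caller may do with the container afterward, which is the choice a future protocol would need to let callers express.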

technogen · December 7, 2024, 7:25pm

I completely agree, which is why I don't think that having a more verbose, but clearer name would be inappropriate. To further this point: I don't think it's reasonable to say that UnsafeRawMutableBufferPointer is too verbose.

xwu · December 7, 2024, 8:10pm

It most certainly is too verbose—as a demonstration of which, you've got it wrong: it's UnsafeMutableRawBufferPointer.

technogen · December 7, 2024, 8:30pm

Oh, I got it wrong! This is a perfect example of what @John_McCall was talking about (as I understand it; please correct me if I'm wrong): it doesn't matter how long the name is or how easy it is to misspell, because you only ever see these low-level types in implementation details that are hidden behind domain types that provide a curated interface. In a reasonable code base full of domain logic, you're far more likely to make heavy use of some sort of RawRepresentable<String> than of a bare String, making the underlying String interface far less important than what is built on top of it.

tera · December 7, 2024, 9:18pm

I like this one. Or a shorter FixedArray.

I'd reserve "Vector" for graphics
It's just a name. Even if it were called, say, BooGaDa (exaggerating!), after a little while there would be a strong connection between the name and the thing.

James_Dempsey · December 8, 2024, 4:35pm

I don't believe there is anything wrong with it, but I think it is important to note that both Atomic and Mutex require importing the Synchronization module, and so developers need to explicitly opt in to using those types.

Would requiring an explicit import, as done with Synchronization, remove this concern since a developer could not easily use the wrong type, find it via autocompletion, etc.?

And, although I realize other types named Vector take precedence over Swift.Vector, requiring a specific import for Swift.Vector should greatly reduce the number of naming collisions where the compiler needs to disambiguate which Vector is intended. The remaining cases should largely be source files where both Vector types are in use and one would need to be fully qualified - and for readability, both would probably be fully qualified.
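The shadowing and qualification behavior can be seen today with any standard library name. This sketch uses Character as a stand-in, since Swift.Vector is only proposed; the local Character type is hypothetical.

```swift
// A locally declared type shadows the standard library type of the same name.
struct Character {
    let name: String
}

// Unqualified, the name resolves to the local declaration.
let hero = Character(name: "Ada")

// The standard library type is still reachable when fully qualified.
let letter: Swift.Character = "a"
```

A file that uses both types needs the Swift. prefix for at least one of them, which is the situation described above for source files mixing two Vector types.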

James_Dempsey · December 8, 2024, 5:00pm

The term 'code block' is also used as part of the grammar of the language in The Swift Programming Language and used in various places throughout the text:

Code Blocks

A code block is used by a variety of declarations and control structures to group statements together. It has the following form:

{
   <#statements#>
}

The statements inside a code block include declarations, expressions, and other kinds of statements and are executed in order of their appearance in source code.

Grammar of a code block

code-block → { statements ? }

From TSPL on swift.org.