SE-0229 — SIMD Vectors

There is potential confusion between = and == too. Plus, they don’t even return the same type.

Also, I don’t think postfix . is allowed.

The fact that there's a problem with = and == is not relevant. That's not changing; the question is whether to add a new problem.

I'm not sure whether this potential confusion matters, but I am pointing it out so the community can consider it. I don't yet have an opinion.

As for the postfix dot idea not working, you are correct.

Postfix . is a special case in the grammar for operators: Operators can include a dot, but only if they begin with a dot.

From TSPL > Operators:

You can also define custom operators that begin with a dot ( . ). These operators can contain additional dots. For example, .+. is treated as a single operator. If an operator doesn’t begin with a dot, it can’t contain a dot elsewhere. For example, +.+ is treated as the + operator followed by the .+ operator.

dot-operator-head → .

dot-operator-character → . | operator-character

dot-operator-characters → dot-operator-character dot-operator-characters opt
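
As a small sketch of the rule quoted above: an operator that begins with a dot may contain further dots. The .+. operator and its element-wise meaning on arrays are made up for illustration and are not part of the proposal.

```swift
// Hypothetical operator for illustration. It begins with a dot,
// so the grammar allows it to contain another dot.
infix operator .+. : AdditionPrecedence

/// Adds two arrays element by element, stopping at the shorter one.
func .+. (lhs: [Int], rhs: [Int]) -> [Int] {
    return zip(lhs, rhs).map { pair in pair.0 + pair.1 }
}

let sums = [1, 2, 3] .+. [10, 20, 30]
print(sums)  // prints "[11, 22, 33]"
```

By the same rule, a spelling such as +.+ would not declare one operator; it is read as + followed by .+.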

My notes say we added compiler support for this special case in commit 0bfacde2420937bf.

I mentioned this in the discussion, but not the review thread. I believe it is possible with the right protocol structure to have the Vector4<T> syntax be extensible to non-SIMD types (without hardware acceleration for non-SIMD types, of course).

It basically requires having separate protocols for the storage part and @beccadax's "vectorizable" types.

A couple of people had mentioned being bothered by only being able to stick certain values in the generic slot, and this would remove that restriction (as long as the type can support the vector operations in code).

I don’t think a generic parameter carries any implication that it can accept any arbitrary type (e.g. Vector4<String>). It is quite common for generic type parameters to have constraints.

Wrappers like CGFloat are a little problematic and would probably require a level of indirection to access the underlying, vectorisable value. But I agree - with the right design, I think it’s possible.

It would be a bit easier if we had “sealed” protocols in the language. It’s badly needed anyway.

Review Update

The core team met and discussed the feedback received so far, and has made the following recommendations:

Intention to accept

The core team feels this is an important addition to the language that will open up SIMD programming to a wide audience in an approachable way.

The core team also made the following decisions for when the proposal is accepted:

  • There were many requests for additional math operations. These are certainly important, but can be left to later proposals once this foundational proposal has landed.
  • The initializers for VectorN from an Array should instead be generic from any Sequence.
  • The . prefix should be used on all mask-producing operators:
    • In case of .==, this helps disambiguate between the Bool and Mask-returning forms
    • In case of .& and .| this helps resolve the precedence problem of & and the inconsistency of a non-short-circuiting &&
    • While it doesn't have similar motivations, for consistency .< & co should also have a leading dot.
  • Pointwise arithmetic operations will not have a leading dot.
  • To avoid confusion with Collection semantics and future types like matrices, count should be renamed elementCount.
  • The element properties x, y, z and w will be available on vectors of up to 4, along with the common named swizzles even, odd, high and low.
    • The general swizzle operation init(gathering:at:) feels unsatisfactory, and so should be deferred for a later proposal after more thought.
    • Brute-forcing all possible swizzles on the smaller vectors as properties was also ruled out.
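
To illustrate the operator decisions in the list above, here is a sketch using the SIMD4 type as it exists in today's standard library. Those released spellings (SIMD4<T>, lowHalf, evenHalf) differ from the names under discussion in this review (Vector4<T>, low, even); the values are arbitrary.

```swift
let a = SIMD4<Float>(1, 2, 3, 4)
let b = SIMD4<Float>(1, 5, 3, 0)

// Dot-prefixed comparisons produce a mask with one Bool per lane.
let equalMask = a .== b          // lanes: true, false, true, false
let lessMask = a .< b            // lanes: false, true, false, false

// Dot-prefixed logical operators combine masks lane by lane.
let eitherMask = equalMask .| lessMask
let bothMask = equalMask .& lessMask

// The undotted == is ordinary Equatable and returns a single Bool.
let identical: Bool = a == b

// Pointwise arithmetic has no leading dot.
let sum = a + b

// Named elements and the common named swizzles.
let first = a.x
let low = a.lowHalf              // first two lanes
let even = a.evenHalf            // lanes 0 and 2

print(eitherMask[3], identical, sum, first, low, even)
```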

Prototype of "generic"-style vectors

The majority of the feedback received during the thread was regarding the alternate "generic" spelling: Vector3<Int8> rather than Int8.Vector3.

It is still unclear which is the better form. However, in order to better make the decision, the core team has asked the proposal author to implement a prototype showing the alternate form. Reviewers will then be able to try out either form in order to help make the decision. (Since @scanon is on vacation this week, @moiseev is kindly helping out with this prototype.)

In addition, the proposal should be revised to spell out more explicitly some of the details, for example, of what masks are and the role they play.

We will hold the review open pending that prototype, and I'll post again when it's available.

Review Update 2

Hi – just to let everyone know: the updated proposal should be ready early next week.

In the meantime, the core team discussed some of the remaining questions. In order to land the ABI-relevant parts of the change soon, the team decided to defer the any and all parts of the proposal to a separate review, similar to the generalized swizzling options, as these do not affect the ABI and so can be introduced additively later.

Review Update 3 – Resuming following changes

Hi everyone – thanks for your patience. The proposal and implementation have been updated, and the review will now resume through Friday, November 9th.

An updated copy of the proposal can now be found here, and diffs from previous are here.

To summarize how the proposal has changed:

  • The primary working types are now spelled like Vector3<T> instead of the earlier T.Vector3.
  • Initializers from any Sequence with the right element type are now provided.
  • All mask operations are .-prefixed.
  • count has been renamed elementCount.
  • The general swizzle / shuffle / permute operation init(gathering: at:) has been removed. We intend to restore it in a later proposal with a better name.
  • Users can make VectorN<T> available for arbitrary types T by conforming T to a new SIMDVectorizable protocol, which has very basic requirements.
  • The any, all, min, max, and clamp free functions have been removed. We intend to re-introduce this functionality (possibly with different bindings) in a follow-on proposal.
  • The IntegerVector and FloatingPointVector protocols have been removed and replaced with conditional conformances.
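
As a sketch of the Sequence-based initializers mentioned in the summary above, the following uses the released standard library, where the type is spelled SIMD4<T> and the count property ended up as scalarCount rather than the names in this revision of the proposal. The sequence must supply exactly as many elements as the vector has lanes.

```swift
// Any Sequence with the right element type and length will do.
let values: [Float] = [1, 2, 3, 4]
let fromArray = SIMD4<Float>(values)

let range: ClosedRange<Int32> = 1...4
let fromRange = SIMD4<Int32>(range)

// stride(from:to:by:) yields 0.0, 0.5, 1.0, 1.5 here.
let fromStride = SIMD4<Double>(stride(from: 0.0, to: 2.0, by: 0.5))

print(fromArray, fromRange, fromStride, fromArray.scalarCount)
```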

The text of the proposal now goes into detail about how the new spelling works and how users can add support for new types.

Small bikeshed: I wonder if we can rename SIMDVectorizable to simply Vectorizable. It might help with discovery and draws some parallels with the Vector name rather than SIMDVector.

This is looking really good, but I can’t see the definition of the SIMDMaskVector protocol, just its extension? Sorry if I’m being dumb!

This link should take you to the declaration: [DNM] SIMD, take 2 by stephentyrone · Pull Request #20344 · apple/swift · GitHub

(But it looks like most of the implementation is in extensions.)

Thanks @krilnon!

SIMDVectorizable has underscored requirements without defaults, so this doesn't really seem to be the case. Which makes me sad, because I think Vector4<CGFloat> (for example) might make sense for some users.

public protocol SIMDVectorStorage {
  /// The number of elements in the vector.
  var elementCount: Int { get }
  // ... (remaining requirements omitted from this excerpt)
}

What's the motivation behind making this an instance property rather than a type property?
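
To make the question concrete, here is a sketch with two hypothetical protocols (InstanceCounted and TypeCounted are invented names, not the proposal's). It only shows what each choice lets generic code do; it does not claim to capture the proposal author's reasoning.

```swift
// Hypothetical protocols for illustration only.
protocol InstanceCounted {
    var elementCount: Int { get }          // needs a value to ask
}

protocol TypeCounted {
    static var elementCount: Int { get }   // can be asked of the type
}

struct PairA: InstanceCounted {
    var elementCount: Int { return 2 }
}

struct PairB: TypeCounted {
    static var elementCount: Int { return 2 }
}

// With an instance property, generic code must be handed a value.
func laneCount<V: InstanceCounted>(of value: V) -> Int {
    return value.elementCount
}

// With a type property, the metatype alone is enough.
func laneCount<V: TypeCounted>(ofType type: V.Type) -> Int {
    return V.elementCount
}

print(laneCount(of: PairA()), laneCount(ofType: PairB.self))
```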

Dave has expressed mild concern with using the names Vector2<T>, etc., instead of SIMDVector2<T>. I don't think that this is a significant concern because the new Vector2 implemented here should be suitably general to function as "the two-dimensional vector" of almost any type in almost any setting. It has more operations than some use cases require, but they will usually not interfere with the desired operation, and it's useful to have a common currency type available for small vectors.

Is that really true? The proposal requires elements of the vector types to be at least Hashable (which sounds like it makes sense) and SIMDVectorizable (which requires a bunch of scary underscored typealiases).

Beyond that, if these vectors should take the role of general vectors, where is Vector5 or Vector9001? You see where I'm going.

These vector types seem to be modeled very closely after simd, and I think either the name should make that clear or (preferably, I think) all this stuff should go into a SIMD module, rather than the standard library (which would also allow us to drop the prefixes on some more of the types).

There's nothing stopping you implementing these underscored requirements. So it really depends on what an underscored requirement means – and what it means is really what we chose it to mean.

To date, underscores on things in the standard library mean one of a few things:

  1. This should be an internal implementation detail, but we lack language features to make it so, so it's public but underscored instead.
  2. This is subject to change, full rights reserved to break/change it in the future. Use at your own peril.
  3. This is something you want to be able to implement on the protocol, but shouldn't be exposed to users of a type.

There's a lot less call for 1 these days, and 2 becomes vanishingly useful once we declare ABI stability.

That leaves 3 and we don't have a good way of spelling it. Another example would be Sequence._customContainsEquatableElement(_:)->Bool?, which a sequence implementation that has a better way of implementing contains can implement. It's a hook that the constrained extension on Sequence where Element: Equatable that supplies contains can call, giving you dynamic dispatch even though not all sequences contain equatable elements. We want to expose that to implementors of types that conform to Sequence, but a user should never see it on the concrete type when they're using it. So we underscore it. But you can still override it if you want to. It would maybe be better to do it in some more official way (say, with an @implementationDetail attribute or something).
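
The hook pattern described above can be sketched in miniature. Container, _customContains, EvenNumbers and Plain are hypothetical stand-ins chosen for illustration; the real hook lives on Sequence and is more involved.

```swift
// Hypothetical protocol mirroring the shape of the Sequence hook.
protocol Container {
    associatedtype Item
    var items: [Item] { get }

    /// Customization point: return nil to mean "no fast path here".
    func _customContains(_ item: Item) -> Bool?
}

extension Container {
    // Default implementation, so most conformers never see the hook.
    func _customContains(_ item: Item) -> Bool? { return nil }
}

extension Container where Item: Equatable {
    /// The user-facing API. It consults the hook first, which is
    /// dynamically dispatched because it is a protocol requirement.
    func contains(_ item: Item) -> Bool {
        if let answer = _customContains(item) { return answer }
        return items.contains(item)
    }
}

/// Conceptually contains every even Int, so it answers without searching.
struct EvenNumbers: Container {
    var items: [Int] { return [] }
    func _customContains(_ item: Int) -> Bool? { return item % 2 == 0 }
}

/// Relies on the default hook and falls back to a linear search.
struct Plain: Container {
    let items: [String]
}

print(EvenNumbers().contains(4), Plain(items: ["a"]).contains("b"))
```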

The difference, like you note, is that the current things like this on Sequence have default implementations, whereas these don't. I don't know if that's a particularly critical dividing line. Ultimately you are going to need a fairly high degree of sophistication to make a type vectorizable. Having to know about these underscored customization points doesn't seem that big a deal. You will still get told about them if you try to conform to the type. Really all they do is hide stuff from autocomplete, which is what we want. It would be confusing for users of these types to see these things appear both as a nested type on T and as a top-level Vector3<T>.

The meaning of a leading underscore isn't solely decided by the Swift project; it's also part of the context of platform headers and Apple SDKs. For an external developer, underscores mean #2, even if in practice some sops to compatibility are made. We really, really shouldn't muddy that story, which is already pretty subtle.

If "this is something you want to be able to implement on the protocol, but shouldn't be exposed to users of a type" is important, then we should come up with some better way to model this.* But it shouldn't be an underscore that developers, any developers, are expected to type.

EDIT: can't believe I forgot underscored language features, like @_specialize, which will change.

* I'll note that this is very similar to the justification for protected, and so I'll include my usual pushback: if you limit these to conformers in a strict way, then it's harder to write helper functions.

Ultimately I don't think I buy the argument that these don't belong as nested types. We have two ways to spell plenty of things, like String.SubSequence and Substring. That's just how associated types work in this language.

If we're actually worried about saturating documentation or code completion, especially for first-time users, well, I think we should address that directly (and separately). Swift style (largely inherited from Objective-C) is to add functionality using members, and that's one of the downsides. This proposal shouldn't go out of its way to work around that at the cost of violating other existing conventions.

It's important to be clear that these are not another way to spell the same thing; they're a thing that users of Vector4<T>, as opposed to people trying to write their own vector types, should never need to see or think about. T._Vector4 is the storage type, with essentially no operations defined on it; Vector4<T> is the thing that provides all of the arithmetic operations for you.
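
A rough sketch of that split, with hypothetical names throughout (VectorStorage, Vectorizable, DoublePair and this two-lane Vector2 are simplified stand-ins, not the proposal's declarations): the nested storage type only holds elements, while the generic wrapper is where the operations live.

```swift
// Hypothetical storage protocol: element access and nothing else.
protocol VectorStorage {
    associatedtype Element
    init()
    subscript(index: Int) -> Element { get set }
}

// Hypothetical element protocol: each element type names its storage.
protocol Vectorizable {
    associatedtype _Vector2Storage: VectorStorage
        where _Vector2Storage.Element == Self
}

// The user-facing type wraps the storage.
struct Vector2<T: Vectorizable> {
    var _storage = T._Vector2Storage()

    init(_ x: T, _ y: T) {
        _storage[0] = x
        _storage[1] = y
    }

    var x: T { return _storage[0] }
    var y: T { return _storage[1] }
}

// Arithmetic is provided on the wrapper, not on the storage.
extension Vector2 where T: AdditiveArithmetic {
    static func + (lhs: Vector2, rhs: Vector2) -> Vector2 {
        return Vector2(lhs.x + rhs.x, lhs.y + rhs.y)
    }
}

// A storage type for Double with no operations of its own.
struct DoublePair: VectorStorage {
    var a = 0.0
    var b = 0.0
    init() {}
    subscript(index: Int) -> Double {
        get { return index == 0 ? a : b }
        set { if index == 0 { a = newValue } else { b = newValue } }
    }
}

extension Double: Vectorizable {
    typealias _Vector2Storage = DoublePair
}

let v = Vector2(1.0, 2.0) + Vector2(0.5, 0.5)
print(v.x, v.y)  // prints "1.5 2.5"
```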

I would be OK with removing the leading underscore if we explicitly named this T.Vector4Storage or similar. Calling it T.Vector4, to my mind, pollutes the user-surfaced types with a thing that almost all users shouldn't ever need to touch.

We could also nest them further, inside a SIMDStorage enum namespace.