SE-0229 — SIMD Vectors

I'd say the best way to tackle that is to provide both useful conformances as separate "views" onto the underlying data: one as a flat list of scalars and one as a collection of rows/columns, even if we can't decide which (if any) should be the "default" conformance. This is exactly the kind of thing zero-cost abstractions are supposed to be useful for:

func someNestedCollectionAlgorithm<C: Collection>(_: C) 
  where C.Element: Collection, C.Element.Element == Int {
// ...
}

func someFlatCollectionAlgorithm<C: Collection>(_: C) 
  where C.Element == Int {
// ...
}

let matrix = Matrix2x2<Int>.identity

// View as a flat collection.
someFlatCollectionAlgorithm(matrix.scalars) // [1, 0, 0, 1]
// View as rows.
someNestedCollectionAlgorithm(matrix.rows) // [[1, 0], [0, 1]]
someFlatCollectionAlgorithm(matrix.rows[1]) // [0, 1]
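
To make the two-view idea concrete, here is a minimal, self-contained sketch. `Matrix2x2`, `flatSum` and `nestedSum` are hypothetical names invented for illustration and are not part of the proposal. The views are plain arrays and slices for brevity, so this shows only the shape of the API, not a zero-cost implementation.

```swift
// Hypothetical 2x2 matrix; not part of the proposal. Storage is row-major.
struct Matrix2x2<Scalar: Numeric> {
    var storage: [Scalar]

    static var identity: Matrix2x2 {
        Matrix2x2(storage: [1, 0, 0, 1])
    }

    // Flat view: every scalar in row-major order.
    var scalars: [Scalar] { storage }

    // Nested view: one slice of the storage per row.
    var rows: [ArraySlice<Scalar>] {
        [storage[0..<2], storage[2..<4]]
    }
}

// Generic over any flat collection of Ints.
func flatSum<C: Collection>(_ values: C) -> Int where C.Element == Int {
    values.reduce(0, +)
}

// Generic over any collection of collections of Ints.
func nestedSum<C: Collection>(_ rows: C) -> Int
    where C.Element: Collection, C.Element.Element == Int {
    rows.reduce(0) { $0 + flatSum($1) }
}

let identity = Matrix2x2<Int>.identity
let flatTotal = flatSum(identity.scalars)     // whole matrix, flat
let nestedTotal = nestedSum(identity.rows)    // whole matrix, by rows
let secondRowTotal = flatSum(identity.rows[1]) // a single row is itself flat
```

Both algorithms see the same four scalars; only the view handed to them differs.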

I also think such a collection view would be useful for VectorN<T>. It's fair enough that Vector is not a Collection by default (we don't want most of that functionality showing up and polluting the namespace with non-SIMD operations), but you should at least be able to pass a vector to some existing algorithm written generically over any Sequence of Ints.

This design doesn't even include a way to copy the vector contents out to an Array. So if you need to use such an algorithm, you need to subscript each element individually and append it to an Array. Yuck:

let val = SIMDVector4<Int>(/* ... */)

// I hope you don't need a Vector32<T>...
var arr = Array<Int>()
arr.append(val[0])
arr.append(val[1])
arr.append(val[2])
arr.append(val[3])
someFlatCollectionAlgorithm(arr)

Some kind of collection view would solve both of these issues:

let val = SIMDVector4<Int>(/* ... */)

// Pass directly to a generic, non-SIMD algorithm.
someFlatCollectionAlgorithm(val.scalars)

// Copy contents to an Array.
let arr = Array(val.scalars)

// New way to spell "elementCount".
assert(val.scalars.count == 4)
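
For what it's worth, a view like this can be sketched in a few lines. The sketch below assumes the `SIMD` protocol and `SIMD4` type as they later shipped in the Swift 5 standard library, rather than the names used in this review. `ScalarView` and `scalarView` are hypothetical names, not stdlib API.

```swift
// Hypothetical read-only view; `ScalarView` and `scalarView` are not stdlib names.
struct ScalarView<Vector: SIMD>: RandomAccessCollection {
    let base: Vector

    var startIndex: Int { 0 }
    var endIndex: Int { base.scalarCount }

    // Forward element access to the vector's own lane subscript.
    subscript(position: Int) -> Vector.Scalar { base[position] }
}

extension SIMD {
    // The vector itself stays a non-Collection; the view opts in on demand.
    var scalarView: ScalarView<Self> { ScalarView(base: self) }
}

let val = SIMD4<Int>(10, 20, 30, 40)

// Copy contents to an Array.
let arr = Array(val.scalarView)

// Pass directly to a generic, non-SIMD algorithm.
let total = val.scalarView.reduce(0, +)
```

Because the view is a separate type, none of the `Collection` API shows up on the vector itself.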

This mental model is indeed a brilliant one, and it has actually been implemented in hardware before; see Sony’s Allegrex VFPU coprocessor:

4.5 COP2 (VFPU)

The PSP's VFPU (Vector Floating Point Unit) is a coprocessor that can perform quite a few useful operations. Its main purpose is vector and matrix processing, but it also supports trigonometric functions and other mathematical operations, conversions, and mathematical constants.

4.5.1 Registers

The VFPU has 128 single precision floating point (IEEE 754) registers (VFR0-VFR127), but they are arranged and accessed in various ways that make it very flexible. Many of the instructions for the VFPU support operations on:

  • a single register
  • a pair of registers
  • three registers
  • four registers
  • 2x2 matrix
  • 3x3 matrix
  • 4x4 matrix

And if that weren't enough, it can work with matrices in normal or transposed orders. The registers are grouped into 8 blocks of 16 registers each. This gives you enough room to work with 8 4x4 matrices, 8 3x3 matrices, 32 2x2 matrices. Or you can store up to 32 quad vectors, 40 triple vectors, 64 paired vectors, or 128 single values.

http://hitmen.c02.at/files/yapspd/psp_doc/chap4.html#sec4.5

This is something we will absolutely add in a follow-on proposal. Unlike the base types and operations, it can be done purely additively anytime in the future (by contrast, we’d really prefer to land the types and core operations in Swift 5 because of their interaction with existing simd types in the SDK on Apple platforms). We’ve deliberately stripped most additive changes like that out to keep the (already large) proposal as focused as possible.


This new version is vastly improved from the first one! Everything seems much more consistent and understandable.

The only thing which gives me pause is all of the _underscoring in SIMDVectorizable. I'd much rather we call it something like Vector4Storage than _Vector4. Also, is there a reason why _storage is underscored on the Vector types? If it can't be private, can we make it private(set) and remove the underscore?

Perhaps we should add an @annotation that basically says: "This should not show up in autocomplete".

Just to make sure I understand, would the following be how we allow Vector4<CGFloat>?

extension CGFloat : SIMDVectorizable {
#if CGFLOAT_IS_DOUBLE
    typealias _MaskElement = Double._MaskElement
    typealias _Vector2 = Double._Vector2
    typealias _Vector4 = Double._Vector4
    typealias _Vector8 = Double._Vector8
    typealias _Vector16 = Double._Vector16
    typealias _Vector32 = Double._Vector32
    typealias _Vector64 = Double._Vector64
#else
    typealias _MaskElement = Float._MaskElement
    typealias _Vector2 = Float._Vector2
    typealias _Vector4 = Float._Vector4
    typealias _Vector8 = Float._Vector8
    typealias _Vector16 = Float._Vector16
    typealias _Vector32 = Float._Vector32
    typealias _Vector64 = Float._Vector64
#endif
}

EDIT: One more question. When numerical vectors land, will Vector4<T> conform to both SIMDVector and that new protocol? Or will that protocol be based on SIMDVector?


One quick question on this. Would adding things like these in the future require an update to SIMDVectorizable?

EDIT: If I wanted to make a Vector6 type, would I give it two storage vars, one of _Vector4 and one of _Vector2? (all hypothetical... I am just trying to understand the extension/growth model for the future)

+100. I would love to see this, and for it to be applied to (e.g.) the requirements of the ExpressibleBy protocols.

-Chris


+1000, I would have many use cases for this annotation. There are quite a few cases where I use a protocol for polymorphism but don't intend for its requirements to be used on the concrete type.

However, in my use cases, while it is desirable to hide the members on concrete conforming types, it would still be useful to have them appear in autocomplete on existentials or generic types constrained to the protocol.


Or when implementing the protocol. Or when writing a helper for implementing the protocol. Let's not design this annotation here; it deserves its own thread.


Agree, just trying to express support for it. :)

I'd like to know as well. Seems to me that whether math vectors are implemented with SIMD instructions ought to be up to compiler optimizations. My understanding is that a lot of the difficulty with "auto-vectorization" comes down to a lot of stuff that we could reasonably account for, given that we're discussing math ops only and not general-purpose stuff... If I'm wrong, then I retract the statement (but am still curious as to Jon's question).

What we would do is use the existing T._Vector8 as the storage type for Vector6<T> and simply ignore the last two lanes (this is exactly how Vector3<T> works, by the way). To add, say, Vector128<T>, we would extend SIMDVectorizable with a _Vector128 associatedtype that would default to a struct wrapping two _Vector64s.
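
A rough sketch of that padding approach, assuming the `SIMD8` type and `SIMDScalar` protocol as they later shipped in the Swift 5 standard library rather than the `_Vector8` name used in this review. `Vector6` is a hypothetical type written only for illustration.

```swift
// Hypothetical six-lane vector stored in an eight-lane SIMD8.
// Lanes 6 and 7 are padding and are never exposed.
struct Vector6<Scalar: SIMDScalar & FloatingPoint> {
    private var storage = SIMD8<Scalar>()

    static var scalarCount: Int { 6 }

    init(_ scalars: [Scalar]) {
        precondition(scalars.count == 6, "Vector6 needs exactly six scalars")
        for (lane, value) in scalars.enumerated() {
            storage[lane] = value
        }
    }

    private init(storage: SIMD8<Scalar>) {
        self.storage = storage
    }

    // Only the first six lanes are addressable.
    subscript(lane: Int) -> Scalar {
        precondition((0..<6).contains(lane), "lane out of range")
        return storage[lane]
    }

    // Arithmetic runs on all eight lanes; the padding lanes are simply ignored.
    static func + (lhs: Vector6, rhs: Vector6) -> Vector6 {
        Vector6(storage: lhs.storage + rhs.storage)
    }
}

let a = Vector6<Float>([1, 2, 3, 4, 5, 6])
let doubled = a + a
```

The padding lanes start at zero and take part in every operation, which is harmless for addition but would need care for operations such as division or reductions.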

I wonder if we should introduce a layer of indirection between SIMDVector and the SIMDVectorizable types - it doesn't need to be big, just a protocol with an associated type constrained to SIMDVectorizable.

That would allow CGFloat to participate.


CGFloat can already participate by conforming to SIMDVectorizable. I'm not sure what this additional protocol would add.

Might it be a good idea to consider allowing the user to specify the convenience labels appropriate to their domain? The elements could always be accessed by index as usual.

// Hypothetical syntax: a label string as a generic argument is not valid Swift today.
let v1 = Vector3<Double, "xyz"> // 3D Cartesian
// v1.x, v1.y, v1.z are valid accessors

let v2 = Vector3<Double, "rθϕ"> // spherical
// v2.r, v2.θ, v2.ϕ are valid accessors

let v3 = Vector3<Double, "xyw"> // 3D homogeneous representation of a 2D object

let v4 = Vector4<Double, "xyzt"> // Cartesian plus time

let v5 = Vector4<Double, "rgba"> // color

I'm not sure if I think this is a good idea or not, but it's not something we'd be able to implement in the Swift 5 timeframe.


Would this use @dynamicMemberLookup from SE-0195?

Edit: Should the proposed x, y, z, w properties and init(x:y:z:w:) be removed?


x, y, z, w is the 99.9% case; it's probably best to expose other labels via a wrapper if someone wants to use them.
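
A wrapper along those lines might look like the sketch below. It assumes `SIMD4` and its `x`/`y`/`z`/`w` accessors as they later shipped in the Swift 5 standard library. `RGBA` is a hypothetical type, not something the proposal provides.

```swift
// Hypothetical color wrapper that renames the x/y/z/w lanes to r/g/b/a.
struct RGBA {
    var storage: SIMD4<Double>

    init(r: Double, g: Double, b: Double, a: Double) {
        storage = SIMD4(r, g, b, a)
    }

    // Each label forwards to the corresponding lane of the underlying vector.
    var r: Double { get { storage.x } set { storage.x = newValue } }
    var g: Double { get { storage.y } set { storage.y = newValue } }
    var b: Double { get { storage.z } set { storage.z = newValue } }
    var a: Double { get { storage.w } set { storage.w = newValue } }
}

var color = RGBA(r: 1, g: 0.5, b: 0.25, a: 1)
color.g = 0.75
```

The vector stays reachable through `storage`, so SIMD operations remain available without the domain labels leaking into the vector type.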


We can use dynamicMemberLookup with some @semantics tricks to get ".xxyy" sorts of accessors to work in the future without huge overload sets, but that is a cleanly separable project for the future.
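
As a rough illustration of the lookup half of that idea, here is a sketch using only `@dynamicMemberLookup` from SE-0195, with no `@semantics` tricks. The member name is resolved at run time, so a bad name traps instead of failing to compile. `Swizzled` is a hypothetical wrapper, and `SIMD4` is assumed as it later shipped in the Swift 5 standard library.

```swift
// Hypothetical wrapper; swizzle names are parsed at run time in this sketch.
@dynamicMemberLookup
struct Swizzled {
    var base: SIMD4<Float>

    // Maps a four-letter name such as "xxyy" or "wzyx" to a reordered vector.
    subscript(dynamicMember name: String) -> SIMD4<Float> {
        let lanes = Array("xyzw")
        precondition(name.count == 4, "swizzle must name exactly four lanes")

        var result = SIMD4<Float>()
        for (position, letter) in name.enumerated() {
            guard let lane = lanes.firstIndex(of: letter) else {
                preconditionFailure("unknown lane '\(letter)'")
            }
            result[position] = base[lane]
        }
        return result
    }
}

let v = Swizzled(base: SIMD4<Float>(1, 2, 3, 4))
let repeated = v.xxyy
let reversed = v.wzyx
```

A single subscript covers every four-letter combination, which is what avoids the huge overload set; making it as cheap and as checked as a real property is the part that would need compiler support.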


CGFloat in particular can, because it is maintained by people who know about the hidden protocol requirements. What about my own wrappers? e.g.

struct Distance {
  var meters: Double
  var kilometers: Double { return meters/1000 } // etc
}

How would I make this type vectorisable without using hidden requirements? An extra level of indirection allows any user code to point to an underlying stdlib type which conforms.
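
To illustrate the kind of indirection I mean, here is a minimal sketch. It assumes the `SIMDScalar` protocol and `SIMD4` type as they later shipped in the Swift 5 standard library, in place of `SIMDVectorizable`. `VectorRepresentable`, `pack` and `unpack` are hypothetical names invented for this example.

```swift
// Hypothetical protocol: a type names a stdlib scalar that stands in for it.
protocol VectorRepresentable {
    associatedtype VectorScalar: SIMDScalar
    init(vectorScalar: VectorScalar)
    var vectorScalar: VectorScalar { get }
}

struct Distance: VectorRepresentable {
    var meters: Double
    var kilometers: Double { meters / 1000 }

    init(meters: Double) { self.meters = meters }

    // The wrapper points at Double and never touches the storage requirements.
    init(vectorScalar: Double) { self.meters = vectorScalar }
    var vectorScalar: Double { meters }
}

// Packs four wrapped values into a vector of their underlying scalar.
func pack<T: VectorRepresentable>(_ a: T, _ b: T, _ c: T, _ d: T) -> SIMD4<T.VectorScalar> {
    SIMD4(a.vectorScalar, b.vectorScalar, c.vectorScalar, d.vectorScalar)
}

// Unpacks the lanes of a vector back into wrapped values.
func unpack<T: VectorRepresentable>(_ vector: SIMD4<T.VectorScalar>, as type: T.Type) -> [T] {
    (0..<vector.scalarCount).map { T(vectorScalar: vector[$0]) }
}

let legs = pack(Distance(meters: 1000), Distance(meters: 2000),
                Distance(meters: 3000), Distance(meters: 4000))
let roundTrips = unpack(legs + legs, as: Distance.self)
let kilometers = roundTrips.map { $0.kilometers }
```

The trade-off is that the vector's element type is `Double` rather than `Distance`, so the unit is lost while the values are in vector form.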

The "hidden" protocol requirements are public, so this wouldn't stop you, but FWIW they've been "unhidden" in the proposal, anyway. They are now SIMD2Storage, etc.
