Generic SIMD function produces non-SIMD code unless you explicitly constrain it

A while back I asked how to do something similar, and @scanon recommended writing a widen function like this:

private func widen(_ x: SIMD8<UInt8>) -> SIMD8<UInt16> {
    SIMD8<UInt16>(truncatingIfNeeded: x)
}

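Then you can interleave two SIMD8<UInt8> vectors a and b, with each byte of a in the high half of a 16-bit lane and the matching byte of b in the low half, like this: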

(widen(a) &<< 8) | widen(b)

I believe this also produces efficient machine instructions.
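
For anyone who wants to try it, here is a minimal self-contained sketch of the above. The `pack` helper name and the sample values are my own, purely for illustration; only `widen` and the shift-and-or expression come from the suggestion itself. Everything uses concrete SIMD8 types rather than generics, in line with the thread title.

```swift
// Widen each UInt8 lane to UInt16. The source is unsigned, so this zero-extends.
func widen(_ x: SIMD8<UInt8>) -> SIMD8<UInt16> {
    SIMD8<UInt16>(truncatingIfNeeded: x)
}

// Hypothetical helper (name is an assumption): lane i of the result holds
// a[i] in its high byte and b[i] in its low byte.
func pack(_ a: SIMD8<UInt8>, _ b: SIMD8<UInt8>) -> SIMD8<UInt16> {
    (widen(a) &<< 8) | widen(b)
}

// Sample inputs, chosen only to make the lane layout easy to read.
let a = SIMD8<UInt8>(1, 2, 3, 4, 5, 6, 7, 8)
let b = SIMD8<UInt8>(0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80)

let packed = pack(a, b)
// Lane 0 is (1 << 8) | 0x10 = 0x0110, lane 7 is (8 << 8) | 0x80 = 0x0880.
print(packed)
```

Whether this actually lowers to a widen, shift, and or on vector registers depends on the target and optimization level, so it is worth checking the assembly for an optimized build.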
