A while back I asked how to do something similar, and @scanon recommended writing a widen function like this:
private func widen(_ x: SIMD8<UInt8>) -> SIMD8<UInt16> {
    SIMD8<UInt16>(truncatingIfNeeded: x)
}
Then you can interleave vectors like this:
(widen(a) &<< 8) | widen(b)
I believe this also produces efficient machine instructions.
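
For anyone who wants to try it, here is a minimal self-contained sketch of the approach above. The `interleave` wrapper and the sample values are just my own illustration (only `widen` comes from the suggestion), and I have not checked the generated assembly for this exact snippet.

```swift
// Zero-extends each UInt8 lane to UInt16 (the source is unsigned,
// so truncatingIfNeeded pads the high bits with zeros).
func widen(_ x: SIMD8<UInt8>) -> SIMD8<UInt16> {
    SIMD8<UInt16>(truncatingIfNeeded: x)
}

// Hypothetical wrapper name: packs a into the high byte and b into
// the low byte of each 16-bit lane.
func interleave(_ a: SIMD8<UInt8>, _ b: SIMD8<UInt8>) -> SIMD8<UInt16> {
    (widen(a) &<< 8) | widen(b)
}

// Sample inputs chosen so the two sources are easy to tell apart.
let a = SIMD8<UInt8>(0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17)
let b = SIMD8<UInt8>(0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27)

// Each lane is (a[i] << 8) | b[i], so lane 0 is 0x1020.
let packed = interleave(a, b)
```

Note that "interleaved" here means interleaved within each 16-bit lane. On a little-endian platform the low byte is stored first, so if you view the result as raw bytes the element from b comes before the element from a in each pair. Swap the arguments if you need the opposite order.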