A Possible Vision for Macros in Swift

regexident · October 17, 2022, 11:35pm

My use case for macros in Swift is similar to that described by @Andrew. And similar to his example, I would have used pattern-based macros à la Rust. Especially when writing unit tests, having macros can be a godsend.

To give a motivational example:

As one of the maintainers of the popular Surge (a framework for high-performance computation, powered by Apple's Accelerate framework) and the main driver behind a major cleanup and refactor around 3 years ago, I found the redundancy of code in the project, and the need to keep all its redundant parts consistent and in sync, a major burden for maintenance and, let's face it, for the motivation to keep working on it. It's just no fun at all. Working on Surge has been anything but fun.

So at some point I decided to try to start from scratch (dubbed SurgeNeue), this time making extensive use of code generation. However, I quickly had to realize that Sourcery, the only promising tool I could find back then, was inadequate for the task. And .gyb would have violated the whole point of making working on Surge fun again.

Surge already exposed over 300 functions and SurgeNeue would further increase this number by introducing additional optimized variants that would allow for writing into pre-allocated buffers.

What I would have needed to make this feasible was proper macros (the good ones à la Rust, not the shitty ones from C). The quickly abandoned experimental project can be found here.

The idea behind SurgeNeue was to try to find common patterns in the functions provided by Accelerate and to write a template for each such function pattern and use that to generate the annoying and utterly redundant boilerplate.

As such one of the patterns in Accelerate is unary functions, which tend to be in the shape of:

// parallel sqrt(x)
func vDSP_vsqD(
    _ a: UnsafePointer<Double>,
    _ ia: vDSP_Stride,
    _ c: UnsafeMutablePointer<Double>,
    _ ic: vDSP_Stride,
    _ n: vDSP_Length
)

with a and ia being the source buffer and stride, c and ic being the mutable destination buffer and stride and n being the number of elements.
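To make the role of the two strides concrete, here is a minimal pure-Swift sketch of what a function of this shape does. `stridedUnary` is a hypothetical reference loop, not an Accelerate API, and it uses plain `Int` where Accelerate uses `vDSP_Stride` and `vDSP_Length`:

```swift
// Hypothetical reference loop: applies `op` to `n` elements of `a`,
// stepping through the source by `ia` and through the destination by `ic`.
func stridedUnary(
    _ a: UnsafePointer<Double>, _ ia: Int,
    _ c: UnsafeMutablePointer<Double>, _ ic: Int,
    _ n: Int,
    _ op: (Double) -> Double
) {
    for i in 0..<n {
        c[i * ic] = op(a[i * ia])
    }
}

// Read every second element of `source`, write densely into `destination`.
let source: [Double] = [1, 0, 4, 0, 9, 0]
var destination = [Double](repeating: 0, count: 3)

source.withUnsafeBufferPointer { src in
    destination.withUnsafeMutableBufferPointer { dst in
        stridedUnary(src.baseAddress!, 2, dst.baseAddress!, 1, 3) { $0.squareRoot() }
    }
}
// destination is now [1.0, 2.0, 3.0]
```

A source stride of 2 skips the zero padding, while a destination stride of 1 packs the results contiguously.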

I'll be calling this pattern an "external mutating unary" function. External, since the destination buffer's type is provided by the caller. Mutating, since it writes into a pre-allocated buffer. And unary, since sqrt(x) is a unary function.

Turns out once you have such an "external mutating unary" function (pseudo-code):

External mutating unary:

// Reads from `src` and writes into pre-allocated `dst` of possibly different type:
let src: Array<Double> = [1, 2, 3]
var dst: ContiguousArray<Double> = .init()
dst.reserveCapacity(src.count)
squareRoot(src, into: &dst)

func squareRoot<Src, Dst>(_ src: Src, into dst: inout Dst) {
    assert(
        src.stride == 1 && dst.stride == 1,
        "sqrt doesn't support stride values other than 1"
    )

    assert(
        dst.count == src.count,
        "destination must hold as many elements as the source"
    )

    let stride: vDSP_Stride = numericCast(src.stride)
    let count: vDSP_Length = numericCast(src.count)
    vDSP_vsqD(src.pointer, stride, dst.pointer, stride, count)
}

… you can easily provide convenience variants for it by just wrapping it (pseudo-code):

External unary:

// Reads from `src` and writes into newly allocated `dst`:
let src: Array<Double> = [1, 2, 3]
let dst = squareRoot(src, as: ContiguousArray<Double>.self)

func squareRoot<Src, Dst>(_ src: Src, as type: Dst.Type = Dst.self) -> Dst {
    var dst = Dst()
    squareRoot(src, into: &dst)
    return dst
}

Internal mutating unary:

// Reads from `src` and writes the results back into `src` itself:
var src: Array<Double> = [1, 2, 3]
squareRoot(&src)

func squareRoot<Src>(_ src: inout Src) {
    squareRoot(src, into: &src)
}

Internal unary:

// Reads from `src` and writes into newly allocated `dst` of same type:
let src: Array<Double> = [1, 2, 3]
let dst = squareRoot(src)

func squareRoot<Src>(_ src: Src) -> Src {
    var dst = src
    squareRoot(src, into: &dst)
    return dst
}
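For reference, the four pseudo-code variants above can be sketched as compilable Swift. This is only an approximation under stated assumptions: a plain `map` over `Double.squareRoot()` stands in for the Accelerate call, the standard `Collection` and `RangeReplaceableCollection` protocols stand in for the stride/pointer abstraction used in the pseudo-code, and the default value for `as:` is dropped to keep overload resolution simple:

```swift
// External mutating unary: the only variant with a real implementation.
// Assumption: a plain loop stands in for the Accelerate call.
func squareRoot<Src: Collection, Dst: RangeReplaceableCollection>(
    _ src: Src,
    into dst: inout Dst
) where Src.Element == Double, Dst.Element == Double {
    dst.removeAll(keepingCapacity: true)
    dst.append(contentsOf: src.map { $0.squareRoot() })
}

// External unary: allocates a fresh `Dst`, then delegates.
func squareRoot<Src: Collection, Dst: RangeReplaceableCollection>(
    _ src: Src,
    as type: Dst.Type
) -> Dst where Src.Element == Double, Dst.Element == Double {
    var dst = Dst()
    squareRoot(src, into: &dst)
    return dst
}

// Internal mutating unary: overwrites `src` with its own results.
func squareRoot<Src: RangeReplaceableCollection>(
    _ src: inout Src
) where Src.Element == Double {
    let input = src
    squareRoot(input, into: &src)
}

// Internal unary: returns the results as a new collection of the same type.
func squareRoot<Src: RangeReplaceableCollection>(
    _ src: Src
) -> Src where Src.Element == Double {
    var dst = Src()
    squareRoot(src, into: &dst)
    return dst
}

let input: [Double] = [1, 4, 9]

var preallocated = ContiguousArray<Double>()
preallocated.reserveCapacity(input.count)
squareRoot(input, into: &preallocated)

let fresh = squareRoot(input, as: ContiguousArray<Double>.self)

var inPlace = input
squareRoot(&inPlace)

let copied = squareRoot(input)
```

Only the first function knows anything about square roots; the other three are pure plumbing and would look identical for any other unary operation.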

The only parts that differ between the individual arithmetic functions are the implementation of the "external mutating unary" variant, the functions' names and their corresponding documentation comments. Otherwise, whether you're implementing parallel sqrt() or parallel abs() makes no difference: the code structure is always the same. Like a perfect stencil. Perfect for code-gen via macros.

(There are some differences between functions from vecLib and vDSP, but let's ignore that for the sake of simplicity here.)

As such with macros at my disposal I could have just written this instead:

/// Calculates the element-wise square root of `src`, writing the results into `dst`.
func squareRoot<Src, Dst>(_ src: Src, into dst: inout Dst) {
    // actual implementation
}

/// Calculates the element-wise square root of `src`, returning the results as a `Dst`.
#generateExternalUnary(squareRoot)

/// Calculates the element-wise square root of `src`, writing the results into itself.
#generateInternalMutatingUnary(squareRoot)

/// Calculates the element-wise square root of `src`, returning the results as modified copy.
#generateInternalUnary(squareRoot)

… and have Swift generate the repetitive code from above for me.

The benefit would be even greater for the test suite that I would want to write for SurgeNeue, if I could write this:

#testUnary(squareRoot)

… instead of:

func test_externalMutating_squareRoot() {
    let values = [1, 2, 3, 4, 5]
    var actual = ContiguousArray<Double>()
    actual.reserveCapacity(values.count)
    squareRoot(values, into: &actual)
    let expected = serial_squareRoot(values)
    XCTAssertEqual(actual, expected)
}

func test_external_squareRoot() {
    let values = [1, 2, 3, 4, 5]
    let actual = squareRoot(values, as: ContiguousArray<Double>.self)
    let expected = serial_squareRoot(values)
    XCTAssertEqual(actual, expected)
}

func test_internalMutating_squareRoot() {
    let values = [1, 2, 3, 4, 5]
    var actual = values
    squareRoot(&actual)
    let expected = serial_squareRoot(values)
    XCTAssertEqual(actual, expected)
}

func test_internal_squareRoot() {
    let values = [1, 2, 3, 4, 5]
    let actual = squareRoot(values)
    let expected = serial_squareRoot(values)
    XCTAssertEqual(actual, expected)
}
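The shared structure of these four tests can be approximated without macros by passing the variants in as closures. The sketch below is hypothetical throughout: `UnaryVariants` and `failingVariants` are made-up names, the closures are stand-ins for the real functions, and a real suite would use `XCTAssertEqual` rather than collecting names:

```swift
// Hypothetical bundle of the four variants of one unary operation.
struct UnaryVariants {
    var externalMutating: ([Double], inout ContiguousArray<Double>) -> Void
    var external: ([Double]) -> ContiguousArray<Double>
    var internalMutating: (inout [Double]) -> Void
    var internalCopying: ([Double]) -> [Double]
}

/// Returns the names of the variants whose output differs from the
/// element-wise `reference` implementation.
func failingVariants(
    of variants: UnaryVariants,
    values: [Double],
    reference: (Double) -> Double
) -> [String] {
    let expected = values.map(reference)
    var failures: [String] = []

    var buffer = ContiguousArray<Double>()
    variants.externalMutating(values, &buffer)
    if Array(buffer) != expected { failures.append("externalMutating") }

    if Array(variants.external(values)) != expected { failures.append("external") }

    var inPlace = values
    variants.internalMutating(&inPlace)
    if inPlace != expected { failures.append("internalMutating") }

    if variants.internalCopying(values) != expected { failures.append("internal") }

    return failures
}

// Stand-in closures; in SurgeNeue these would forward to the real functions.
let sqrtVariants = UnaryVariants(
    externalMutating: { src, dst in dst = ContiguousArray(src.map { $0.squareRoot() }) },
    external: { src in ContiguousArray(src.map { $0.squareRoot() }) },
    internalMutating: { src in src = src.map { $0.squareRoot() } },
    internalCopying: { src in src.map { $0.squareRoot() } }
)

let failures = failingVariants(of: sqrtVariants, values: [1, 4, 9, 16, 25]) {
    $0.squareRoot()
}
```

Even with such a helper, the four closures still have to be spelled out for every function, and that is the part a `#testUnary(squareRoot)` macro would take over.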

Especially since real tests would usually be much more involved than this example.

I might have managed to plow through the repetitive implementation somehow, but writing tests for it would have broken me for sure. Imagine writing four such highly repetitive test functions for each of the over 100 arithmetic functions in SurgeNeue.