Use a protocol to reduce code and documentation for numeric functions

I'm working on a linear algebra package that relies on Accelerate, BLAS, and LAPACK for performing vector and matrix operations. Consequently, most of the operations use functions that are only applicable to a certain value type such as Float or Double. This makes for a lot of redundant code and documentation because I need to provide documentation for each function for each specific value type. The example below demonstrates this.

Example without a protocol

Here is a generic vector structure that uses a flat array as the underlying value storage.

// Vector.swift

struct Vector<T> {

    let size: Int
    var values: [T]

    init(_ values: [T]) {
        self.size = values.count
        self.values = values
    }

    init(like vector: Self) {
        self.size = vector.size
        self.values = vector.values
    }
}

The code below uses the vector struct to perform element-wise vector addition for Int, Float, and Double values. The Accelerate vDSP.add function only supports Float and Double values so a loop is used for vectors that contain Int values. Basically, the same function has been defined for each value type that I want to support. The documentation comments for each function are also similar. This is a lot of duplicate code which generates a lot of redundant documentation. I would like to have just one function that handles the different value types and therefore one function with documentation comments.

// Addition.swift

import Accelerate

/// Add two vectors using integer values.
/// - Parameters:
///   - a: The first vector.
///   - b: The second vector.
/// - Returns: The result vector.
func add(_ a: Vector<Int>, _ b: Vector<Int>) -> Vector<Int> {
    var result = [Int](repeating: 0, count: a.size)
    for i in 0..<a.size {
        result[i] = a.values[i] + b.values[i]
    }
    return Vector(result)
}

/// Add two vectors using single-precision values.
/// - Parameters:
///   - a: The first vector.
///   - b: The second vector.
/// - Returns: The result vector.
func add(_ a: Vector<Float>, _ b: Vector<Float>) -> Vector<Float> {
    var vec = Vector(like: a)
    vDSP.add(a.values, b.values, result: &vec.values)
    return vec
}

/// Add two vectors using double-precision values.
/// - Parameters:
///   - a: The first vector.
///   - b: The second vector.
/// - Returns: The result vector.
func add(_ a: Vector<Double>, _ b: Vector<Double>) -> Vector<Double> {
    var vec = Vector(like: a)
    vDSP.add(a.values, b.values, result: &vec.values)
    return vec
}

Usage of this code is shown below along with the print output.

let a = Vector([1, 2, 3])  // integer
let b = Vector([9, 3, 4])
let c = add(a, b)
print(c)

let a1 = Vector<Float>([1, 2, 3])  // float
let b1 = Vector<Float>([9, 3, 4])
let c1 = add(a1, b1)
print(c1)

let a2 = Vector([1.9, 2, 3])  // double
let b2 = Vector([9, 3.8, 4])
let c2 = add(a2, b2)
print(c2)

Vector<Int>(size: 3, values: [10, 5, 7])
Vector<Float>(size: 3, values: [10.0, 5.0, 7.0])
Vector<Double>(size: 3, values: [10.9, 5.8, 7.0])

Example with protocol

The only solution (that I know of) to reduce redundant code and documentation is to use a protocol along with generic values. Using the same vector struct defined above, I can define a scalar protocol as follows:

// Scalar.swift

protocol Scalar {
    static func add(_ a: Vector<Self>, _ b: Vector<Self>) -> Vector<Self>
}

Using this protocol I can extend the Int, Float, and Double types with the add function for that particular type. See below for the Float and Double extensions.

// Float+Scalar.swift

import Accelerate

extension Float: Scalar {

    static func add(_ a: Vector<Self>, _ b: Vector<Self>) -> Vector<Self> {
        var vec = Vector(like: a)
        vDSP.add(a.values, b.values, result: &vec.values)
        return vec
    }
}

// Double+Scalar.swift

import Accelerate

extension Double: Scalar {

    static func add(_ a: Vector<Self>, _ b: Vector<Self>) -> Vector<Self> {
        var vec = Vector(like: a)
        vDSP.add(a.values, b.values, result: &vec.values)
        return vec
    }
}
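
The Int conformance is not shown above. A minimal sketch of what it might look like, assuming a plain element-wise sum stands in for vDSP (which only covers Float and Double here). Vector and Scalar are repeated, and Accelerate is left out, only so the snippet runs on its own:

```swift
// Int+Scalar.swift (sketch; the file name is an assumption)

struct Vector<T> {
    let size: Int
    var values: [T]

    init(_ values: [T]) {
        self.size = values.count
        self.values = values
    }
}

protocol Scalar {
    static func add(_ a: Vector<Self>, _ b: Vector<Self>) -> Vector<Self>
}

extension Int: Scalar {

    // No Accelerate fast path for Int, so add the elements pairwise.
    static func add(_ a: Vector<Self>, _ b: Vector<Self>) -> Vector<Self> {
        precondition(a.size == b.size, "Vectors must have the same size")
        return Vector(zip(a.values, b.values).map { $0 + $1 })
    }
}

// Same public entry point as in the post.
func add<T: Scalar>(_ a: Vector<T>, _ b: Vector<T>) -> Vector<T> {
    T.add(a, b)
}

let c = add(Vector([1, 2, 3]), Vector([9, 3, 4]))
print(c.values)  // [10, 5, 7]
```

The size check is an addition of this sketch; the original loop version assumed both vectors had the same length.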

This allows me to have one public add function and therefore one set of documentation comments which applies to all the supported types. See the add function below.

// Addition.swift

/// Element-wise addition of two vectors.
/// - Parameters:
///   - a: The first vector.
///   - b: The second vector.
/// - Returns: The result of adding two vectors.
func add<T: Scalar>(_ a: Vector<T>, _ b: Vector<T>) -> Vector<T> {
    T.add(a, b)
}

Usage of the above code and the print output is shown below.

let a = Vector<Float>([1, 2, 3, 4])
let b = Vector<Float>([4, 8, 5, 10])
let c = add(a, b)
print(c)

let a1 = Vector([1, 2, 3, 4.0])
let b1 = Vector([4, 8, 5, 10.0])
let c1 = add(a1, b1)
print(c1)

Vector<Float>(size: 4, values: [5.0, 10.0, 8.0, 14.0])
Vector<Double>(size: 4, values: [5.0, 10.0, 8.0, 14.0])

This approach requires more internal code, but the number of public functions and the amount of documentation are greatly reduced.
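
One possible way to trim some of that internal code, offered only as a sketch: give Scalar a default add in a protocol extension constrained to AdditiveArithmetic, so types without an Accelerate fast path can conform with an empty extension. The Int32 conformance is a hypothetical addition for illustration. Float and Double could still supply their own vDSP-based add, which would be used instead of the default.

```swift
struct Vector<T> {
    let size: Int
    var values: [T]

    init(_ values: [T]) {
        self.size = values.count
        self.values = values
    }
}

protocol Scalar {
    static func add(_ a: Vector<Self>, _ b: Vector<Self>) -> Vector<Self>
}

// Default implementation for any conforming type that supports `+`.
extension Scalar where Self: AdditiveArithmetic {
    static func add(_ a: Vector<Self>, _ b: Vector<Self>) -> Vector<Self> {
        precondition(a.size == b.size, "Vectors must have the same size")
        return Vector(zip(a.values, b.values).map { $0 + $1 })
    }
}

// These types pick up the default; no per-type code is needed.
extension Int: Scalar {}
extension Int32: Scalar {}  // hypothetical extra type, not in the original post

func add<T: Scalar>(_ a: Vector<T>, _ b: Vector<T>) -> Vector<T> {
    T.add(a, b)
}

let ints = add(Vector([1, 2, 3]), Vector([9, 3, 4]))
print(ints.values)  // [10, 5, 7]

let small = add(Vector<Int32>([1, 2]), Vector<Int32>([3, 4]))
print(small.values)  // [4, 6]
```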

Questions

I have several questions related to all of this:

  1. Is using a protocol a good approach to reduce duplicate public code and documentation for this situation?
  2. If I don't use the protocol approach, does DocC provide any features to produce one set of documentation for functions that handle different value types?
  3. Are there any performance issues that I should be aware of when using the protocol approach compared to the non-protocol approach?
  4. Is there a better name other than Scalar that I can use for the protocol implemented above? I thought about calling it Numeric but Swift already has a Numeric protocol.
  5. Does the generic element in the vector struct need to conform to the protocol? For example, should I use struct Vector<T: Scalar> { ... } instead of struct Vector<T> { ... }?

I think generics with protocol requirements are the way to go. But just to expand on the list of alternatives:

I guess there exist two more approaches, both falling under the category of “metaprogramming”:

  1. Macros
  2. GYB

The two solutions are quite different. Macros make the package that uses them build much slower, because Swift Syntax is a heavy dependency, but they are the neater and more Swifty solution.

GYB is older tech and not at all Swifty (it is Python-based), but it has some nice advantages. It adds essentially nothing to build time: you run GYB once, it generates the Swift files for you, and you commit those files to git.

I have never heard of GYB so I'll read up on it using the link you shared. As for macros, I haven't learned much about them since they are very new in Swift. Do you know of any examples that might guide me to applying them to this type of problem?

GYB is basically a simpler version of Sourcery used by Apple.

I’m bad at Swift macros (but decent at Rust macros), and maybe they fit badly here. I think you might be able to do the code generation you want with Swift macros, but I’m unsure. In Rust you have two different kinds of macros: macro_rules! and proc macros. Swift macros are more like Rust’s proc macros, which are more powerful but also more “end user facing”, if you will. macro_rules! is more of a metaprogramming tool, which I imagine is what you want. You want to author and vendor the repetitive code, with maximum performance and with the possibility of generating doc comments. That is typically what I would use macro_rules! for in Rust. But I can’t imagine how to do it with proc macros… i.e., nor how to do it with Swift macros. So I think I ought to amend my two suggestions:

  • GYB
  • Sourcery

EDIT:
I still think protocols and generics are the way to go. A minor advantage of metaprogramming is that one can write slightly better documentation, IMO. And the function signatures use concrete types only, which might make them slightly simpler to grasp. But users of linear algebra packages are typically quite bright and ought to have no issues with generics…

(Apologies for the Rust talk, I’m mostly writing Rust nowadays. Let’s see if anyone great at both Swift macros and Rust macros can give some pointers on how they would solve this using Swift macros.)

Btw, I think it is worth pointing out that the compiler generates a specialization of a generic function for each concrete type it is used with - if I’m not terribly mistaken - meaning that generics should not incur any runtime performance penalty.

That's the compiler's choice, I believe.

Though I think it should generally happen when the caller and callee are in the same module. When the call crosses a module boundary, you may need attributes such as @inlinable or @alwaysEmitIntoClient. The options passed to the Swift compiler matter too.

Can you elaborate on this? What is considered a module in Swift? Where in my example code that I posted above should I apply the @inlinable and @alwaysEmitIntoClient attributes?

What is considered a module in Swift?

See The Swift Programming Language > Access Control > Modules, Source Files, and Packages. The key thing is that each bit of code within a module can ‘see’ any other bit of the code within the module, at least conceptually. In contrast, code in another module can only ‘see’ the interface to your module.

Where in my example code that I posted above should I apply the @inlinable …?

That’s a hard question to answer. Generally you won’t need @alwaysEmitIntoClient because that’s useful when dealing with backward compatibility stuff. Rather, you’ll likely need @inlinable and @usableFromInline. There’s a bunch of general info about those in The Swift Programming Language > Attributes.

To make sense of that you have to think like the compiler. Specifically, if the compiler is building a client module that’s separate from your module, and thus can’t see all the code in your module, what does it need to see in order to inline, and so effectively optimise, the client code?
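
As a rough illustration of that idea (a sketch, not an official recipe), here is how the attributes might be applied to a cut-down version of the code from the first post. It assumes the declarations live in a library module, which is why they are marked public. The elementwiseSum helper is hypothetical and exists only to show @usableFromInline, and Int stands in for Float and Double so the snippet runs without Accelerate.

```swift
public struct Vector<T> {
    public let size: Int
    public var values: [T]

    public init(_ values: [T]) {
        self.size = values.count
        self.values = values
    }
}

public protocol Scalar {
    static func add(_ a: Vector<Self>, _ b: Vector<Self>) -> Vector<Self>
}

// Internal helper (hypothetical). @usableFromInline lets inlinable code
// refer to it; its body is still hidden from client modules.
@usableFromInline
internal func elementwiseSum<T: AdditiveArithmetic>(_ x: [T], _ y: [T]) -> [T] {
    precondition(x.count == y.count, "Vectors must have the same size")
    return zip(x, y).map { $0 + $1 }
}

extension Int: Scalar {
    // The body is exposed to clients, so it may only use public or
    // @usableFromInline declarations.
    @inlinable
    public static func add(_ a: Vector<Int>, _ b: Vector<Int>) -> Vector<Int> {
        Vector(elementwiseSum(a.values, b.values))
    }
}

// Exposing this body gives a client module the chance to specialize
// the generic call for the concrete type it uses.
@inlinable
public func add<T: Scalar>(_ a: Vector<T>, _ b: Vector<T>) -> Vector<T> {
    T.add(a, b)
}

let c = add(Vector([1, 2, 3]), Vector([9, 3, 4]))
print(c.values)  // [10, 5, 7]
```

Whether any of this actually changes performance depends on the build settings and on what the optimizer decides, so it is worth measuring before and after.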

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

Thank you for sharing the links. The documentation for inlinable and usableFromInline is lacking code examples of how to actually use the attributes. I'm very much a learn-by-example kind of person. Are there any code examples of these attributes that you recommend I look at to get a better understanding of how to apply them?