How to handle numerical data in Swift for a Matrix type

wigging · May 10, 2024, 1:42am

I created a matrix type for working with 2D numerical data. As shown in the example below, the underlying data is stored as a flat array of generic values. Those values could be Float, Double, or a complex type such as DSPComplex. I'm using functions from Accelerate to conduct various scalar-matrix, matrix-matrix, vector-matrix operations. The example demonstrates overloading * to perform matrix-matrix multiplication using the cblas_dgemm function.

import Accelerate

struct Matrix<T> {
    let rows: Int
    let columns: Int
    var values: [T]

    init(rows: Int, columns: Int, values: [T]) {
        self.rows = rows
        self.columns = columns
        self.values = values
    }

    init(_ content: [[T]]) {
        self.rows = content.count
        self.columns = content[0].count
        self.values = content.flatMap { $0 }
    }
    
    subscript(row: Int, column: Int) -> T {
        get { return values[(row * columns) + column] }
        set { values[(row * columns) + column] = newValue }
    }
}

extension Matrix {

    static func * (lhs: Matrix, rhs: Matrix) -> Matrix where T == Double {
        precondition(lhs.columns == rhs.rows, "Number of columns in left matrix must equal number of rows in right matrix")
        var a = lhs.values
        var b = rhs.values
        var c = [Double](repeating: 0.0, count: lhs.rows * rhs.columns)

        let m = lhs.rows     // rows in matrices A and C
        let n = rhs.columns  // columns in matrices B and C
        let k = lhs.columns  // columns in matrix A and rows in matrix B
        let alpha = 1.0
        let beta = 0.0

        // matrix multiplication where C ← αAB + βC
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k, alpha, &a, k, &b, n, beta, &c, n)

        let mat = Matrix(rows: lhs.rows, columns: rhs.columns, values: c)
        return mat
    }
}

Here's how I would use this matrix type:

let a = Matrix([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])

let b = Matrix([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0],
                [7.0, 8.0, 9.0]])

let c = a * b

for i in 0..<c.rows {
    for j in 0..<c.columns {
        print(c[i, j], terminator: " ")
    }
    print()
}

Which prints the following:

30.0 36.0 42.0 
66.0 81.0 96.0

My question is about using a Swift Array as the underlying data storage for the Matrix. Should I use an Array or some other approach to store the numerical data? Arrays in Swift can handle a bunch of types (Int, String, Float, Bool, etc.) but I'm only interested in numerical data. As such, I would think that having a numerical array type that is a fixed size would be more performant and efficient than the standard Array type. I see that Swift has a ContiguousArray type but it seems to be for class objects and working with Objective-C objects like NSArray. There's also UnsafeMutablePointer but it would require memory management. Anyway, I'm just curious if there is a better way to store a collection numerical data in Swift for matrix and vector operations.

scanon · May 10, 2024, 1:53am

ContiguousArray is only meaningfully different from Array when you're working with arrays that might be bridged from Obj-C. For Swift arrays like the number types you're using here, Array is always contiguous.

If you're not ever resizing your matrix type, but just allocating once up front and then accessing elements or the base pointer, then a lot of Array functionality is wasted, but it's also not really costing you anything. The real question to ask to know if Array is right is: do you want your Matrix type to have value semantics and copy-on-write behavior? The one you have sketched out here does that--using Array gives it to you--and that's a pretty defensible choice, but it's not the only choice. There are reasonable design points for a Matrix type with reference semantics, and also for a Matrix to be a noncopyable type.

wigging · May 10, 2024, 2:52am

When you say "copy-on-write behavior", are you referring to something like (1) or (2) below? I'm familiar with the behavior in (2) from working with Python's NumPy package and Julia. In NumPy and Julia you have to explicitly copy something b = copy(a) otherwise doing b = a would make b a reference to a.

If you do b = a then b is a full copy of a such that if b is modified then it doesn't also modify a.
If you do b = a then b just refers to a such that a change in a or b is reflected in both.

tera · May 10, 2024, 6:18am

Your (1) and (2) is for "value" and "reference" semantics correspondingly. As for copy-on-write that would be the sub bullet points list of (1):

"b = a" makes copy right away (no COW).
In this model "b = a" takes O(n) time/space, and the subsequent mutating to either array like "b[0] = 42" is O(1).
"b = a" doesn't make a copy right away, the actual copy happens only when (and if) you change either a or b. That's COW.
In this model "b = a" is O(1), the very first "simple" mutation like "b[0] = 42" to either array is O(n), and the subsequent "simple" mutations are O(1).

Using Array is a reasonable choice. If you do have to work with lots of small matrices (say, float 3x3) then you may want to optimise that and store them inline (in this particular case that would be 36 bytes) instead of using a full blown array that requires dynamic memory allocations/deallocations, but otherwise just use an array.

wigging · May 10, 2024, 1:05pm

My use case would involve large MxN matrices with sizes that range from 100s of elements to 10,000 or more. For small matrix or vector operations, I would just use the SIMD types that Swift provides. Is there a limit to the size of a Swift Array? For example, if I have a 10,000x10,000 matrix then that would result in an array with 100,000,000 elements. I guess the amount of memory that is available would affect the size.

I looked at the Swift Tensorflow package. It looks like the ShapedArray in this package uses a buffer pointer for its data storage.

scanon · May 10, 2024, 2:28pm

Array is limited by two things:

what the platform's allocator will provide
the element count cannot exceed Int.max

There are some things that you can do with e.g. mapping files and vm tricks for huge allocations that you cannot do with Array. But in practice these are rarely a very good idea and are unlikely to matter on 64b systems.

With the Matrix you have sketch above, var b = a semantically makes a copy of a, but the actual backing array would not be copied until you made a mutation to one of a or b. This has some good points:

you don't get copies that you don't need
no spooky-action-at-a-distance: the value of a can only change via accesses through a

but also:

what look like cheap operations (e.g. updating an isolated value) can become expensive operations (because it forces a complete copy of the backing array if there's another matrix backed by the same storage)
if you aren't somewhat careful about how you write your code, you can end up with a lot of copies without realizing it; if you matrices are large, that can be a problem.

If you used UBP without implementing any such logic yourself, then you would get the Julia/NumPy model: var b = a does not create a semantic copy; any update to one is reflected in the other. This makes a different set of tradeoffs. The main downside is spooky-action-at-a-distance, the main upside is that you avoid "unnecessary" copies.

The other model that I think is pretty defensible is to make the matrix type itself non-copyable, so that you must explicitly invoke a copy operation. This simply prevents you from writing var b = a at all. There's a lot to be said for this approach, but it's definitely less familiar to people coming from Julia/NumPy/etc and depends on features that are just now landing in Swift, so there's less documentation and examples available to help you adopt it (and also less in the way of established patterns for how other people should use it).

wigging · May 10, 2024, 7:00pm

It sounds like I should stick with using an Array for the data storage. However, I would like to avoid unnecessary copies especially with large matrices. You mention making the matrix struct itself non-copyable with some new Swift features. Are you referring to using ~Copyable? I don't think ~Copyable supports generics so it wouldn't work with the matrix type I have defined here.

tera · May 10, 2024, 7:14pm

Then I'd recommend to not have * + operations, but *= +=, etc.
cblas_dgemm seems to support this.

Your matrix type with ~Copyable on it compiles fine for me... I didn't try using it.

scanon · May 10, 2024, 7:16pm

Generic types can be ~Copyable.

wadetregaskis · May 10, 2024, 7:21pm

But Array doesn't support ~Copyable elements yet, I thought…? And Array itself is always copyable.

There was some discussion in other threads about the challenges of any RandomAccessCollection supporting ~Copyable because that protocol implicitly assumes & requires copyability of the elements.

scanon · May 10, 2024, 7:48pm

A non-copyable struct can have copyable members, and nothing in @wigging's example would involve non-copyable element types (nor would that ever be likely to happen in any matrix package, other than maybe something like a matrix-of-matrices, but there's better ways to model that).

(There are real downsides to living on the bleeding edge and trying to adopt ~Copyable for this, but none of these are a problem)

wigging · May 10, 2024, 8:19pm

I made the Matrix struct non-copyable as shown below. I also had to use borrowing for the multiplication function in the extension. I'm not familiar with this ~Copyable approach so I guess this is how it should be implemented? However, since I'm just accessing the matrix values within the multiplication function, I don't think I'm gaining anything here because even without ~Copyable the underlying values are not copied.

struct Matrix<T>: ~Copyable {
    
    let rows: Int
    let columns: Int
    var values: [T]
    
    // ...
}

extension Matrix {

    static func * (lhs: borrowing Matrix, rhs: borrowing Matrix) -> Matrix where T == Double {
        precondition(lhs.columns == rhs.rows, "Number of columns in left matrix must equal number of rows in right matrix")
        
        var a = lhs.values
        var b = rhs.values
        var c = [Double](repeating: 0.0, count: lhs.rows * rhs.columns)

        // ...
    }
}

Something I noticed regarding the contents of the static function, I can write it using var for a and b along with &a and &b for the cblas_dgemm function. Or I can use let for a and b then pass a and b to cblas_dgemm. Seems like using let is the more correct approach here?

// Using var for a and b, and using &a and &b for cblas_dgemm

static func * (lhs: borrowing Matrix, rhs: borrowing Matrix) -> Matrix where T == Double {
    precondition(lhs.columns == rhs.rows, "Number of columns in left matrix must equal number of rows in right matrix")
    
    var a = lhs.values
    var b = rhs.values
    var c = [Double](repeating: 0.0, count: lhs.rows * rhs.columns)

    // ...

    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k, alpha, &a, k, &b, n, beta, &c, n)

    let mat = Matrix(rows: lhs.rows, columns: rhs.columns, values: c)
    return mat
}

// Using let for a and b, and using a and b for cblas_dgemm

static func * (lhs: borrowing Matrix, rhs: borrowing Matrix) -> Matrix where T == Double {
    precondition(lhs.columns == rhs.rows, "Number of columns in left matrix must equal number of rows in right matrix")
    
    let a = lhs.values
    let b = rhs.values
    var c = [Double](repeating: 0.0, count: lhs.rows * rhs.columns)

    // ...

    // matrix multiplication where C ← αAB + βC
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k, alpha, a, k, b, n, beta, &c, n)

    let mat = Matrix(rows: lhs.rows, columns: rhs.columns, values: c)
    return mat
}

I also want to mention that I'm not trying to use this on Linux and I don't care about supporting old versions of Swift. My goal is to just have something that works well on modern Apple devices with new versions of Swift.

wigging · June 13, 2024, 1:23pm

This example in the Accelerate documentation defines a Matrix structure that uses a buffer for its underlying data storage.

@scanon and @tera What do you think about this approach where a mutable buffer is used for the matrix data instead of an Array? It looks like using a buffer would make it easier to work with some Accelerate functions, especially with the BLAS and LAPACK functions. I tried to find other examples of using a buffer for a Swift matrix but everything I have found uses an array. So that makes me think that an array may be the better approach for underlying matrix data.

scanon · June 13, 2024, 1:29pm

Small wrappers around the functions in question should abstract the distinction away for the user entirely, so it's really a question of which option makes it easier to build the semantics that you want the type to have. If you want to have copy-on-write value types, then Array is a reasonable choice. If you want to implement some other scheme, managing a [mutable] buffer yourself will likely be the better option.

wigging · June 13, 2024, 1:44pm

I do like the idea of copy-on-write and not having to deal with memory management regarding the array approach. However, for the matrix type discussed in this post, I only want to do numeric operations and I don't need to resize the matrix. So I feel like all of the non-numeric features of an array would be wasted here. With this in mind, would an array still be more performant than using the buffer approach?

scanon · June 13, 2024, 1:47pm

Neither should be more performant than the other, since either way all of the actual computation happens via the same implementation. The distinction is semantic: are you allowed to have multiple references to the same matrix data? How and when does copying occur? The answers to these questions determine the simplest representation.

wigging · June 13, 2024, 2:22pm

As a NumPy user, I have to explicitly copy a NumPy array if I want a new and independent copy of that array ("array" in this context is n-dimensional array or matrix). NumPy uses an underlying data store that is a contiguous block of memory which they refer to as a data buffer. So in this regard it sounds like the Swift Matrix would need to use the buffer approach to have a similar behavior as a NumPy array. Below is more info about what NumPy is doing under-the-hood if you want to learn more.

https://numpy.org/doc/stable/user/basics.copies.html

https://numpy.org/doc/stable/dev/internals.html#numpy-internals

scanon · June 13, 2024, 3:36pm

That's right. If you're trying to replicate NumPy's semantics, you would not use Array.