The result matrix could be created once (outside) and used many times. Its internal dimensions could be larger than needed so that it could be reused with different-sized matrices.
BTW, you could keep the a*b syntax and store the result into a global or a task-local variable to avoid its re-allocation; a task-local will be better than a global if you plan to use * across tasks/threads.
Ah, I see. I'll keep this in mind for the package I'm working on. I can always add more features to support things like this once I have the basics implemented.
Usually matrix multiplications happen in the context of some broader algorithm, so you actually have something like:
for iteration in whatever {
    // do something that modifies a or b or both
    let c = a*b
    // use c for something
}
It is often desirable to hoist the allocation of C out of this loop, so that only the actual computation remains inside it¹:
var c = Matrix( ... )
for iteration in whatever {
    // do something that modifies a or b or both
    gemm(a, b, &c)
    // use c for something
}
¹ Actually, many BLAS implementations allocate internally within GEMM; a better BLAS API would have allowed the caller to provide workspace to avoid that.
This is rather hard to optimize away in most languages, because a*b is usually defined as a thing that allocates its result, so hoisting requires a transformation that moves that allocation out of the function that implements a*b, which is usually in some other library/module and not inlined into the containing loop. In some languages you solve this by inlining a wrapper that does the allocation and then calls a routine that actually looks like gemm(a, b, &c), so that the compiler has the information necessary to replace the allocation and assignment with "just take a reference to the existing c". (Depending on the language and type semantics, you might also need some mechanism to tell the compiler that the c being replaced is not referenced by anything else; this is where Swift's newer noncopyable features become relevant.)
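For concreteness, here is a rough Swift sketch of that wrapper shape. The Matrix below is a simplified, non-generic stand-in for the type discussed in this thread, gemm(_:_:into:) is a hypothetical name, and the naive triple loop stands in for a real BLAS call:

public struct Matrix {
    public var rows, columns: Int
    public var values: [Double]

    public init(rows: Int, columns: Int) {
        self.rows = rows
        self.columns = columns
        self.values = Array(repeating: 0, count: rows * columns)
    }

    /// Writes lhs * rhs into caller-provided storage; allocates nothing itself.
    /// (The naive triple loop is a stand-in for a real BLAS call.)
    public static func gemm(_ lhs: Matrix, _ rhs: Matrix, into c: inout Matrix) {
        precondition(lhs.columns == rhs.rows)
        precondition(c.rows == lhs.rows && c.columns == rhs.columns)
        for i in c.values.indices { c.values[i] = 0 }
        for i in 0 ..< lhs.rows {
            for k in 0 ..< lhs.columns {
                let a = lhs.values[i * lhs.columns + k]
                for j in 0 ..< rhs.columns {
                    c.values[i * c.columns + j] += a * rhs.values[k * rhs.columns + j]
                }
            }
        }
    }

    /// Thin wrapper: the only allocation lives in this small body, so once it
    /// is inlined at the call site the compiler (or the programmer) can hoist
    /// the allocation out of a surrounding loop and reuse c.
    @inlinable
    public static func * (lhs: Matrix, rhs: Matrix) -> Matrix {
        var c = Matrix(rows: lhs.rows, columns: rhs.columns)
        gemm(lhs, rhs, into: &c)
        return c
    }
}

With that split, a caller who cares can skip the operator, allocate c once, and call gemm(_:_:into:) directly inside the loop, exactly as in the hoisted example above.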
Because you can generally go faster if you "repack" pieces of one or more of A, B, or C into tiles that are laid out optimally for the inner compute kernel. In particular, you repack so that all of the values you need to touch in the inner compute kernel are contiguous in memory, which maximizes your cache and TLB reach; this gains you more than enough performance to pay for the cost of allocating space to repack into and doing the repacking.
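As a rough illustration of what "repacking" means in code, here is a minimal Swift sketch; the function name, parameters, and tile shape are illustrative, not taken from any real BLAS:

/// Copies a tileRows × tileColumns tile of a row-major matrix into a small
/// contiguous buffer, so an inner kernel reading it only touches adjacent memory.
func packTile(of values: [Double], columns: Int,
              startRow: Int, startColumn: Int,
              tileRows: Int, tileColumns: Int) -> [Double] {
    var packed = [Double]()
    packed.reserveCapacity(tileRows * tileColumns)
    for r in startRow ..< startRow + tileRows {
        for c in startColumn ..< startColumn + tileColumns {
            // Adjacent rows of the tile are `columns` elements apart in the
            // source, but end up back to back in `packed`.
            packed.append(values[r * columns + c])
        }
    }
    return packed
}

A real implementation would pack into preallocated workspace rather than growing a fresh array, and would choose tile sizes to fit the cache hierarchy; that workspace is the internal allocation the footnote above refers to.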
Intel has experimented in MKL with an explicit pack API and a GEMM variant that takes the packed storage, so that the packing can be reused across calls; that is another solution to the problem, with pretty impressive uplift when one of the matrix dimensions is small. (The real moral of the story here is that GEMMs of reasonable size are much too complex an operation to shoehorn into an operator; they deserve to be function calls, otherwise you'll doom most of your use sites to limited performance.)
FloatingPoint refines Numeric which refines ExpressibleByIntegerLiteral, so you can initialize any floating point type with an integer literal, but no actual type conversion is taking place.
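A small example of what that looks like in generic code (the function name here is hypothetical):

// Because FloatingPoint inherits ExpressibleByIntegerLiteral (via Numeric),
// the literal 1 below is built directly as a T; no Int-to-T conversion happens.
func ones<T: FloatingPoint>(count: Int) -> [T] {
    return Array(repeating: 1, count: count)
}

let d: [Double] = ones(count: 3)  // [1.0, 1.0, 1.0]
let f: [Float] = ones(count: 3)   // [1.0, 1.0, 1.0]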
Regarding the definition of the matrix type I demonstrated in the original post: I noticed that I defined the matrix operator functions in extensions, as shown below:
import Accelerate

struct Matrix<T> {
    let rows: Int
    let columns: Int
    var values: [T]

    init(_ content: [[T]]) {
        self.rows = content.count
        self.columns = content[0].count
        self.values = content.flatMap { $0 }
    }

    subscript(row: Int, column: Int) -> T {
        get { return values[(row * columns) + column] }
        set { values[(row * columns) + column] = newValue }
    }
}

extension Matrix {
    static func * (lhs: Matrix, rhs: Matrix) -> Matrix where T == Double {
        // ...
    }

    static func + (lhs: Matrix, rhs: Matrix) -> Matrix where T == Float {
        // ...
    }
}
I found that it's not necessary to define these functions in an extension; I can just define them as free (global) functions, as shown here:
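(A sketch of that free-function form, mirroring the Double and Float constraints from the extension above:)

func * (lhs: Matrix<Double>, rhs: Matrix<Double>) -> Matrix<Double> {
    // ...
}

func + (lhs: Matrix<Float>, rhs: Matrix<Float>) -> Matrix<Float> {
    // ...
}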
These functions don't actually extend the matrix type, so I see no reason to define them in an extension. Is it fine to leave them out of an extension, or am I missing something that requires them to go in one?
Both will work. Defining them as static methods in an extension is definitely the prevailing style, however.
(It wasn't possible to define operators as static methods until Swift 3; the global-function style was the only option. It had some significant downsides, so we added the static-method form. The evolution proposal for that change provides context and highlights the reasons why you should generally prefer the static-method style.)