Motivation
Expressions of the form `a*b - c*d` or `a*b + c*d` are extremely common and notorious sources of cancellation error in floating-point arithmetic. Some well-known examples include 2x2 matrix determinants and the discriminant term in the quadratic formula. It's quite easy to construct ill-conditioned examples (for determinants, when the rows/columns are nearly collinear; for discriminants, when the roots of the quadratic are close together) for which essentially all accuracy is lost if a naive computation via two multiplies and an add, or a multiply and an FMA, is used.
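To make the cancellation concrete, here is a small illustrative example (mine, not from Kahan's note): a 2x2 determinant whose naive evaluation returns exactly zero even though the true value is nonzero.

```swift
// Determinant of [[a, b], [c, d]] with a*d and b*c nearly equal.
let a = 1.0 + 0x1p-27
let b = 1.0
let c = 1.0
let d = 1.0 - 0x1p-27

// Exactly, a*d - b*c = (1 + 2^-27)(1 - 2^-27) - 1 = -2^-54.
// But the exact product a*d = 1 - 2^-54 lies exactly halfway between
// 1.0 and the next-smaller Double, and ties round to even, i.e. to 1.0.
// So the naive determinant is exactly zero: every significant bit cancels.
let naive = a*d - b*c
print(naive)       // 0.0, while the true value is about -5.55e-17
```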
Kahan's lecture note goes into excessive detail on these motivating examples, for anyone interested in a more complete treatment of the problem.
Proposed Solution
There is a simple algorithm from Kahan that eliminates the problem entirely with almost zero performance cost. Kahan was particularly interested in 2x2 determinants for the purposes of finding singular values, though he wrote his note in terms of quadratics, but the algorithm is more generally useful; Matt Pharr calls it "difference of products," which I find to be a pretty good name (even though it's also applicable to sums of products). In pseudocode, the algorithm is as follows:
```swift
// given a, b, c, d, compute ab - cd
let (head, tail) = Augmented.product(-c, d)
let result = head.addingProduct(a, b) + tail
```
Instead of two multiplies and an add, this requires two FMAs and a multiply, so the performance impact of adopting it is negligible on modern targets, and it completely solves the problem; instead of essentially arbitrary relative error, this achieves an error bound of 1.5 ulp for binary floating-point types.¹
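Spelled out against the `FloatingPoint` protocol, the algorithm looks like this (my own sketch for illustration, not the code in the PR; `addingProduct(_:_:)` is Swift's fused multiply-add):

```swift
// Kahan's algorithm for a*b - c*d, written with explicit FMAs.
// The first FMA recovers the exact rounding error of the product c*d;
// the second forms a*b - fl(c*d) with a single rounding.
func differenceOfProducts<T: FloatingPoint>(_ a: T, _ b: T, _ c: T, _ d: T) -> T {
    let cd = c * d
    let err = cd.addingProduct(-c, d)     // err = fl(c*d) - c*d, exact
    let dop = (-cd).addingProduct(a, b)   // dop = fl(a*b - fl(c*d))
    return dop + err
}

// The near-collinear determinant that defeats the naive two-multiply
// formula now comes out exactly right:
let x = 1.0 + 0x1p-27
let good = differenceOfProducts(x, 1.0 - 0x1p-27, 1.0, 1.0)
print(good)   // -2^-54, the exact answer; the naive formula returns 0.0
```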
I propose to make this algorithm available in the Augmented enum of RealModule; there is a rough sketch of the implementation in apple/swift-numerics PR #329 ("Rough sketch of fast difference-of-products"). As a strawman API for development purposes, I have bound it as:
```swift
extension Augmented {
  /// a*b - c*d
  static func fastDifferenceOfProducts<T: FloatingPoint>(
    _ a: T, _ b: T, _ c: T, _ d: T
  ) -> T
}
```
Alternatives
- Naming: I generally like Matt Pharr's name vs "determinant" or similar, because it suggests that it can be used for any difference of products, but I'm open to other suggestions. In particular, "difference of products" suggests the argument ordering (a, b, c, d) for (ab - cd), but a "determinant" name might suggest (ad - bc), and other names would suggest something else. This is primarily of use as a building block for higher-level numerical algorithms, so I do not expect "normal" users to be faced with it often, but it's still worth getting right.
- Naming (pt 2): Why "fast difference of products" instead of just "difference of products"? I have in mind leaving space for a more expensive sub-ulp accurate (or correctly-rounded) implementation, or an implementation that produces both head and tail. But maybe those would best be placed under a different name, and this should have the shorter name.
- This algorithm does not handle overflow/underflow. That's OK; neither does `Augmented.product`. But it's worth keeping in mind with regard to API design.
- Should there also be an API for `ab + cd` (or should that be the API)? There isn't one for head-tail `a - b`; users are expected to write `sum(a, -b)`, but maybe that should be revisited.
- Maybe `Augmented` should be reserved for algorithms that have head-tail results or parameters, and algorithms like this that merely use augmented arithmetic belong elsewhere.
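On the `ab + cd` question: with the (a, b, c, d) → ab - cd convention, a sum of products is already expressible by negating one factor, analogous to writing `sum(a, -b)` today. A quick sketch (using a local stand-in for the pitched function, since the name is still a strawman):

```swift
// Local stand-in for the proposed fastDifferenceOfProducts, so this
// snippet is self-contained; the real binding would live on Augmented.
func dop<T: FloatingPoint>(_ a: T, _ b: T, _ c: T, _ d: T) -> T {
    let cd = c * d
    let err = cd.addingProduct(-c, d)
    return (-cd).addingProduct(a, b) + err
}

// a*b + c*d is just a*b - (-c)*d, so the difference form covers it
// at the cost of making the caller flip a sign:
func sop<T: FloatingPoint>(_ a: T, _ b: T, _ c: T, _ d: T) -> T {
    dop(a, b, -c, d)
}

print(sop(3.0, 4.0, 5.0, 6.0))   // 3*4 + 5*6 = 42.0
```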
¹ This is not obvious, and for a decade or so after publication people only remarked that the error was surprisingly good most of the time. Claude-Pierre Jeannerod, Nicolas Louvet, and Jean-Michel Muller published the clearest proof that I'm aware of.