Balanced binary reduction
By: Justin Meiners
Date: June 16, 2021
Introduction
Swift's reduce
can be used to accumulate elements in a sequence.
For example:
[1.0, 2.0, 3.0, 4.0].reduce(0.0, +)
If we visualize the expression tree for this operation, it's linear.
You can think of it as a binary tree which is completely unbalanced.
3.06.010.0
/ / / /
1.0 2.0 3.0 4.0
Assuming the binary operation is associative then a program could choose which
order to evaluate arguments in.
For example, we can image a function balancedReduce
which rearranges
the order of evaluation to the following:
10.0
/ \
/ \
3.0 7.0
/ \ / \
1.0 2.0 3.0 4.0
We haven't changed the result at all, just the order it's evaluated in.
Even though the two forms are mathematically equivalent,
there are good computational reasons why one might want to do this.
Here are just a few to illustrate it's general applicability:

Adding up many floating point numbers (as in this example).
Adding small magnitudes to large magnitudes introduces numerical error.
Given a long enough list of values of similar magnitudes
this is guaranteed to happen.
(Accelerate has a functionvDSP.sum
inpart
to address this problem.)A balanced evaluation of such values will
tend to add similarly sized values. (Intermediate results which are
the result of adding equal numbers of values together). 
Merge sort can be written by applying the binary operation
of "merge sorted list" to a sequence of lists:[ [a1], [a2], ... [an] ]
If you aply this operation using
reduce
it is essentially insertion sort,
one element is inserted into the growing list at a time.If you use a
balancedReduce
it will only merge lists
that are roughly the same size, giving a proper
O(n log(n))
sort. 
It can be used to solve the problem of finding the smallest
element, and second smallest element, in a sequence
in the optimal number of comparisons.
(Described in 5.3.3 of Volume 3 in "Art of Computer Programming").
Proposed implementation
This capability can be added with an extension to Sequence
:
extension Sequence {
func balancedReduce(
_ associativeOperation: (Element, Element) > Element
) > Element? {
var counter: [Element?] = []
// counter will never be larger than 64
counter.reserveCapacity(20)
for x in self {
if let carry = BinaryCounter.add(x,
to: &counter,
with: associativeOperation) {
counter.append(carry)
}
}
return BinaryCounter.reduce(counter: &counter,
with: associativeOperation)
}
}
This relies on two algorithms:
struct BinaryCounter {
private init() {}
static func add<C: MutableCollection, T>(
_ x: T,
to counter: inout C,
with op: (T, T) > T
) > T? where C.Element == T? {
var i = counter.startIndex
let end = counter.endIndex
var carry = x
while i != end {
if let y = counter[i] {
// must be in this order since op is not commutative.
// things further to the end of the counter were inserted earlier
carry = op(y, carry)
counter[i] = nil
i = counter.index(after: i)
} else {
counter[i] = carry
return nil
}
}
return carry
}
static func reduce<C: MutableCollection, T>(
counter: inout C,
with op: (T, T) > T
) > T? where C.Element == T? {
var i = counter.startIndex
let end = counter.endIndex
// find first nonzero
while i != end && counter[i] == nil {
i = counter.index(after: i)
}
if i == end {
return nil
}
var x = counter[i]!
i = counter.index(after: i)
while i != end {
if let y = counter[i] {
// must be in this order since op is not commutative
// things further to the end of the counter were inserted earlier
x = op(y, x)
}
i = counter.index(after: i)
}
return x
}
}
Notice that this transformation is very efficient.
It does require additional memory for the counter,
but its size is roughly log(n)
which qualifies as inplace.
Example usage
Adding up numbers:
[1.0, 2.0, 3.0, 5.0].balancedReduce(+)
Merge sort:
func mergeSorted<T>(
_ a: [T],
_ b: [T]
) > [T] where T: Comparable {
var i = a.startIndex
var j = b.startIndex
var out: [T] = []
while i != a.endIndex && j != b.endIndex {
if b[j] < a[i] {
out.append(b[j])
j = a.index(after: j)
} else {
out.append(a[i])
i = a.index(after: i)
}
}
if j != b.endIndex {
out.append(contentsOf: b.suffix(from: j))
} else if i != a.endIndex {
out.append(contentsOf: a.suffix(from: i))
}
return out
}
[1.0, 2.0, 3.0, 5.0].map({ $0 }).balancedReduce(mergeSorted)
References
This algorithm can be found in Alex Stepanov's (creator of C++ STL) book "Elements of Programming"
chapter 11.2.
The Haskell package Tree Fold implements it.
There is a related notion in C++.
std::accumulate
which does a traditional left fold,
std::reduce
is allowed to associate its arguments.
For technical reasons, it additionally requires the operation is commutative,
which I do not think is necessary for this.
It appears std::reduce
is primarily to facilitate parallel
computation, but it demonstrates two similar operations
whose results only differ on their association order.
Questions/Concerns
If this was such a need shouldn't this be a popular library?
It's one of those things that solves problems if you have it available,
but most probably aren't going to look for it, due to awareness of the issues.
As far as I know, there is no facilities in swift to add large lists
of floating point numbers in a stable way (besides sorting).
To implement it oneself would require reimplementing something similar.
(iOS and mac have vDSP.sum
in Accelerate).
It's too theoretical
All of the theory is hidden inside.
It's just one function which works like fold
/reduce
.
It's hard to misuse, and arguably produces
better and less suprising results
than reduce
for most operations.
Algebraic requirements like associativity on operations are used throughout the stdlib,
for example sort
relies on the comparison being a StrictWeakOrdering
.
The name is bad
Looking for recommendations.
reduce
may be incorrect because reduce allows you do specify a different
accumulation type than the elements in the sequence.
In this algorithm the operation needs to take two inputs
of the same type, so it really is a fold
, but fold
is
probably not a friendly term.
Perhaps accumulate
?
Could binary counter be a struct?
A struct couldn't have operation
as a member, unless it was escaping.
So, I don't think it would help.
Does it need to store optionals?
It's possible to instead provide a zero
parameter
which indicates a slot in the counter is available.
Conceptually this should be a value where op(x, zero) = x
for all
x
, although it's not required.
I chose nil
because the zero parameter is difficult to explain,
and is confusing is because it wouldn't be the same as reduce
's initialResult
parameter.