For performance, nil or throws?

For good measure, you should also test calling your repeatOnRandomPairs function without doing anything with the result. If you call random in there it's likely to be more costly than the couple of arithmetic operations you're performing inside the closure.
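
For example, a baseline run with a trivial closure shows how much of the time is the harness itself rather than the overflow handling (this uses the names from the standalone harness posted later in this thread, purely as an illustration):

let baseline = measure {
    repeatOnRandomPairs(count: 100_000) { x, y in
        // No overflow handling at all: just enough work that the
        // optimiser can't throw the whole loop away.
        x.numerator &+ y.denominator
    }
}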

Also, note that guard (or catch) creates at least one branch. If that's representative of what you'll be doing then it's fine, but keep in mind that it's also something that will influence your measurement.

Yes, I'm trying to check for overflow before initializing an invalid rational value.

I'll just point out that depending on what you plan to do with that rational value later, you could also initialize it regardless and carry the overflow flag with it. And thus eliminate that branch.
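
A rough sketch of what that could look like (the FlaggedRational type and its fields here are made up for illustration, not anyone's actual code):

struct FlaggedRational {
    let numerator: Int
    let denominator: Int
    let overflowed: Bool

    // Multiply unconditionally and carry the flag along; nobody has to
    // branch until (and unless) the flag is actually inspected.
    static func * (x: FlaggedRational, y: FlaggedRational) -> FlaggedRational {
        let (n, o1) = x.numerator.multipliedReportingOverflow(by: y.numerator)
        let (d, o2) = x.denominator.multipliedReportingOverflow(by: y.denominator)
        return FlaggedRational(numerator: n, denominator: d,
                               overflowed: x.overflowed || y.overflowed || o1 || o2)
    }
}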

So, I changed from debug to release mode, and the results seem consistent with what I found. nil just barely takes less time than reportingOnOverflow, and throws takes more time than both.


That would work if it weren't for the fact that I assume the denominator is always positive for certain operations, and overflow in * breaks that assumption.
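
For instance, the wrapped partial value can flip sign, which is exactly what breaks that invariant:

let (d, overflow) = Int.max.multipliedReportingOverflow(by: 2)
// overflow == true and d == -2: the "denominator" is suddenly negative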

I don't see repeatOnRandomPairs & Rational.

So I created my own repeatOnRandomPairs that first populates a table of random numbers outside of the timing loop, and a basic Rational that does nothing. I changed the code a little bit so the release build doesn't optimise it away (basically accumulating Rational.numerator into a variable). Here's what I'm getting, repeating the test 10_000 times, so 10_000 x 100_000 iterations in total (release build):

testNilOnOverflowPerformance: 2.6856 s
testThrowingOnOverflowPerformance: 1.4778 s
testReportingOnOverflowPerformance: 1.1672 s // *** but see below (1)
The test
import Foundation

func measure(execute: () -> Int) -> Int {
    let start = CFAbsoluteTimeGetCurrent()
    let r = execute()
    let elapsed = CFAbsoluteTimeGetCurrent() - start
    print("elapsed: \(elapsed)")
    return r
}

func multiplyNilOnOverflow(_ x: Int, _ y: Int) -> Int? {
    let (v, overflow) = x.multipliedReportingOverflow(by: y)
    if overflow { return nil }
    return v
}

func addNilOnOverflow(_ x: Int, _ y: Int) -> Int? {
    let (v, overflow) = x.addingReportingOverflow(y)
    if overflow { return nil }
    return v
}

let err = NSError(domain: "", code: -1, userInfo: nil)

func multiplyThrowingOnOverflow(_ x: Int, _ y: Int) throws -> Int {
    let (v, overflow) = x.multipliedReportingOverflow(by: y)
    if overflow { throw err }
    return v
}

func addThrowingOnOverflow(_ x: Int, _ y: Int) throws -> Int {
    let (v, overflow) = x.addingReportingOverflow(y)
    if overflow { throw err }
    return v
}

func testNilOnOverflowPerformance() -> Int {
    measure {
        repeatOnRandomPairs(count: 100_000) { x, y in
            guard let a = multiplyNilOnOverflow(x.numerator, y.denominator),
                  let b = multiplyNilOnOverflow(x.denominator, y.numerator),
                  let n = addNilOnOverflow(a, b),
                  let d = multiplyNilOnOverflow(x.denominator, y.denominator)
            else {
                fatalError()
            }
            let r = Rational(n, d)
            return r.numerator
        }
    }
}

func testThrowingOnOverflowPerformance() throws -> Int {
    measure {
        repeatOnRandomPairs(count: 100_000) { x, y in
            do {
                let a = try multiplyThrowingOnOverflow(x.numerator, y.denominator)
                let b = try multiplyThrowingOnOverflow(x.denominator, y.numerator)
                let n = try addThrowingOnOverflow(a, b)
                let d = try multiplyThrowingOnOverflow(x.denominator, y.denominator)
                let r = Rational(n, d)
                return r.numerator
            } catch {
                fatalError()
            }
        }
    }
}

func testReportingOnOverflowPerformance() -> Int {
    measure {
        repeatOnRandomPairs(count: 100_000) { x, y in
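            // o1...o4 are intentionally unused here; see note (1) below.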
            let (a, o1) = x.numerator.multipliedReportingOverflow(by: y.denominator)
            let (b, o2) = x.denominator.multipliedReportingOverflow(by: y.numerator)
            let (n, o3) = a.addingReportingOverflow(b)
            let (d, o4) = x.denominator.multipliedReportingOverflow(by: y.denominator)
            let r = Rational(n, d)
            return r.numerator
        }
    }
}

struct Num {
    let numerator: Int
    let denominator: Int
}

struct Rational {
    let numerator: Int
    let denominator: Int
    
    init(_ numerator: Int, _ denominator: Int) {
        self.numerator = numerator
        self.denominator = denominator
    }
}

func initPairs() {
    let m = 100
    let M = 10000
    pairs = (0 ..< 100_000).map { _ in
        (a: Num(numerator: Int.random(in: m ..< M), denominator: Int.random(in: m ..< M)),
         b: Num(numerator: Int.random(in: m ..< M), denominator: Int.random(in: m ..< M)))
    }
}

func repeatOnRandomPairs(count: Int, execute: (_ x: Num, _ y: Num) -> Int) -> Int {
    var total = 0
    
    for _ in 0 ..< 10000 {
        for i in 0 ..< count {
            let p = pairs[i]
            total &+= execute(p.a, p.b)
        }
    }
    return total
}

var pairs: [(a: Num, b: Num)] = []

initPairs()

let a = testNilOnOverflowPerformance()
let b = try! testThrowingOnOverflowPerformance()
let c = testReportingOnOverflowPerformance() // *** but see below (1)
print()

Note, this is a standalone test, to be used outside XCTest machinery.
Also note that in testReportingOnOverflowPerformance the "overflow" is not currently accumulated (if I did accumulate it, I'd use the above-mentioned non-short-circuiting `or` operator defined on Bool).
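
Something along these lines, for the sake of the example (my own sketch here; the operator discussed earlier in the thread may differ):

extension Bool {
    // Eager `or`: both operands are ordinary parameters (no @autoclosure),
    // so both sides are always evaluated and there is no short-circuit
    // branch at the call site.
    static func | (lhs: Bool, rhs: Bool) -> Bool {
        let l: Int8 = lhs ? 1 : 0
        let r: Int8 = rhs ? 1 : 0
        return (l | r) != 0
    }
}

// usage: let anyOverflow = o1 | o2 | o3 | o4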

I'd second that. It can either be carried in the individual "Rational" numerator/denominator components (if they are structs, like the "Num" mentioned above) or be in the "Rational" struct itself. (2)

(1) As it doesn't currently report anything, it should be treated as the "baseline" timing: what happens if you do neither optional nor throwing reporting.

(2) However, compared to floating-point's built-in NaN handling, such an additional field increases the structure size, possibly by 8 bytes due to alignment... Be careful here if you store many of those, like thousands. There might be some creative ways to avoid that overhead, but that would depend on what exactly you are doing (e.g. one creative way would be to reduce the Int range by one value, say .min, and designate it as the NaN analogue; this may be OK for some algorithms, just not in general).
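
A minimal sketch of that last idea (names are illustrative, not from the thread's actual code):

struct CompactRational {
    // Int.min is given up as a regular numerator and reserved as the
    // "NaN" marker, so no extra Bool field (and no padding) is needed.
    let numerator: Int
    let denominator: Int

    static let invalid = CompactRational(unchecked: .min, 1)

    var isInvalid: Bool { numerator == .min }

    private init(unchecked numerator: Int, _ denominator: Int) {
        self.numerator = numerator
        self.denominator = denominator
    }

    init(_ numerator: Int, _ denominator: Int) {
        // .min is reserved, so an incoming .min is treated as invalid too.
        if numerator == .min {
            self = .invalid
        } else {
            self.numerator = numerator
            self.denominator = denominator
        }
    }
}

// MemoryLayout<CompactRational>.size == 16; adding a Bool field instead
// would make it 17 bytes (stride 24 due to alignment).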

Somewhat unexpectedly, if I do add overflow processing in the last test, it is either about the same speed as the others (with the normal short-circuiting Bool ||) or even somewhat slower with the non-short-circuiting Bool `or` variants (doesn't matter which of the two). @michelf, any idea why this is the case? Or am I doing something wrong?

The changes
func testReportingOnOverflowPerformance() -> Int {
    let one = Rational(1, 1)
    return measure {
        repeatOnRandomPairs(count: 100_000) { x, y in
            let (a, o1) = x.numerator.multipliedReportingOverflow(by: y.denominator)
            let (b, o2) = x.denominator.multipliedReportingOverflow(by: y.numerator)
            let (n, o3) = a.addingReportingOverflow(b)
            let (d, o4) = x.denominator.multipliedReportingOverflow(by: y.denominator)
            
            // MARK: non-short-circuiting Bool `or` via cast
            // let r = Bool(int8Value_viaCast: o1.int8Value_viaCast | o2.int8Value_viaCast | o3.int8Value_viaCast | o4.int8Value_viaCast) ? one : Rational(n, d)
            
            // MARK: non-short-circuiting Bool `or` via EQ
            // let r = Bool(int8Value_viaEq: o1.int8Value_viaEq | o2.int8Value_viaEq | o3.int8Value_viaEq | o4.int8Value_viaEq) ? one : Rational(n, d)

            // MARK: normal (short-circuiting) Bool `or`
            let r = (o1 || o2 || o3 || o4) ? one : Rational(n, d)

            return r.numerator
        }
    }
}

extension Bool {
    var int8Value_viaCast: Int8 {
        unsafeBitCast(self, to: Int8.self)
    }
    var int8Value_viaEq: Int8 {
        self ? 1 : 0
    }
    init(int8Value_viaCast v: Int8) {
        self = unsafeBitCast(v, to: Bool.self)
    }
    init(int8Value_viaEq v: Int8) {
        self = v != 0
    }
}

Note: since I am running all the tests on the same numbers, and the two other tests have a fatalError on the overflow path, no overflow is actually occurring.

I wouldn't be too surprised if the optimizer was able to produce the exact same code for the nil variant and the overflow-flag variant. Maybe checking the assembly would be a good idea.

Ultimately, Optional<Int> is an integer with a flag attached to it, pretty much the same thing as an integer with an overflow flag. The integer part of Optional<Int> can be anything (garbage) when the flag is set to nil, and the same can be said of the overflow-flag variant if the compiler can observe that the value is discarded whenever the overflow flag is set. So, seen at a lower level, it's the same thing, and the optimiser might choose to elide a branch if the tradeoff is one possibly unused arithmetic operation per branch; a very good tradeoff.
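
To check, you could compile something like the following with `swiftc -O -emit-assembly` (or `-emit-sil`) and diff the output for the two functions; a sketch, with names chosen only for the comparison:

@inline(never)
func viaOptional(_ x: Int, _ y: Int) -> Int? {
    let (v, overflow) = x.multipliedReportingOverflow(by: y)
    return overflow ? nil : v
}

@inline(never)
func viaFlag(_ x: Int, _ y: Int) -> (value: Int, overflow: Bool) {
    // Keep the (possibly garbage) partial value and let the caller decide.
    let (v, overflow) = x.multipliedReportingOverflow(by: y)
    return (v, overflow)
}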

The question would be whether this optimisation can carry over later, when used in a bigger function with more calculations, loops, and across non-inlined function call boundaries. It might be best to assess the performance of each approach within a bigger system rather than in a microbenchmark.