For performance, nil or throws?

I need to perform integer arithmetic that signals on overflow. I know the ${operation}ReportingOverflow functions exist, but the boilerplate of checking the returned Boolean each time is annoying. Throwing an error or returning nil is much easier to work with, but which is better for performance? Is there even a difference?

Swift's errors aren't exceptions.

They're essentially just syntax sugar for returning a Result<Success, Failure> (not actually, but conceptually) and having the caller branch on the result.

It's not much different from doing your own nil-checking on a returned optional.

The only difference I can think of is that error throwing might involve a heap allocation (to box the thrown value in an existential container for the Error type), though I think the optimizer can fix that up for calls within the same module, where the thrown type is always the same.
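To make the "syntax sugar for Result" point concrete, here's a sketch (names are mine, and this is the conceptual model, not the actual ABI): a throwing function and a hand-rolled Result version leave the caller in essentially the same position, with one branch on the outcome.

```swift
enum OverflowError: Error { case overflow }

// A throwing overflow-checked add.
func addThrowing(_ a: Int, _ b: Int) throws -> Int {
    let (sum, didOverflow) = a.addingReportingOverflow(b)
    guard !didOverflow else { throw OverflowError.overflow }
    return sum
}

// Roughly what the above desugars to, conceptually.
func addResult(_ a: Int, _ b: Int) -> Result<Int, OverflowError> {
    let (sum, didOverflow) = a.addingReportingOverflow(b)
    return didOverflow ? .failure(.overflow) : .success(sum)
}

// Either way, the caller branches once on the outcome.
print((try? addThrowing(1, 2)) ?? -1) // 3
switch addResult(Int.max, 1) {
case .success(let n): print(n)
case .failure: print("overflow")      // this path is taken
}
```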

3 Likes

How are Swift's errors different from exceptions? They behave pretty much the same as Java's exceptions, for example.

From the Error Handling section in the Swift Programming Language book:

NOTE
Error handling in Swift resembles exception handling in other languages, with the use of the try, catch and throw keywords. Unlike exception handling in many languages—including Objective-C—error handling in Swift doesn’t involve unwinding the call stack, a process that can be computationally expensive. As such, the performance characteristics of a throw statement are comparable to those of a return statement.

10 Likes

Perhaps syntactically, but they're actually completely different at the implementation level. The implementation details then have some consequences for how exactly the feature works.

In Java, any arbitrary method can potentially throw an exception. Suppose you had:

int callee() {
    // throw new RuntimeException();
    return 123;
}

void caller() {
    var someResource = openSomeResource();
    try {
        var result = callee();
        System.out.println(result);
    } finally {
        someResource.close();
    }
}

In the happy path, nothing extra needs to be done by caller. It calls callee, gets the return result, and goes on to print, close the resource, and exit.

But on the off-chance that the callee code throws an exception, the program needs to know how to exit out of the caller stackframe, while preserving the semantics of the program. The tricky part is that finally clause, and making sure it's executed, even if callee() unexpectedly throws an exception.

To do this, the compiler needs to produce some code which explains how to gracefully exit out of caller() from any place where an exception might be thrown (i.e., every single method call). This code is set aside, and is never hit in the happy path. Only when an exception is actually thrown does it run. This process is called stack unwinding.

  • Pro: the happy path is fast
  • Pro: any callee code can raise an exception, and all will be fine
  • Con: the unhappy path is much slower
  • Con: the stack unwinding code causes code bloat, whether needed or not

In comparison, Swift's error-throwing mechanism is much like a return, with some implicit branching done by the caller. Because throws affects the calling convention (i.e. the caller needs to do something different when calling a throwing function), it's explicitly stated on a function using the keyword.

This is why you can only throw an error out of a function marked throws (if it wasn't marked throws and you were allowed to throw anyway, the caller wouldn't know to check for an error, and nothing would work).

  • Pro: the happy path is pretty fast (only one branch, with high predictability)
  • Pro: the unhappy path is equally fast (only one branch)
  • Pro: no stack unwinding code
  • Con: only throw-marked functions can throw, so you can't universally throw everywhere.
    • As a result, you also can't throw "through" non-throwing code. If your parent is non-throwing, then you must handle the error with try?, try!, or do-catch. You can't just re-throw it, because the parent can't handle it.
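A small sketch of that last point (the function names here are made up): inside a non-throwing function, the error has to be absorbed on the spot, because the caller's calling convention has no error slot to propagate into.

```swift
enum ParseError: Error { case notANumber }

func parse(_ s: String) throws -> Int {
    guard let n = Int(s) else { throw ParseError.notANumber }
    return n
}

// A non-throwing function: the error cannot propagate past this frame.
func parseOrZero(_ s: String) -> Int {
    // Option 1: convert the error to an optional.
    if let n = try? parse(s) { return n }
    // Option 2: handle it in a do-catch.
    do { return try parse(s) } catch { return 0 }
}

print(parseOrZero("41"))   // 41
print(parseOrZero("oops")) // 0
```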
4 Likes

Note, if you're going to store a get accessor, it's unfortunately better for performance to model that with a Result rather than a throwing closure.

I.e. use this:

struct WithResult<Property> {
  private let propertyResult: Result<Property, Error>

  var property: Property {
    get throws { try propertyResult.get() }
  }
}

instead of this:

struct WithGet<Property> {
  private let getProperty: () throws -> Property

  var property: Property {
    get throws { try getProperty() }
  }
}
1 Like

Interesting, this is the first I'm hearing of this. Got any context you can share?

I don't know if this is part of a larger pattern, but we found performance degraded quite a bit when changing our parsing library from returning optionals to throwing:

This is the PR. We haven't done too much research into why this happened.

If there is indeed a heap allocation that could be critical (and even a show stopper in some cases).

No need to do this after every operation. As with floating-point handling of NaNs, you can proceed with the operations, propagating the overflow flag, and only check it at the top level:

struct Number {
    let value: Int
    let overflow: Bool
    
    init(_ value: Int, overflow: Bool = false) {
        self.value = value
        self.overflow = overflow
    }
    init(_ v: (Int, Bool), overflow: Bool = false) {
        self.init(v.0, overflow: v.1 || overflow)
    }

    static func + (a: Number, b: Number) -> Number {
        Number(a.value.addingReportingOverflow(b.value), overflow: a.overflow || b.overflow)
    }
    static func * (a: Number, b: Number) -> Number {
        Number(a.value.multipliedReportingOverflow(by: b.value), overflow: a.overflow || b.overflow)
    }
    // etc
}

func someComputation(a: Number, b: Number) -> Number {
    a + b * a
}

func test() {
    let a = Number(1)
    let b = Number(2)
    let c = someComputation(a: a, b: b)
    print(c.value, c.overflow)
}

A minor benefit would be having partial results in case of overflow, and a bigger advantage would be somewhat cleaner syntax (no "!" nor "?" nor "try").

2 Likes

I think it was Joe Groff who told me to switch approaches, but it was years ago. I don't have a test designed for it but I'd find it interesting to maintain one—I'm open to being educated on that.

1 Like

The problem with this approach is that it does unnecessary work if more than one result overflows. I guess I’ll stick with throwing errors for now.

The earlier edit of that code did an early return, if you prefer that:

    static func + (a: Number, b: Number) -> Number {
        if a.overflow { return a }
        if b.overflow { return b }
        return Number(a.value.addingReportingOverflow(b.value))
    }

I just thought the "unhappy" path could be left non-optimised.

I would assume the version with two ifs would perform worse in all cases. Branches are in general much more costly than arithmetic and logical operations because of pipelining in the processor (a mispredicted branch flushes the pipeline). It's a bit counterintuitive, but often it's more efficient to do more work if it allows you to skip branches.

Which means it should be more performant to not throw and also not return nil. Throwing will create implicit branches to handle thrown errors, and returning nil forces you to write explicit branches to handle the nil case. In some cases the optimizer will be able to remove branches, but often it won't.

This brings me back to the version with a.overflow || b.overflow. The short-circuiting behavior of || creates a branch here. I assume the optimizer is easily able to remove it in the (Bool, Bool) case, but if it wasn't it would be more performant to store the flag in an integer and use | which is a purely logical operation.

I'd think carrying an overflow flag and not short-circuiting calculations with branches is the best approach. If the calculation is really long and likely enough to overflow that you can spare time for branches, it might be worth adding one or two branches somewhere in the middle to stop it early when the overflow flag is set. But each branch has a cost, and branching at every arithmetic operation is way overkill and will surely worsen performance.

Finally, branches can introduce timing attacks in crypto systems or more generally in code protecting secrets. If that's what's being written here, avoiding branches is a very good idea.


Note: I'm just spitting out theory here. I don't have much experience with all this. Always benchmark.

1 Like

Was exactly my thought, actually. I was about to introduce a non-short-circuiting "|" operation working on Bools.

BTW, is it possible to go from Bool to Int and from Int to Bool without branches?
hmm, table lookup can be used going from Bool to Int (sic!) :slight_smile: but that doesn't seem quick.

found it :-)

extension Bool {
    static func | (a: Bool, b: Bool) -> Bool {
        let a = unsafeBitCast(a, to: Int8.self)
        let b = unsafeBitCast(b, to: Int8.self)
        return unsafeBitCast(a | b, to: Bool.self)
    }
    static func & (a: Bool, b: Bool) -> Bool {
        let a = unsafeBitCast(a, to: Int8.self)
        let b = unsafeBitCast(b, to: Int8.self)
        return unsafeBitCast(a & b, to: Bool.self)
    }
}

prettified:

extension Bool {
    var int8Value: Int8 {
        unsafeBitCast(self, to: Int8.self)
    }
    init(int8Value: Int8) {
        self = unsafeBitCast(int8Value, to: Bool.self)
    }

    static func | (a: Bool, b: Bool) -> Bool {
        Bool(int8Value: a.int8Value | b.int8Value)
    }
    static func & (a: Bool, b: Bool) -> Bool {
        Bool(int8Value: a.int8Value & b.int8Value)
    }
}
The whole thing (untested).
// Bool+Extensions.swift

extension Bool {
    var int8Value: Int8 {
        unsafeBitCast(self, to: Int8.self)
    }
    init(int8Value: Int8) {
        self = unsafeBitCast(int8Value, to: Bool.self)
    }

    static func | (a: Bool, b: Bool) -> Bool {
        Bool(int8Value: a.int8Value | b.int8Value)
    }
    static func & (a: Bool, b: Bool) -> Bool {
        Bool(int8Value: a.int8Value & b.int8Value)
    }
}

// Number.swift

struct Number<T: FixedWidthInteger> {
    let value: T
    let overflow: Bool
    
    init(_ value: T, overflow: Bool = false) {
        self.value = value
        self.overflow = overflow
    }
    init(_ v: (T, Bool), overflow: Bool = false) {
        self.init(v.0, overflow: v.1 | overflow)
    }
    
    static prefix func + (a: Number) -> Number {
        a
    }
    
    static prefix func - (a: Number) -> Number {
        Number(0) - a
    }
    
    static func + (a: Number, b: Number) -> Number {
        Number(a.value.addingReportingOverflow(b.value), overflow: a.overflow | b.overflow)
    }
    static func - (a: Number, b: Number) -> Number {
        Number(a.value.subtractingReportingOverflow(b.value), overflow: a.overflow | b.overflow)
    }
    static func * (a: Number, b: Number) -> Number {
        Number(a.value.multipliedReportingOverflow(by: b.value), overflow: a.overflow | b.overflow)
    }
    static func / (a: Number, b: Number) -> Number {
        Number(a.value.dividedReportingOverflow(by: b.value), overflow: a.overflow | b.overflow)
    }
    static func % (a: Number, b: Number) -> Number {
        Number(a.value.remainderReportingOverflow(dividingBy: b.value), overflow: a.overflow | b.overflow)
    }
    
    static func += (a: inout Number, b: Number) {
        a = a + b
    }
    static func -= (a: inout Number, b: Number) {
        a = a - b
    }
    static func *= (a: inout Number, b: Number) {
        a = a * b
    }
    static func /= (a: inout Number, b: Number) {
        a = a / b
    }
    static func %= (a: inout Number, b: Number) {
        a = a % b
    }
}

// Test.swift

func someComputation(a: Number<Int>, b: Number<Int>) -> Number<Int> {
    a + b * a
}

func test() {
    let a = Number(1)
    let b = Number(2)
    let c = someComputation(a: a, b: b)
    print(c.value, c.overflow)
}
1 Like

This is what I'd do:

let bool = int != 0 // a comparison, not a branch
let int = bool ? 1 : 0 // trivial to optimize the branch away

Personally I'd be wary of unsafeBitCast. I worry it'll remove other optimization opportunities in the surrounding context. I'm also pretty sure having two already-evaluated boolean values should make the branch easily replaced by a simple logical operation in the optimization pass.

Ideally you'd implement each approach and measure. And maybe look at the generated assembly to check for hidden branches and other differences. We're just speculating otherwise.
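For inspecting the generated code, one option (a sketch; `Conversions.swift` is a placeholder filename) is to ask the compiler directly for optimized assembly, or paste the snippet into godbolt.org with `-O`:

```shell
# Emit optimized assembly for a standalone Swift file.
swiftc -O -emit-assembly Conversions.swift -o Conversions.s
```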

1 Like

FWIW, this is the asm I'm getting:

extension Bool {
    var int8Value_viaCast: Int8 {
        unsafeBitCast(self, to: Int8.self)
    }
//  mov     eax, edi
//  and     al, 1
//  ret

    var int8Value_viaEq: Int8 {
        self ? 1 : 0
    }
//  mov     eax, edi
//  and     al, 1
//  ret

    init(int8Value_viaCast v: Int8) {
        self = unsafeBitCast(v, to: Bool.self)
    }
//  mov     eax, edi
//  and     al, 1
//  ret

    init(int8Value_viaEq v: Int8) {
        self = v != 0
    }
//  test    dil, dil
//  setne   al
//  ret
}

Interestingly, the bit-cast versions produce identical asm going from int to bool and from bool to int :slight_smile:

The "test + setne" pair: I'm not too familiar with asm. I guess it's very fast, but just curious, is it slightly slower than "and"? Obviously we are splitting hairs at this point :slight_smile:

So I wrote some test functions in an XCTestCase class (not sure if this is the correct way to benchmark) for adding two rational numbers.

func testNilOnOverflowPerformance() throws {
    self.measure {
        repeatOnRandomPairs(count: 100_000) { x, y in
            guard let a = multiplyNilOnOverflow(x.numerator, y.denominator),
                  let b = multiplyNilOnOverflow(x.denominator, y.numerator),
                  let n = addNilOnOverflow(a, b),
                  let d = multiplyNilOnOverflow(x.denominator, y.denominator)
            else { return }
            _ = Rational(n, d)
        }
    }
}

func testThrowingOnOverflowPerformance() throws {
    self.measure {
        repeatOnRandomPairs(count: 100_000) { x, y in
            do {
                let a = try multiplyThrowingOnOverflow(x.numerator, y.denominator)
                let b = try multiplyThrowingOnOverflow(x.denominator, y.numerator)
                let n = try addThrowingOnOverflow(a, b)
                let d = try multiplyThrowingOnOverflow(x.denominator, y.denominator)
                _ = Rational(n, d)
            } catch {
                return
            }
        }
    }
}

func testReportingOnOverflowPerformance() throws {
    self.measure {
        repeatOnRandomPairs(count: 100_000) { x, y in
            let (a, o1) = x.numerator.multipliedReportingOverflow(by: y.denominator)
            let (b, o2) = x.denominator.multipliedReportingOverflow(by: y.numerator)
            let (n, o3) = a.addingReportingOverflow(b)
            let (d, o4) = x.denominator.multipliedReportingOverflow(by: y.denominator)
            guard !o1 && !o2 && !o3 && !o4 else { return }
            _ = Rational(n, d)
        }
    }
}

The nil approach averaged 1.057 seconds, throws averaged 1.678 seconds, and the regular old ${operation}ReportingOverflow averaged 1.068 seconds. Seems nil works best, but I need someone to confirm my tests are correct.

Excuse me if I'm stating the obvious, but are you sure you're running your test suite in release mode with performance optimizations enabled?

Are optimizations enabled in SwiftPM test targets?

No. They're meant to compile fast for a quicker development feedback loop.
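For the record, SwiftPM can run tests with optimizations if asked explicitly (these are standard swift-package-manager flags; benchmark numbers from debug builds are mostly meaningless):

```shell
# Run the test suite in release configuration.
swift test -c release

# If the tests rely on @testable imports of internal symbols,
# keep testability enabled in the optimized build:
swift test -c release -Xswiftc -enable-testing
```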