Missed optimization opportunity?

Jens · May 31, 2018, 8:06am

The above (and everything having to do with these issues) are too complex to communicate effectively.

So I'll try to simplify the current situation by starting from scratch, and only focus on:

the .nextDown case, and
the most recent Development Snapshot.

So make sure you are using Development Snapshot 2018-05-30 and follow these exact steps, because as you'll see, every little detail matters.

Step 1

This program will be fast (around 4e-08 seconds for me):

import AppKit

func test() {
    for _ in 0 ..< 5 {
        let a = Double(1)
        var checksum = UInt64(0)
        let t0 = CACurrentMediaTime()
        for _ in 0 ..< 1_000_000_000 {
            let v = a.nextDown
            checksum = checksum &+ v.bitPattern
        }
        let t1 = CACurrentMediaTime()
        print(t1 - t0, "seconds, checksum:", checksum)
    }
}
test()

Step 2

Now, with the following modification, the program will still be fast (4e-08 s), which might lead us to believe that the issue has been fixed, even for the top level case! (remember that we are only talking about the most recent snapshot here).

import AppKit

// func test() {
    for _ in 0 ..< 5 {
        let a = Double(1)
        var checksum = UInt64(0)
        let t0 = CACurrentMediaTime()
        for _ in 0 ..< 1_000_000_000 {
            let v = a.nextDown
            checksum = checksum &+ v.bitPattern
        }
        let t1 = CACurrentMediaTime()
        print(t1 - t0, "seconds, checksum:", checksum)
    }
// }
// test()

Step 3

But, if we move let a = Double(1) out of the (outer, trials-) for-in loop, then it will be millions of times slower (0.08 seconds for me).

import AppKit

// func test() {
    let a = Double(1) // <-- moved this 
    for _ in 0 ..< 5 {
        var checksum = UInt64(0)
        let t0 = CACurrentMediaTime()
        for _ in 0 ..< 1_000_000_000 {
            let v = a.nextDown
            checksum = checksum &+ v.bitPattern
        }
        let t1 = CACurrentMediaTime()
        print(t1 - t0, "seconds, checksum:", checksum)
    }
// }
// test()

Step 4

If we now enclose it in the func again, it will be fast (4e-08 s).

import AppKit

func test() {
    let a = Double(1) // <-- this is still here, but moving it back inside the for-in loop doesn't matter when enclosed in func.
    for _ in 0 ..< 5 {
        var checksum = UInt64(0)
        let t0 = CACurrentMediaTime()
        for _ in 0 ..< 1_000_000_000 {
            let v = a.nextDown
            checksum = checksum &+ v.bitPattern
        }
        let t1 = CACurrentMediaTime()
        print(t1 - t0, "seconds, checksum:", checksum)
    }
}
test()

Step 5

However: If we move let a = Double(1) to top level, it will be millions of times slower again (0.08 s)!

import AppKit

let a = Double(1) // <-- constant moved to top level

func test() {
    for _ in 0 ..< 5 {
        var checksum = UInt64(0)
        let t0 = CACurrentMediaTime()
        for _ in 0 ..< 1_000_000_000 {
            let v = a.nextDown
            checksum = checksum &+ v.bitPattern
        }
        let t1 = CACurrentMediaTime()
        print(t1 - t0, "seconds, checksum:", checksum)
    }
}
test()

And it doesn't matter if a is declared as a static constant within a type, ie having it like this:

struct S {
    static let a = Double(1)
}

will still be slow (0.08 s).

This demo (Step 5), especially with a static constant in a type like that, is representative of situations that are very common in real code. Yes, the billion iteration loop is looking a bit contrived, but it's just to make the demo short, and you'll have to trust me that the underlying issue(s) come(s) up again and again (in different shapes) in non-contrived non-stupid real code.

So, Step 3 and Step 5 demonstrates that some optimization is still missed for the .nextDown case (while not being missed for eg .magnitude).

Hopefully, this makes it clear that the issue still remains in dev snapshot 2018-05-30.

EDIT: I've since noted that there is a tiny modification that can be made to the program of Step 5 in order to make it fast: Replace let a = Double(1) with a formulation that avoids the initializer and integer literal, ie any of these:

let a = 1 as Double
let a: Double = 1
let a = 1.0

I think I will try to sum all of this up into a new simplified bug report since SR-7094 has become rather messy.

Btw, you can also use eg .binade instead of .nextDown to get an even bigger difference in time when comparing to .magnitude, so this is not specific to .nextDown. The issue is meant to be just one specific example of a more general issue, and it would be nice if it ended up being perceived and fixed as such, so that "resolved" would mean that I don't have to repeat this effort of bumping into, isolating, and reporting every single one (.nextUp, .nextDown, .binade, etc.). This is also why I'm still wondering if/how issues like this is being tested/benchmarked, in order to prevent regressions. Just like @xwu wondered some time ago: