Generating Double values in unit range using Random API is about 45 times slower than necessary

Jens · May 26, 2018, 10:55pm

~~Ah, ok, thanks for the explanation.~~

EDIT: After re-reading this part of your reply:

and writing the below (much smaller) reformulation of the test in my original post, I realize that my response should be:

Note that in the case of generating these unit range doubles, we use a random UInt64 value in the range
0 ..< (1 << 53). The (exclusive) upper bound here is a power of two, but it isn't one of
(1 << 8), (1 << 16), (1 << 32) or (1 << 64) as is the case for "types like UInt64, UInt32, etc".

That said, I'm sure you're correct in that it doesn't perform more than one iteration of the loop for upperBound == (1 << 53), and it's probably the % and conditional statements that takes more time than is perhaps necessary.

Anyway, the following program demonstrates the same issue as before but in a different way, which I think makes it clearer that a significant part of the slow down is due to .next(upperBound:) not being as efficient as it probably could be, for upperBounds that are powers of two, like (1 << 53) == 0x00200000_00000000.

It would be nice if `.next(upperBound:) could be rewritten to be faster for these cases (eg perhaps using the fact that .significandWidth == 0 for powers of two, and using shifting to get the right amount of bits, removing branching in favor of calculation if possible, etc.). I tried to find the post in the implementation by @scanon that you referred to but couldn't. Do you have a link?

Here is the reformulated smaller test program:

import AppKit
    
struct SplitMix64 : RandomNumberGenerator {
    // Based on http://xoshiro.di.unimi.it/splitmix64.c
    var state: UInt64
    init(seed: UInt64) { self.state = seed }
    init() { self.state = Random.default.next() }
    mutating func next() -> UInt64 {
        state &+= 0x9e3779b97f4a7c15
        var z = (state ^ (state &>> 30)) &* 0xbf58476d1ce4e5b9
        z = (z ^ (z &>> 27)) &* 0x94d049bb133111eb
        return z ^ (z &>> 31)
    }
}

func test() {
    var prng = SplitMix64(seed: 1234567)
    for _ in 0 ..< 10 {
        var checksum: Double = 0
        let t0 = CACurrentMediaTime()
        for _ in 0 ..< 10_000_000 {
            //-----------------------------------------------------------------
            // Uncomment one of the two methods (A or B) for getting a random
            // UInt64 value in the range 0 ..< (1 << 53), which is used for the
            // random Double value in unit range 0.0 ..< 1.0.
            //-----------------------------------------------------------------
            // Note that using A is much faster than using B.
            // This would not be the case if .next(upperBound:) didn't perform
            // unnecessary work for upperBounds that are powers of two.
            // (0x00200000_00000000 == 1 << 53 is a power of two.)
            //-----------------------------------------------------------------
            
            // A | min time about 0.017 seconds on my machine:
            let bp = prng.next() &>> 11 // (64 - 11 == 53)
            
            // B | min time about 0.218 seconds on my machine:
            //let bp = prng.next(upperBound: 0x00200000_00000000 as UInt64)
            
            //-----------------------------------------------------------------
            // We are by-passing `Double.random(in: 0 ..< 1, using: &prng)`:
            //-----------------------------------------------------------------
            let rndDoubleInUnitRange = Double(bp) * (.ulpOfOne / 2)
            checksum += rndDoubleInUnitRange
        }
        let t1 = CACurrentMediaTime()
        print("time:", t1 - t0, "seconds (checksum: \(checksum)")
    }
}

test()

As a side-note, look at the error that you get if you remove "as UInt64" from the following:

let bp = prng.next(upperBound: 0x00200000_00000000 as UInt64)

I saw here that you were wondering about what seems to be the same misleading diagnostics. So I thought you might find it interesting. I filed SR-7786 for this.