Improving Float.random(in:using:)

Jens · January 31, 2020, 12:00pm

I currently need a function that

given access to (enough) uniformly distributed (pseudo)random bits,
choose a random value from a continuous uniform distribution within a given range (including the full range of finite Float values), and then convert that value to the nearest representable Float value. Depending on the size and span of range, some concrete values may be represented more frequently than others, but any Float value within the range can be returned.

And since that's the goal stated in the documentation for Float.random(in:using:), but not exactly what it currently does, and:

I wonder if there's a working updated implementation somewhere that I can look at?

If not (and unless it won't be within days) I'd just like to know if anyone can give me some pointers as to how this function could/should work (roughly), because I might try to implement it myself, just to cover my immediate needs^*, ie not for the stdlib.

I can post any possibly resulting code here though, should anyone be interested in giving it a critical eye or so.

Here's an example of a related paper I found.

^* My immediate needs has to do with (reproducible) testing, mainly geometry, image processing and physics code. I use a small custom pseudorandom API and not the one in the stdlib, because I need full control over its operation and efficiency.

Nevin · January 31, 2020, 1:02pm

When the proposal to add randomness to the standard library was under review, I sketched out a possible implementation, though it’s not quite perfect as I noted in the comments.

Jens · January 31, 2020, 2:00pm

I (think I've) implemented the method described in the paper I mentioned in the OP:

func pseudorandomFloatInClosedUnitRange<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    // Start by choosing an exponent (with correct probability):
    var rnd64 = prng.next()
    var i: UInt8 = 0
    while (rnd64 >> i) & 1 != 0 && i < 126 {
        i += 1
        if i & 63 == 0 { rnd64 = prng.next() }
    }
    var exp = UInt8(126) &- i
    // Note: Can be at most 126, not 127 (as in exp of 1). This is because
    // we have to increment it with 50% probability if the random
    // significand happens to be zero (see below).

    // Choose a random 23-bit mantissa (and waste some rnd bits for now):
    rnd64 = prng.next()
    let significand = UInt32(truncatingIfNeeded: rnd64) & 0x7FFFFF

    // If it's zero, then with 50% probability we should move to
    // the next exponent range:
    if significand == 0 && (rnd64 >> 23) & 1 != 0 { exp &+= 1 }

    return Float(bitPattern: (UInt32(exp) << 23) | significand)
}

But this is only for the closed range 0 ... 1.

Assuming this works as intended, I wonder how it can be extended to any range
(including -Float.greatestFiniteMagnitude ... Float.greatestFiniteMagnitude).

Can it somehow be used (I guess it can't be just scaled and translated) for other ranges or do we have to use a completely different algorithm?

Will there be a problem with -0 and 0 when using ranges that span from - to + etc ...

And I have no idea how "good" this method is compared to others (as I haven't really found that many, let alone a comparison between them).

Here's the function again, in a complete program that can be compiled (swiftc -O test.swift) and run.

import func QuartzCore.CABase.CACurrentMediaTime


//-----------------------------------------------------------------------------
// Some stuff that it uses:
//-----------------------------------------------------------------------------

protocol PseudorandomNumberGenerator {
    associatedtype State
    var state: State { get }

    /// Creates a new instance of this generator with a state that is
    /// completely determined by the given seed. Any `UInt64` is allowed and
    /// maps to a valid state. Each seed may or may not map to a unique state
    /// depending on the generator.
    init(seed: UInt64)

    /// Creates a new instance of this generator with the given state.
    /// The initializer fails if the given state is invalid for this generator.
    init?(state: State)

    /// Advances the state of the generator and returns the next random bit
    /// pattern.
    mutating func next() -> UInt64
}


extension PseudorandomNumberGenerator {

    /// Creates a new instance of this generator by calling the required seeded
    /// initializer `init(seed: UInt64)` with a truly random seed.
    init() {
        self.init(seed: UInt64.random(in: .min ... .max))
    }

    /// Creates a new instance of this generator by calling the required seeded
    /// initializer with the given `seed` or a truly random seed if it's nil.
    init(seed: UInt64?) {
        if let seed = seed {
            self.init(seed: seed)
        } else {
            self.init()
        }
    }
}


/// The wyrand generator, translated from:
/// https://github.com/wangyi-fudan/wyhash
struct WyRand : PseudorandomNumberGenerator {

    typealias State = UInt64

    private (set) var state: State

    init(seed: UInt64) {
        //print("WyRand(seed: \(seed))")
        self.state = seed
    }
    init(state: State) {
        // Every UInt64 value is a valid WeRand state, so no need for this
        // initializer to be failable, it will satisfy the requirement anyway.
        self.state = state
    }

    mutating func next() -> UInt64 {
        state &+= 0xa0761d6478bd642f
        let mul = state.multipliedFullWidth(by: state ^ 0xe7037ed1a0b428db)
        return mul.high ^ mul.low
    }
}


//-----------------------------------------------------------------------------
// The function
//-----------------------------------------------------------------------------

func pseudorandomFloatInClosedUnitRange<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    // Choose an exponent (with correct probability):
    var rnd64 = prng.next()
    var i: UInt8 = 0
    while (rnd64 >> i) & 1 != 0 && i < 126 {
        i += 1
        if i & 63 == 0 { rnd64 = prng.next() }
    }
    var exp = UInt8(126) &- i
    // Note can be at most 126, not 127 (as in exp of 1). This is because
    // we have to increment it with 50% probability if the random
    // significand happens to be zero (see below).

    // Choose a random 23-bit mantissa (and waste some rnd bits for now):
    rnd64 = prng.next()
    let significand = UInt32(truncatingIfNeeded: rnd64) & 0x7FFFFF

    // If it's zero, then with 50% probability we should move to
    // the next exponent range:
    if significand == 0 && (rnd64 >> 23) & 1 != 0 { exp &+= 1 }

    return Float(bitPattern: (UInt32(exp) << 23) | significand)
}

//-----------------------------------------------------------------------------
// Demo
//-----------------------------------------------------------------------------

func test() {
    var prng = WyRand()
    let numRuns = 10
    let floatsPerRun = 10_000_000
    var meanSum = Double(0)
    for _ in 0 ..< numRuns {
        let t0 = CACurrentMediaTime()
        var sum = Double(0)
        for _ in 0 ..< floatsPerRun {
            let v = pseudorandomFloatInClosedUnitRange(using: &prng)
            sum += Double(v)
        }
        let t1 = CACurrentMediaTime()
        print(t1 - t0, "s | mean: \(sum / Double(floatsPerRun))")
        meanSum += sum
    }
    print("mean mean", meanSum / Double(floatsPerRun * numRuns))
}
test()

Nevin · January 31, 2020, 2:43pm

Well, that method is easy to extend to intervals bounded by powers of 2.

It’s worth noting that the first part, picking an exponent, is just counting the trailing ones of the random number. It would be equally valid to count the trailing zeros, and the Swift standard library provides an easy way to do exactly that (trailingZeroBitCount).

Jens · January 31, 2020, 2:55pm

Ah, of course! I just copied that from the paper without thinking ...

Here's the improved function:

func pseudorandomFloatInClosedUnitRange<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    // Start by choosing an exponent (with correct probability):
    var exp = 126 &- prng.next().trailingZeroBitCount
    if exp == 62 { exp &-= prng.next().trailingZeroBitCount }
    if exp < 0 { exp = 0 }
    // Note that `exp` be at most 126, not 127 (as in exp of 1). This is
    // because we have to increment it with 50% probability if the random
    // significand happens to be zero (see below).

    // Choose a random 23-bit mantissa (and waste some rnd bits for now):
    let rnd64 = prng.next()
    let significand = UInt32(truncatingIfNeeded: rnd64) & 0x7FFFFF

    // If it's zero, then with 50% probability we should move to
    // the next exponent range:
    if significand == 0 && (rnd64 >> 23) & 1 != 0 { exp &+= 1 }

    return Float(bitPattern: (UInt32(exp) << 23) | significand)
}

The times per run reported by the test program (new version below) changed from about 0.11 seconds to 0.04 s (on my old MBP, one run consists of generating and adding 10 million Float values).

I added two simpler functions for comparison (that do not do what we want)

One that uses mimics what's currently in Float.random(in:using:) but simplified to work only for 0 ..< 1 (while wasting 40 random bits per generated float)
One that just returns 0 or 1 (while wasting 63 random bits).

If we use the zero-or-one function as a baseline, the improvement you mentioned made it (0.11 - 0.011) / (0.04 - 0.011) = 3.4 times faster.

Here's the complete updated program.

import func QuartzCore.CABase.CACurrentMediaTime


//-----------------------------------------------------------------------------
// MARK: - Some scaffolding and the PRNG we'll use
//-----------------------------------------------------------------------------

/// A type that can produce a reproducible stream of uniformly distributed
/// pseudorandom bits.
protocol PseudorandomNumberGenerator {

    associatedtype State
    var state: State { get }

    /// Creates a new instance of this generator with a state that is
    /// completely determined by the given seed. Any `UInt64` is allowed and
    /// maps to a valid state. Each seed may or may not map to a unique state
    /// depending on the generator.
    init(seed: UInt64)

    /// Creates a new instance of this generator with the given state.
    /// The initializer fails if the given state is invalid for this generator.
    init?(state: State)

    /// Advances the state of the generator and returns the next 64 uniformly
    /// distributed pseudorandom bits.
    mutating func next() -> UInt64
}


extension PseudorandomNumberGenerator {

    /// Creates a new instance of this generator by calling the required seeded
    /// initializer with the given `seed` or a truly random seed if given `nil`.
    init(seed: UInt64? = nil) {
        let seed = seed ?? UInt64.random(in: .min ... .max)
        self.init(seed: seed)
    }
}


/// The wyrand generator, translated from:
/// https://github.com/wangyi-fudan/wyhash
struct WyRand : PseudorandomNumberGenerator {

    typealias State = UInt64

    private (set) var state: State

    init(seed: UInt64) {
        self.state = seed
    }
    init(state: State) {
        // Every UInt64 value is a valid WeRand state, so no need for this
        // initializer to be failable, it will satisfy the requirement anyway.
        self.state = state
    }

    mutating func next() -> UInt64 {
        state &+= 0xa0761d6478bd642f
        let mul = state.multipliedFullWidth(by: state ^ 0xe7037ed1a0b428db)
        return mul.high ^ mul.low
    }
}



//-----------------------------------------------------------------------------
// MARK: - The function we're trying to write
//-----------------------------------------------------------------------------

/// This function chooses a pseudorandom value from a continuous uniform
/// distribution within the closed range `0 ... 1` and then convert that value
/// to the nearest representable Float value.
///
/// It uses the method described in:
/// http://allendowney.com/research/rand/downey07randfloat.pdf
///
/// # TODO:
///
///  * Check that it work as intended. I have not analyzed it more than
///    by using the simple `test()` function at the end of this program.
///
///  * Make it work for any other range, eg
///    `0 ..< 1` (half open instead of closed) and
///    `-Float.greatestFiniteMagnitude ... Float.greatestFiniteMagnitude`.
///
/// - Parameters:
///   - prng: The pseudorandom number generator to use.
///
func pseudorandomFloatInClosedUnitRange<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    // Start by choosing an exponent (with correct probability):
    var exp = 126 &- prng.next().trailingZeroBitCount
    if exp == 62 { exp &-= prng.next().trailingZeroBitCount }
    if exp < 0 { exp = 0 }
    // Note that `exp` be at most 126, not 127 (as in exp of 1). This is
    // because we have to increment it with 50% probability if the random
    // significand happens to be zero (see below).


    // Choose a random 23-bit mantissa (and waste some rnd bits for now):
    let rnd64 = prng.next()
    let significand = UInt32(truncatingIfNeeded: rnd64) & 0x7FFFFF

    // If it's zero, then with 50% probability we should move to
    // the next exponent range:
    if significand == 0 && (rnd64 >> 23) & 1 != 0 { exp &+= 1 }

    return Float(bitPattern: (UInt32(exp) << 23) | significand)
}


//-----------------------------------------------------------------------------
// MARK: - Two functions that are similar but not what we want, for comparison
//-----------------------------------------------------------------------------

/// This function returns a value in the half open range `0 ..< 1` using the
/// method that is currently (2020-01-30) in the stdlib, ie the one which will
/// only choose between 2**24 values evenly spaced across the range, 0.5.ulp
/// apart, meaning many representable values will never be returned.
///
/// This is probably sufficient for most purposes (might even be preferrable
/// since it's more efficient). But there are cases where it's nice to be able
/// to produce *any* representable `Float` value within any finite range.
///
/// - Parameters:
///   - prng: The pseudorandom number generator to use.
///
func pseudorandomFloatInHalfOpenUnitInterval<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    return Float(prng.next() &>> 40) * (Float.ulpOfOne/2)
}


/// This function simply returns 0 or 1 with equal probability.
///
/// - Parameters:
///   - prng: The pseudorandom number generator to use.
///
func pseudorandomFloatZeroOrOne<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    return Float(prng.next() & 1)
}


//-----------------------------------------------------------------------------
// MARK: - Demo
//-----------------------------------------------------------------------------

func test() {
    var prng = WyRand()
    let numRuns = 10
    let floatsPerRun = 10_000_000
    var meanSum = Double(0)
    var minTimePerRun = Double.infinity
    for _ in 0 ..< numRuns {
        let t0 = CACurrentMediaTime()
        var sum = Double(0)
        for _ in 0 ..< floatsPerRun {

            // Uncomment the function to test here:

            // The function that returns *any* representable value in [0...1]:
            let v = pseudorandomFloatInClosedUnitRange(using: &prng)
            // 0.036 seconds per run on my machine.

            // The function that returns *some* representable value in [0..<1]:
            // let v = pseudorandomFloatInHalfOpenUnitInterval(using: &prng)
            // 0.013 seconds per run on my machine.

            // The function that returns 0 or 1, used as a baseline:
            // let v = pseudorandomFloatZeroOrOne(using: &prng)
            // 0.011 seconds per run on my machine.

            sum += Double(v)
        }
        let t1 = CACurrentMediaTime()
        let timeForThisRun = t1 - t0
        minTimePerRun = min(minTimePerRun, timeForThisRun)
        print(t1 - t0, "s | Mean float value: \(sum / Double(floatsPerRun))")
        meanSum += sum
    }
    print("Total mean float value:", meanSum / Double(floatsPerRun * numRuns))
    print("Minimum time per run:", minTimePerRun)
}
test()

Nevin · January 31, 2020, 5:31pm

I’m going off-topic here, and I’m not an expert on benchmarking, but I have to wonder if QuartzCore.CABase.CACurrentMediaTime is really the recommended way to measure time. I don’t see any alternative in the standard library, but Foundation does have a clock() function.

Is there some reason not to use Foundation.clock()?

scanon · January 31, 2020, 5:41pm

FWIW, my default choice for benchmarking on Darwin is mach_absolute_time(). It's slightly quirky in that you have to query mach_timebase_info(...) to convert the result to ns, but a lovely lightweight thing otherwise.

Jon_Shier · January 31, 2020, 6:12pm

Benchmarks I’ve seen usually show CACurrentMediaTime as the fastest option that doesn’t require conversion like mach_absolute_time. I usually use CFAbsoluteTimeGetCurrent() since it doesn’t require a separate import.

Jens · January 31, 2020, 9:06pm

I've "investigated" this a number of times and landed in CACurrentMediaTime which if I remember correctly is essentially a wrapper around mach_absolute_time(), doing the conversion from CPU ticks to seconds for you, and thereby being some fraction of a microsecond slower than mach_absolute_time(). But I'm happy to change my habit if there's a problem with it. I doubt that the efficiency and/or accuracy of the timing function would make much difference in this particular case though. According to my experience, optimization glitches depending on seemingly unrelated context in various crazy ways is still much more of an issue in this type of benchmarking.

Nevin · January 31, 2020, 9:20pm

Oh I wasn’t doubting the accuracy, just questioning the verbosity and discoverability. If Foundation.clock() isn’t good enough, then Swift should either fix it or add something that is good enough with an equally appealing spelling.

benrimmington · February 1, 2020, 4:45am

Possible alternatives to mach_absolute_time():

clock_gettime_nsec_np(CLOCK_UPTIME_RAW) — a non-portable Darwin extension in <time.h>
DispatchTime.now().uptimeNanoseconds
ProcessInfo.processInfo.systemUptime — except the implementation for Windows uses a lower-resolution clock?

benrimmington · February 1, 2020, 5:20am

I think Foundation.clock() is re-exported from the standard C library.

It measures processor time, used by the calling process, in microseconds.

import Darwin

clock()

clock_gettime_nsec_np(CLOCK_PROCESS_CPUTIME_ID) / 1000

Jean-Daniel · February 1, 2020, 8:15am

IHMO, you should never use system clock for benchmark, and always use a monotonic clock.

System time has no guarantee to be monotonic, and is usually less accurate and far slower to query.

Which means that anything that rely on system clock should be ban from benchmarking (ie: CFAbsoluteTimeGetCurrent).

Jon_Shier · February 1, 2020, 8:38am

Few benchmarks need to worry about that. Any jumps in the system clock should be both obvious and smoothed by repeated runs. In any case, I wasn’t recommending a time source for highly sensitive benchmarks, just a convenient one that’s performant.

Jens · February 1, 2020, 9:14am

I went ahead and created a separate thread for the timing discussion. : )

Nevin · February 2, 2020, 6:45am

I made an attempt at generalizing and genericizing your function, so it will work with any floating-point type that meets the constraints, and it can use any powers of 2 (as well as 0) for its bounds.

The call-sites look like this:

let x = Float.random(inBinades: 0...1)              // Equivalent to your function
let y = Double.random(inBinades: 0.25...0x1p16)     // Any powers of 2 as bounds
let z = Float80.random(inBinades: 0 ... .infinity)  // The whole shebang

Generic `random(inBinades:)` implementation

extension BinaryFloatingPoint where RawSignificand: FixedWidthInteger, RawExponent: FixedWidthInteger {
  static func random(inBinades binades: ClosedRange<Self>) -> Self {
    var generator = SystemRandomNumberGenerator()
    return random(inBinades: binades, using: &generator)
  }
  
  static func random<T: RandomNumberGenerator>(inBinades binades: ClosedRange<Self>, using generator: inout T) -> Self {
    let (low, high) = (binades.lowerBound, binades.upperBound)
    precondition(low >= 0)
    precondition(low.significandBitPattern == 0)
    precondition(high.significandBitPattern == 0)
    
    let (lowExp, highExp) = (low.exponentBitPattern, high.exponentBitPattern)
    if lowExp == highExp { return low }
    
    let significand = RawSignificand.random(in: 0 ... (1 << significandBitCount) &- 1, using: &generator)
    var exponent = highExp &- lowExp
    
    if lowExp != 0 && exponent < UInt64.bitWidth {
      let r = UInt64.random(in: 1 ... (1 &<< exponent) &- 1, using: &generator)
      let n = UInt64.bitWidth &- r.leadingZeroBitCount
      exponent = numericCast(n)
    } else {
      while exponent > 0 {
        let r = generator.next()
        exponent &-= min(exponent, numericCast(r.leadingZeroBitCount))
        if r != 0 { break }
      }
    }
    
    if significand == 0 && Bool.random(using: & generator) { exponent &+= 1 }
    if exponent == 0 {
      if lowExp == 0 { exponent = 1 }
      else { return random(inBinades: binades, using: &generator) }
    }
    exponent &+= lowExp &- 1
    
    return Self(sign: .plus, exponentBitPattern: exponent, significandBitPattern: significand)
  }
}

The expression 1 << significandBitCount could technically overflow if a floating-point type uses all of the bits in its RawSignificand (ie. if significandBitCount == RawSignificand.bitWidth), but none of the standard types do so.

I think I’ve avoided even that possibility now.

Edit: made some improvements to the implementation.

Karl · February 2, 2020, 10:32am

I think you need to pass the generator along to Bool.

Jens · February 2, 2020, 10:41am

Yes, or just use a bit, to avoid coupling this implementation with that of Bool.random(...).

Nevin · February 2, 2020, 3:15pm

Thanks, I just fixed that, and one other thing.

There’s probably a more elegant way to write this, I was just hacking on it last night for fun.

And I’m still hoping there’s a good way to make it work for arbitrary intervals, not just between binades.

Jens · February 2, 2020, 3:36pm

That would be very cool!

One thing that I cannot quite understand about the method we use (as described in the paper), is why the values of the lower and upper bound of the range should occur with 50% lower probability than all the other values in between.

I might see how/why it makes sense at in-range exponent boundaries, but why should it make sense for the lowest and highest values of the full range, and how does it make sense at the exponent boundary between subnormals and normals (these two have the same .ulp, which is not the case for any other exponents)?

It's easier to explain with an example:

Let's say we only want to generate values within a single exponent range, the one with .ulp == 1, ie these values:

  8388608.0 == Float(1 << 23) // This has .ulp == 1.0, and so have all these:
  8388609.0 == Float(1 << 23).nextUp
  8388610.0 == Float(1 << 23).nextUp.nextUp
  8388611.0 == ...
        ...
 16777215.0 == Float(1 << 24).nextDown == Float((1 << 24) - 1)
 16777216.0 == Float(1 << 24) // (But this last one has ulp == 2.0 ...)

And we build a histogram (or rather just the lower and upper ends of a histogram (because the whole histogram would take up ((1 << 23) + 1)*8 == 64 MB + 8 Bytes), like so:

func test() {
    var generator = WyRand()
    let floatCount = ((1 << 23) + 1) * 500 // Meaning about 500 samples/bin, no?
    var lowerBins = [Int](repeating: 0, count: 16)
    var upperBins = [Int](repeating: 0, count: 16)
    for i in 0 ..< floatCount {
        let v = Float.random(inBinades: Float(1 << 23) ... Float(1 << 24),
                             using: &generator)
        let binIndex = Int(v)
        if binIndex < (1 << 23) + 16 { lowerBins[binIndex - (1 << 23)] += 1 }
        if binIndex >= (1 << 24) - 15 { upperBins[binIndex - ((1 << 24) - 15)] += 1 }
        if i & ((1 << 28) &- 1) == 0 {
            print("Count down: ", (floatCount - (i + 1)) / (1 << 28), "…")
        }
    }
    print("Lower 16 bins:")
    for i in 0 ..< 16 { print(i + (1 << 23), lowerBins[i]) }
    print("Upper 16 bins:")
    for i in 0 ..< 16 { print(i + ((1 << 24) - 15), upperBins[i]) }
}

Then we can see what I mean:

Lower 16 bins:
8388608 220
8388609 475
8388611 507
8388612 483
8388613 477
8388614 521
...
...
Upper 16 bins:
...
...
16777211 485
16777212 478
16777210 529
16777214 467
16777215 508
16777216 242

The first and last value in the range will occur only half as often as the other values.

Why shouldn't all values occur about 500 times?

And I don't like that the method seems to be built around including the upper bound value, it would feel better (less of a special case) to start with a half open range ..< and then tackle the special case of ... (last value is special).