# Improving Float.random(in:using:)

Ah, of course! I just copied that from the paper without thinking ...

Here's the improved function:

func pseudorandomFloatInClosedUnitRange<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    // Start by choosing an exponent (with correct probability):
    var exp = 126 &- prng.next().trailingZeroBitCount
    if exp == 62 { exp &-= prng.next().trailingZeroBitCount }
    if exp < 0 { exp = 0 }
    // Note that `exp` can be at most 126, not 127 (as in an exponent of 1).
    // This is because we have to increment it with 50% probability if the
    // random significand happens to be zero (see below).

    // Choose a random 23-bit mantissa (and waste some random bits for now):
    let rnd64 = prng.next()
    let significand = UInt32(truncatingIfNeeded: rnd64) & 0x7FFFFF

    // If it's zero, then with 50% probability we should move to
    // the next exponent range:
    if significand == 0 && (rnd64 >> 23) & 1 != 0 { exp &+= 1 }

    return Float(bitPattern: (UInt32(exp) << 23) | significand)
}
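The exponent is effectively drawn from a geometric distribution via `trailingZeroBitCount` (each extra trailing zero bit halves the probability). A standalone check of the bit-count facts this relies on, including the all-zeros word that makes the `exp == 62` branch necessary:

```swift
// Standalone check of the trailingZeroBitCount behavior the exponent
// selection leans on (not part of the benchmark program):

// A word with its lowest set bit at position n has n trailing zeros,
// which a uniform random UInt64 produces with probability 2^-(n+1):
assert(UInt64(0b1000).trailingZeroBitCount == 3)

// The all-zeros word (probability 2^-64) reports 64 trailing zeros,
// which is exactly when `126 - 64 == 62` triggers the second draw:
assert(UInt64(0).trailingZeroBitCount == 64)
assert(126 - UInt64(0).trailingZeroBitCount == 62)

print("ok")
```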

The time per run reported by the test program (new version below) went from about 0.11 seconds to 0.04 s (on my old MBP; one run consists of generating and adding 10 million Float values).

I added two simpler functions for comparison (they do not do what we want):

• One that mimics what's currently in Float.random(in:using:), but simplified to work only for 0 ..< 1 (while wasting 40 random bits per generated float).
• One that just returns 0 or 1 (while wasting 63 random bits).

If we use the zero-or-one function as a baseline, the improvement you mentioned made it (0.11 - 0.011) / (0.04 - 0.011) = 3.4 times faster.

Here's the complete updated program:
import func QuartzCore.CABase.CACurrentMediaTime

//-----------------------------------------------------------------------------
// MARK: - Some scaffolding and the PRNG we'll use
//-----------------------------------------------------------------------------

/// A type that can produce a reproducible stream of uniformly distributed
/// pseudorandom bits.
protocol PseudorandomNumberGenerator {

    associatedtype State
    var state: State { get }

    /// Creates a new instance of this generator with a state that is
    /// completely determined by the given seed. Any `UInt64` is allowed and
    /// maps to a valid state. Each seed may or may not map to a unique state,
    /// depending on the generator.
    init(seed: UInt64)

    /// Creates a new instance of this generator with the given state.
    /// The initializer fails if the given state is invalid for this generator.
    init?(state: State)

    /// Advances the state of the generator and returns the next 64 uniformly
    /// distributed pseudorandom bits.
    mutating func next() -> UInt64
}

extension PseudorandomNumberGenerator {

    /// Creates a new instance of this generator by calling the required seeded
    /// initializer with the given `seed`, or with a truly random seed if given
    /// `nil`.
    init(seed: UInt64? = nil) {
        let seed = seed ?? UInt64.random(in: .min ... .max)
        self.init(seed: seed)
    }
}

/// The wyrand generator, translated from:
/// https://github.com/wangyi-fudan/wyhash
struct WyRand : PseudorandomNumberGenerator {

    typealias State = UInt64

    private(set) var state: State

    init(seed: UInt64) {
        self.state = seed
    }
    init(state: State) {
        // Every UInt64 value is a valid WyRand state, so this initializer
        // doesn't need to be failable; it satisfies the requirement anyway.
        self.state = state
    }

    mutating func next() -> UInt64 {
        state &+= 0xa0761d6478bd642f
        let mul = state.multipliedFullWidth(by: state ^ 0xe7037ed1a0b428db)
        return mul.high ^ mul.low
    }
}

//-----------------------------------------------------------------------------
// MARK: - The function we're trying to write
//-----------------------------------------------------------------------------

/// This function chooses a pseudorandom value from a continuous uniform
/// distribution within the closed range `0 ... 1` and then converts that
/// value to the nearest representable Float value.
///
/// It uses the method described in:
/// http://allendowney.com/research/rand/downey07randfloat.pdf
///
/// # TODO:
///
///  * Check that it works as intended. I have not analyzed it more than
///    by using the simple `test()` function at the end of this program.
///
///  * Make it work for any other range, eg
///    `0 ..< 1` (half open instead of closed) and
///    `-Float.greatestFiniteMagnitude ... Float.greatestFiniteMagnitude`.
///
/// - Parameters:
///   - prng: The pseudorandom number generator to use.
///
func pseudorandomFloatInClosedUnitRange<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    // Start by choosing an exponent (with correct probability):
    var exp = 126 &- prng.next().trailingZeroBitCount
    if exp == 62 { exp &-= prng.next().trailingZeroBitCount }
    if exp < 0 { exp = 0 }
    // Note that `exp` can be at most 126, not 127 (as in an exponent of 1).
    // This is because we have to increment it with 50% probability if the
    // random significand happens to be zero (see below).

    // Choose a random 23-bit mantissa (and waste some random bits for now):
    let rnd64 = prng.next()
    let significand = UInt32(truncatingIfNeeded: rnd64) & 0x7FFFFF

    // If it's zero, then with 50% probability we should move to
    // the next exponent range:
    if significand == 0 && (rnd64 >> 23) & 1 != 0 { exp &+= 1 }

    return Float(bitPattern: (UInt32(exp) << 23) | significand)
}

//-----------------------------------------------------------------------------
// MARK: - Two functions that are similar but not what we want, for comparison
//-----------------------------------------------------------------------------

/// This function returns a value in the half open range `0 ..< 1` using the
/// method that is currently (2020-01-30) in the stdlib, ie the one which will
/// only choose between 2**24 values evenly spaced across the range, 0.5.ulp
/// apart, meaning many representable values will never be returned.
///
/// This is probably sufficient for most purposes (it might even be
/// preferable, since it's more efficient). But there are cases where it's
/// nice to be able to produce *any* representable `Float` value within any
/// finite range.
///
/// - Parameters:
///   - prng: The pseudorandom number generator to use.
///
func pseudorandomFloatInHalfOpenUnitInterval<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    return Float(prng.next() &>> 40) * (Float.ulpOfOne/2)
}

/// This function simply returns 0 or 1 with equal probability.
///
/// - Parameters:
///   - prng: The pseudorandom number generator to use.
///
func pseudorandomFloatZeroOrOne<R>(using prng: inout R) -> Float
    where R: PseudorandomNumberGenerator
{
    return Float(prng.next() & 1)
}

//-----------------------------------------------------------------------------
// MARK: - Demo
//-----------------------------------------------------------------------------

func test() {
    var prng = WyRand()
    let numRuns = 10
    let floatsPerRun = 10_000_000
    var meanSum = Double(0)
    var minTimePerRun = Double.infinity
    for _ in 0 ..< numRuns {
        let t0 = CACurrentMediaTime()
        var sum = Double(0)
        for _ in 0 ..< floatsPerRun {

            // Uncomment the function to test here:

            // The function that returns *any* representable value in [0...1]:
            let v = pseudorandomFloatInClosedUnitRange(using: &prng)
            // 0.036 seconds per run on my machine.

            // The function that returns *some* representable value in [0..<1]:
            // let v = pseudorandomFloatInHalfOpenUnitInterval(using: &prng)
            // 0.013 seconds per run on my machine.

            // The function that returns 0 or 1, used as a baseline:
            // let v = pseudorandomFloatZeroOrOne(using: &prng)
            // 0.011 seconds per run on my machine.

            sum += Double(v)
        }
        let t1 = CACurrentMediaTime()
        let timeForThisRun = t1 - t0
        minTimePerRun = min(minTimePerRun, timeForThisRun)
        print(t1 - t0, "s | Mean float value: \(sum / Double(floatsPerRun))")
        meanSum += sum
    }
    print("Total mean float value:", meanSum / Double(floatsPerRun * numRuns))
    print("Minimum time per run:", minTimePerRun)
}
test()
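Not part of the program above, but here is a condensed, self-contained sanity check of the closed-unit-range logic. It re-declares a minimal copy of the generator and function under different names so that it compiles on its own (the names `MiniWyRand` and `floatInClosedUnitRange` are mine, just for this snippet):

```swift
// Minimal standalone copy of the wyrand step and the closed-unit-range
// function, so this snippet compiles without the rest of the program.
struct MiniWyRand {
    var state: UInt64
    mutating func next() -> UInt64 {
        state &+= 0xa0761d6478bd642f
        let mul = state.multipliedFullWidth(by: state ^ 0xe7037ed1a0b428db)
        return mul.high ^ mul.low
    }
}

func floatInClosedUnitRange(using prng: inout MiniWyRand) -> Float {
    var exp = 126 &- prng.next().trailingZeroBitCount
    if exp == 62 { exp &-= prng.next().trailingZeroBitCount }
    if exp < 0 { exp = 0 }
    let rnd64 = prng.next()
    let significand = UInt32(truncatingIfNeeded: rnd64) & 0x7FFFFF
    if significand == 0 && (rnd64 >> 23) & 1 != 0 { exp &+= 1 }
    return Float(bitPattern: (UInt32(exp) << 23) | significand)
}

var prng = MiniWyRand(state: 1234)
var sum = 0.0
let n = 100_000
for _ in 0 ..< n {
    let v = floatInClosedUnitRange(using: &prng)
    // Every produced value must land inside the closed unit range:
    assert(v >= 0 && v <= 1, "value out of range: \(v)")
    sum += Double(v)
}
let mean = sum / Double(n)
// For a uniform distribution the sample mean should be close to 0.5:
assert(abs(mean - 0.5) < 0.01, "suspicious mean: \(mean)")
print("mean:", mean)
```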

I’m going off-topic here, and I’m not an expert on benchmarking, but I have to wonder if QuartzCore.CABase.CACurrentMediaTime is really the recommended way to measure time. I don’t see any alternative in the standard library, but Foundation does have a clock() function.

Is there some reason not to use Foundation.clock()?

FWIW, my default choice for benchmarking on Darwin is mach_absolute_time(). It's slightly quirky in that you have to query mach_timebase_info(...) to convert the result to ns, but a lovely lightweight thing otherwise.

Benchmarks I’ve seen usually show CACurrentMediaTime as the fastest option that doesn’t require conversion like mach_absolute_time. I usually use CFAbsoluteTimeGetCurrent() since it doesn’t require a separate import.

I've "investigated" this a number of times and landed on CACurrentMediaTime, which, if I remember correctly, is essentially a wrapper around mach_absolute_time(), doing the conversion from CPU ticks to seconds for you, and thereby being some fraction of a microsecond slower than mach_absolute_time(). But I'm happy to change my habit if there's a problem with it. I doubt that the efficiency and/or accuracy of the timing function makes much difference in this particular case, though. In my experience, optimization glitches that depend on seemingly unrelated context in various crazy ways are still much more of an issue in this type of benchmarking.

Oh I wasn’t doubting the accuracy, just questioning the verbosity and discoverability. If Foundation.clock() isn’t good enough, then Swift should either fix it or add something that is good enough with an equally appealing spelling.


Possible alternatives to mach_absolute_time():

• clock_gettime_nsec_np(CLOCK_UPTIME_RAW) — a non-portable Darwin extension in <time.h>

• DispatchTime.now().uptimeNanoseconds

• ProcessInfo.processInfo.systemUptime — except the implementation for Windows uses a lower-resolution clock?

I think Foundation.clock() is re-exported from the standard C library.

It measures processor time, used by the calling process, in microseconds.

import Darwin

clock()

clock_gettime_nsec_np(CLOCK_PROCESS_CPUTIME_ID) / 1000
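Of the alternatives listed above, DispatchTime is convenient from Swift since it is monotonic and needs no manual timebase conversion. A minimal sketch (the `measure` helper name is my own, not an existing API):

```swift
import Dispatch

// Minimal monotonic timing helper built on DispatchTime.
// (`measure` is just a name for this sketch, not a standard function.)
func measure(_ body: () -> Void) -> Double {
    let t0 = DispatchTime.now().uptimeNanoseconds
    body()
    let t1 = DispatchTime.now().uptimeNanoseconds
    return Double(t1 - t0) / 1e9  // elapsed seconds
}

var sum = 0.0
let seconds = measure {
    // Some arbitrary work to time: a partial harmonic sum.
    for i in 1 ... 1_000_000 { sum += 1.0 / Double(i) }
}
print("elapsed:", seconds, "s, sum:", sum)
assert(seconds >= 0)
```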

IMHO, you should never use the system clock for benchmarks; always use a monotonic clock.

System time is not guaranteed to be monotonic, and is usually less accurate and far slower to query.

Which means that anything that relies on the system clock should be banned from benchmarking (e.g. CFAbsoluteTimeGetCurrent).

Few benchmarks need to worry about that. Any jumps in the system clock should be both obvious and smoothed by repeated runs. In any case, I wasn’t recommending a time source for highly sensitive benchmarks, just a convenient one that’s performant.

I went ahead and created a separate thread for the timing discussion. : )


I made an attempt at generalizing and genericizing your function, so it will work with any floating-point type that meets the constraints, and it can use any powers of 2 (as well as 0) for its bounds.

The call-sites look like this:

let y = Double.random(inBinades: 0.25...0x1p16)     // Any powers of 2 as bounds
let z = Float80.random(inBinades: 0 ... .infinity)  // The whole shebang
extension BinaryFloatingPoint where RawSignificand: FixedWidthInteger, RawExponent: FixedWidthInteger {

    static func random(inBinades binades: ClosedRange<Self>) -> Self {
        var generator = SystemRandomNumberGenerator()
        return random(inBinades: binades, using: &generator)
    }

    static func random<T: RandomNumberGenerator>(inBinades binades: ClosedRange<Self>, using generator: inout T) -> Self {
        let (low, high) = (binades.lowerBound, binades.upperBound)
        precondition(low >= 0)
        precondition(low.significandBitPattern == 0)
        precondition(high.significandBitPattern == 0)

        let (lowExp, highExp) = (low.exponentBitPattern, high.exponentBitPattern)
        if lowExp == highExp { return low }

        let significand = RawSignificand.random(in: 0 ... (1 << significandBitCount) &- 1, using: &generator)
        var exponent = highExp &- lowExp

        if lowExp != 0 && exponent < UInt64.bitWidth {
            let r = UInt64.random(in: 1 ... (1 &<< exponent) &- 1, using: &generator)
            let n = UInt64.bitWidth &- r.leadingZeroBitCount
            exponent = numericCast(n)
        } else {
            while exponent > 0 {
                let r = generator.next()
                if r != 0 { break }
            }
        }

        if significand == 0 && Bool.random(using: &generator) { exponent &+= 1 }
        if exponent == 0 {
            if lowExp == 0 { exponent = 1 }
        }
        exponent &+= lowExp &- 1

        return Self(sign: .plus, exponentBitPattern: exponent, significandBitPattern: significand)
    }
}

The expression 1 << significandBitCount could technically overflow if a floating-point type uses all of the bits in its RawSignificand (ie. if significandBitCount == RawSignificand.bitWidth), but none of the standard types do so.
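For the standard types, that headroom is easy to check directly, since their significandBitCount is strictly less than the bit width of their RawSignificand:

```swift
// For the standard binary floating-point types, the significand bit count
// leaves headroom in RawSignificand, so `1 << significandBitCount`
// cannot overflow:
assert(Float.significandBitCount == 23)
assert(Float.RawSignificand.bitWidth == 32)   // RawSignificand is UInt32
assert(Double.significandBitCount == 52)
assert(Double.RawSignificand.bitWidth == 64)  // RawSignificand is UInt64
print("ok")
```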

I think I’ve avoided even that possibility now.

Edit: made some improvements to the implementation.


I think you need to pass the generator along to Bool.

Yes, or just use a bit, to avoid coupling this implementation with that of Bool.random(...).

Thanks, I just fixed that, and one other thing.

There’s probably a more elegant way to write this, I was just hacking on it last night for fun.

And I’m still hoping there’s a good way to make it work for arbitrary intervals, not just between binades.


That would be very cool!

One thing that I cannot quite understand about the method we use (as described in the paper) is why the values at the lower and upper bounds of the range should occur with 50% lower probability than all the other values in between.

I might see how/why it makes sense at in-range exponent boundaries, but why should it make sense for the lowest and highest values of the full range, and how does it make sense at the exponent boundary between subnormals and normals (those two ranges have the same .ulp, which is not the case for any other pair of adjacent exponents)?

It's easier to explain with an example:

Let's say we only want to generate values within a single exponent range, the one with .ulp == 1, ie these values:

8388608.0 == Float(1 << 23) // This has .ulp == 1.0, and so have all these:
8388609.0 == Float(1 << 23).nextUp
8388610.0 == Float(1 << 23).nextUp.nextUp
8388611.0 == ...
...
16777215.0 == Float(1 << 24).nextDown == Float((1 << 24) - 1)
16777216.0 == Float(1 << 24) // (But this last one has ulp == 2.0 ...)
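These claims about the binade are easy to verify directly:

```swift
// Checking the ulp claims for the [2^23, 2^24] binade:
assert(Float(1 << 23).ulp == 1.0)              // 8388608 has ulp 1
assert(Float(1 << 23).nextUp == 8388609.0)     // consecutive integers
assert(Float((1 << 24) - 1).ulp == 1.0)        // 16777215 still has ulp 1
assert(Float(1 << 24).nextDown == 16777215.0)
assert(Float(1 << 24).ulp == 2.0)              // 16777216 starts the next binade
print("ok")
```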

And we build a histogram (or rather just the lower and upper ends of one, because the whole histogram would take up ((1 << 23) + 1) * 8 bytes == 64 MB + 8 bytes), like so:

func test() {
    var generator = WyRand()
    let floatCount = ((1 << 23) + 1) * 500 // Meaning about 500 samples/bin, no?
    var lowerBins = [Int](repeating: 0, count: 16)
    var upperBins = [Int](repeating: 0, count: 16)
    for i in 0 ..< floatCount {
        let v = Float.random(inBinades: Float(1 << 23) ... Float(1 << 24),
                             using: &generator)
        let binIndex = Int(v)
        if binIndex < (1 << 23) + 16 { lowerBins[binIndex - (1 << 23)] += 1 }
        if binIndex >= (1 << 24) - 15 { upperBins[binIndex - ((1 << 24) - 15)] += 1 }
        if i & ((1 << 28) &- 1) == 0 {
            print("Count down: ", (floatCount - (i + 1)) / (1 << 28), "…")
        }
    }
    print("Lower 16 bins:")
    for i in 0 ..< 16 { print(i + (1 << 23), lowerBins[i]) }
    print("Upper 16 bins:")
    for i in 0 ..< 16 { print(i + ((1 << 24) - 15), upperBins[i]) }
}

Then we can see what I mean:

Lower 16 bins:
8388608 220
8388609 475
8388610 507
8388611 483
8388612 477
8388613 521
...
...
Upper 16 bins:
...
...
16777211 485
16777212 478
16777213 529
16777214 467
16777215 508
16777216 242

The first and last value in the range will occur only half as often as the other values.

Why shouldn't all values occur about 500 times?

And I don't like that the method seems to be built around including the upper bound value; it would feel better (less of a special case) to start with a half open range (..<) and then tackle the special case of ... (where the last value is special).


That’s a question of what semantics we desire.

Imagine a true uniform distribution on the closed interval [1, 2].

Let’s choose a real number uniformly at random from that distribution, call it x.

Now if we round x to the nearest representable Float, each representable float (call it f) strictly inside the interval has a “basin” of width Float.ulpOfOne centered on it where if x falls in that basin, then x rounds to f.

But the values 1 and 2 each only have half of their basins within the interval, so they are half as likely to be generated as the floats between them.

Conversely, if we were drawing from the half-open interval [1, 2), then each float’s basin would be [f, f.nextUp), and they would all be equally likely (with 2 impossible.)
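The basin behavior can be illustrated concretely by rounding nearby Double values to Float (the Double arithmetic is exact at this scale); this is just an illustration, not part of any implementation above:

```swift
// Each Float f strictly inside [1, 2] has a rounding basin of width
// Float.ulpOfOne centered on f:
let f: Float = 1.5
let halfUlp = Double(f.ulp) / 2            // 2^-24

// A Double just above the midpoint between f.nextDown and f rounds to f:
assert(Float(Double(f) - halfUlp + 0x1p-30) == f)
// A Double just below that midpoint rounds to f.nextDown:
assert(Float(Double(f) - halfUlp - 0x1p-30) == f.nextDown)

// The endpoint 2.0 only has the lower half of its basin inside [1, 2]:
assert(Float(2.0 - 0x1p-30) == 2.0)  // within 2's half-basin, rounds up to 2
print("ok")
```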


Thanks, that cleared it up for me (I thought it had to do with the different distance between represented values). I guess it also makes sense when we have a range that eg spans symmetrically over negative and positive values. We can then set the sign bit (depending on a random bit) and the -0s and 0s won't add up to make zero twice as probable as it should be.

Yes, and I think the half open range feels like a simpler base case to build a more general version on.

This also implies that the following would compile and return any finite non-negative Float value:

let anyNonNegativeFiniteValue = Float.random(in: 0 ..< .infinity)

but the current implementation results in this (IMO invalid) runtime error:

Fatal error: There is no uniform distribution on an infinite range

That range is not infinite; it is the finite range of all values (greater than or equal to zero and) less than infinity (Float.infinity.nextDown == Float.greatestFiniteMagnitude). It is the same range as this:

let anyNonNegativeFiniteValue = Float.random(in: 0 ... .greatestFiniteMagnitude)

(which runs without any runtime error.)
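The neighboring-value relationships behind this equivalence are easy to verify:

```swift
// The two range spellings cover exactly the same set of Float values,
// because greatestFiniteMagnitude and infinity are adjacent:
assert(Float.infinity.nextDown == Float.greatestFiniteMagnitude)
assert(Float.greatestFiniteMagnitude.nextUp == Float.infinity)
print("ok")
```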

So (@scanon and @Nevin), am I correct in assuming that an improved
Float.random(in: using: )
should accept and behave identically for eg:

let anyNonNegativeFiniteValue1 = Float.random(in: 0 ..< .infinity)
let anyNonNegativeFiniteValue2 = Float.random(in: 0 ... .greatestFiniteMagnitude)
let anyFiniteValue1 = Float.random(in: -.greatestFiniteMagnitude ..< .infinity)
let anyFiniteValue2 = Float.random(in: -.greatestFiniteMagnitude ... .greatestFiniteMagnitude)

?


Another thing regarding this: I'm actually not sure I understand why 1 should have twice the probability of being chosen when we draw from [1, 2) compared to when we draw from [1, 2].
Or, put differently, why should the basin of each Float value change between (A):

•     •     •     •     •     •     •     •
__⎠⎣____⎠⎣____⎠⎣_
|     |     |
|     |     '--- v.nextUp
|     '----------v
'----------------v.nextDown
(each value's basin centered on it)

and (B):

•     •     •     •     •     •     •     •
⎣____⎠⎣____⎠⎣____⎠
|     |     |
|     |     '--- v.nextUp
|     '----------v
'----------------v.nextDown

as we change between
(A) drawing from a closed interval
and
(B) drawing from a half open interval.

Here's how I think about it:

The boundaries of the interval we are drawing from can only be specified using representable Float values, so drawing from
x ..< y should essentially be equivalent to drawing from
x ... y.nextDown
and vice versa; drawing from
x ... y should essentially be equivalent to drawing from
x ..< y.nextUp
so in the following code example (assuming the improved Float.random(in:using:)), the precondition should be satisfied:

let (prngA, prngB) = (MyPrng(seed: 1234), MyPrng(seed: 1234))
for _ in 0 ..< 10_000_000 {
    let rA = Float.random(in: 5 ..< 6, using: &prngA)
    let rB = Float.random(in: 5 ... 6.nextDown, using: &prngB)
    precondition(rA == rB)
}

which means the ClosedRange<Float> variant could be implemented by just letting it call the Range<Float> variant with upperBound.nextUp.
(Remember that Float.infinity.nextDown == Float.greatestFiniteMagnitude and Float.greatestFiniteMagnitude.nextUp == Float.infinity, and thus .infinity would be a valid upper bound only for the Range<Float> variant.)

I guess I'm missing something?
