Improving Float.random(in:using:)

Benchmarks I’ve seen usually show CACurrentMediaTime as the fastest option that doesn’t require a conversion step the way mach_absolute_time does. I usually use CFAbsoluteTimeGetCurrent() since it doesn’t require a separate import.

I've "investigated" this a number of times and landed in CACurrentMediaTime which if I remember correctly is essentially a wrapper around mach_absolute_time(), doing the conversion from CPU ticks to seconds for you, and thereby being some fraction of a microsecond slower than mach_absolute_time(). But I'm happy to change my habit if there's a problem with it. I doubt that the efficiency and/or accuracy of the timing function would make much difference in this particular case though. According to my experience, optimization glitches depending on seemingly unrelated context in various crazy ways is still much more of an issue in this type of benchmarking.

Oh I wasn’t doubting the accuracy, just questioning the verbosity and discoverability. If Foundation.clock() isn’t good enough, then Swift should either fix it or add something that is good enough with an equally appealing spelling.

1 Like

Possible alternatives to mach_absolute_time():

  • clock_gettime_nsec_np(CLOCK_UPTIME_RAW) — a non-portable Darwin extension in <time.h>

  • DispatchTime.now().uptimeNanoseconds

  • ProcessInfo.processInfo.systemUptime — except the implementation for Windows uses a lower-resolution clock?
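
A quick sketch of using one of these (DispatchTime) to time a loop, just for illustration:

import Dispatch

// Returns the monotonic (uptime-based) duration of `body`, in seconds.
func measure(_ body: () -> Void) -> Double {
    let start = DispatchTime.now().uptimeNanoseconds
    body()
    let end = DispatchTime.now().uptimeNanoseconds
    return Double(end - start) / 1_000_000_000
}

let seconds = measure {
    var x: Float = 0
    for _ in 0 ..< 1_000_000 { x += Float.random(in: 0 ..< 1) }
    print(x) // Keep the result observable so the loop isn’t optimized away.
}
print(seconds)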

I think Foundation.clock() is re-exported from the standard C library.

It measures the processor time used by the calling process, in microseconds.

import Darwin

// CPU time used by the calling process so far, in units of CLOCKS_PER_SEC (1,000,000 on Darwin, i.e. microseconds):
clock()

// Roughly equivalent, via the nanosecond-resolution Darwin extension:
clock_gettime_nsec_np(CLOCK_PROCESS_CPUTIME_ID) / 1000

IMHO, you should never use the system clock for benchmarking; always use a monotonic clock.

System time is not guaranteed to be monotonic, and it is usually less accurate and far slower to query.

Which means that anything that relies on the system clock (such as CFAbsoluteTimeGetCurrent) should be banned from benchmarking.

Few benchmarks need to worry about that. Any jumps in the system clock should be both obvious and smoothed by repeated runs. In any case, I wasn’t recommending a time source for highly sensitive benchmarks, just a convenient one that’s performant.

I went ahead and created a separate thread for the timing discussion. : )

2 Likes

I made an attempt at generalizing and genericizing your function, so it will work with any floating-point type that meets the constraints, and it can use any powers of 2 (as well as 0) for its bounds.

The call-sites look like this:

let x = Float.random(inBinades: 0...1)              // Equivalent to your function
let y = Double.random(inBinades: 0.25...0x1p16)     // Any powers of 2 as bounds
let z = Float80.random(inBinades: 0 ... .infinity)  // The whole shebang
Generic `random(inBinades:)` implementation
extension BinaryFloatingPoint where RawSignificand: FixedWidthInteger, RawExponent: FixedWidthInteger {
  static func random(inBinades binades: ClosedRange<Self>) -> Self {
    var generator = SystemRandomNumberGenerator()
    return random(inBinades: binades, using: &generator)
  }
  
  static func random<T: RandomNumberGenerator>(inBinades binades: ClosedRange<Self>, using generator: inout T) -> Self {
    let (low, high) = (binades.lowerBound, binades.upperBound)
    // Bounds must be non-negative with no significand bits set (0, a power of two, or infinity).
    precondition(low >= 0)
    precondition(low.significandBitPattern == 0)
    precondition(high.significandBitPattern == 0)
    
    let (lowExp, highExp) = (low.exponentBitPattern, high.exponentBitPattern)
    if lowExp == highExp { return low }
    
    // Uniformly random significand bits.
    let significand = RawSignificand.random(in: 0 ... (1 << significandBitCount) &- 1, using: &generator)
    // Number of binades spanned by the range.
    var exponent = highExp &- lowExp
    
    // Spanning fewer than 64 binades of normal numbers: pick a binade with probability
    // proportional to its width (each binade is twice as wide as the one below it).
    if lowExp != 0 && exponent < UInt64.bitWidth {
      let r = UInt64.random(in: 1 ... (1 &<< exponent) &- 1, using: &generator)
      let n = UInt64.bitWidth &- r.leadingZeroBitCount
      exponent = numericCast(n)
    } else {
      // Otherwise, walk down from the top binade, one step per leading zero bit of successive random words.
      while exponent > 0 {
        let r = generator.next()
        exponent &-= min(exponent, numericCast(r.leadingZeroBitCount))
        if r != 0 { break }
      }
    }
    
    // A zero significand means the result is exactly the lower boundary of the chosen
    // binade; with probability 1/2, move it up to the binade's upper boundary instead.
    if significand == 0 && Bool.random(using: &generator) { exponent &+= 1 }
    if exponent == 0 {
      if lowExp == 0 { exponent = 1 }
      else { return random(inBinades: binades, using: &generator) }
    }
    exponent &+= lowExp &- 1
    
    return Self(sign: .plus, exponentBitPattern: exponent, significandBitPattern: significand)
  }
}

The expression 1 << significandBitCount could technically overflow if a floating-point type uses all of the bits in its RawSignificand (i.e. if significandBitCount == RawSignificand.bitWidth), but none of the standard types do so.

I think I’ve avoided even that possibility now.
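
For illustration, one way to sidestep the issue entirely is to build the significand mask with a right shift instead of a left shift. This is just a sketch (the helper below is hypothetical, not part of the code above):

extension BinaryFloatingPoint where RawSignificand: FixedWidthInteger {
  // All ones in the low significandBitCount bits, computed without ever shifting
  // by the full bit width (safe even if significandBitCount == RawSignificand.bitWidth).
  // e.g. Float.significandMask == 0x7F_FFFF, Double.significandMask == 0xF_FFFF_FFFF_FFFF
  static var significandMask: RawSignificand {
    return ~RawSignificand.zero >> (RawSignificand.bitWidth - significandBitCount)
  }
}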

Edit: made some improvements to the implementation.

2 Likes

I think you need to pass the generator along to Bool.

Yes, or just use a bit, to avoid coupling this implementation with that of Bool.random(...).

Thanks, I just fixed that, and one other thing.

There’s probably a more elegant way to write this, I was just hacking on it last night for fun.

And I’m still hoping there’s a good way to make it work for arbitrary intervals, not just between binades.

1 Like

That would be very cool!

One thing that I cannot quite understand about the method we use (as described in the paper) is why the values of the lower and upper bounds of the range should occur with 50% lower probability than all the other values in between.

I might see how/why it makes sense at in-range exponent boundaries, but why should it make sense for the lowest and highest values of the full range, and how does it make sense at the exponent boundary between subnormals and normals (these two have the same .ulp, which is not the case for any other exponents)?
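
For concreteness, this is the boundary I mean:

// The spacing (.ulp) is the same on both sides of the subnormal/normal boundary,
// unlike at every other exponent boundary:
let smallestNormal = Float.leastNormalMagnitude
print(smallestNormal.ulp == smallestNormal.nextDown.ulp) // true (both are 2^-149)
print(Float(2).ulp == Float(2).nextDown.ulp)             // false (2^-22 vs 2^-23)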


It's easier to explain with an example:

Let's say we only want to generate values within a single exponent range, the one with .ulp == 1, i.e. these values:

  8388608.0 == Float(1 << 23) // This has .ulp == 1.0, and so have all these:
  8388609.0 == Float(1 << 23).nextUp
  8388610.0 == Float(1 << 23).nextUp.nextUp
  8388611.0 == ...
        ...
 16777215.0 == Float(1 << 24).nextDown == Float((1 << 24) - 1)
 16777216.0 == Float(1 << 24) // (But this last one has ulp == 2.0 ...)

And we build a histogram (or rather just the lower and upper ends of a histogram, because the whole histogram would take up ((1 << 23) + 1) * 8 bytes == 64 MB + 8 bytes), like so:

func test() {
    var generator = WyRand()
    let floatCount = ((1 << 23) + 1) * 500 // Meaning about 500 samples/bin, no?
    var lowerBins = [Int](repeating: 0, count: 16)
    var upperBins = [Int](repeating: 0, count: 16)
    for i in 0 ..< floatCount {
        let v = Float.random(inBinades: Float(1 << 23) ... Float(1 << 24),
                             using: &generator)
        let binIndex = Int(v)
        if binIndex < (1 << 23) + 16 { lowerBins[binIndex - (1 << 23)] += 1 }
        if binIndex >= (1 << 24) - 15 { upperBins[binIndex - ((1 << 24) - 15)] += 1 }
        if i & ((1 << 28) &- 1) == 0 {
            print("Count down: ", (floatCount - (i + 1)) / (1 << 28), "…")
        }
    }
    print("Lower 16 bins:")
    for i in 0 ..< 16 { print(i + (1 << 23), lowerBins[i]) }
    print("Upper 16 bins:")
    for i in 0 ..< 16 { print(i + ((1 << 24) - 15), upperBins[i]) }
}

Then we can see what I mean:

Lower 16 bins:
8388608 220
8388609 475
8388610 507
8388611 483
8388612 477
8388613 521
...
...
Upper 16 bins:
...
...
16777211 485
16777212 478
16777213 529
16777214 467
16777215 508
16777216 242

The first and last value in the range will occur only half as often as the other values.

Why shouldn't all values occur about 500 times?

And I don't like that the method seems to be built around including the upper bound value; it would feel better (less of a special case) to start with a half-open range ..< and then tackle the special case of ... (the last value being special).

1 Like

That’s a question of what semantics we desire.

Imagine a true uniform distribution on the closed interval [1, 2].

Let’s choose a real number uniformly at random from that distribution, call it x.

Now if we round x to the nearest representable Float, each representable float (call it f) strictly inside the interval has a “basin” of width Float.ulpOfOne centered on it where if x falls in that basin, then x rounds to f.

But the values 1 and 2 each only have half of their basins within the interval, so they are half as likely to be generated as the floats between them.

Conversely, if we were drawing from the half-open interval [1, 2), then each float’s basin would be [f, f.nextUp), and they would all be equally likely (with 2 impossible).
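
In numbers, for the closed interval [1, 2]:

// Every Float in [1, 2) has spacing Float.ulpOfOne, so an interior value keeps
// its whole basin inside the interval, while each endpoint keeps only half of it.
let interiorBasin = Float.ulpOfOne     // 2^-23
let endpointBasin = Float.ulpOfOne / 2 // half as wide, so half as likely
print(endpointBasin / interiorBasin)   // 0.5 — the ratio seen in the histogram above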

1 Like

Thanks, that cleared it up for me (I thought it had to do with the different distances between representable values). I guess it also makes sense when we have a range that e.g. spans symmetrically over negative and positive values. We can then set the sign bit (depending on a random bit) and the -0s and +0s won't add up to make zero twice as probable as it should be.

Yes, and I think the half open range feels like a simpler base case to build a more general version on.

This also implies that the following would compile and return any finite non-negative Float value:

let anyNonNegativeFiniteValue = Float.random(in: 0 ..< .infinity)

but the current implementation results in this (IMO invalid) runtime error:

Fatal error: There is no uniform distribution on an infinite range

That range is not infinite; it is the finite range of all values greater than or equal to zero and less than infinity (Float.infinity.nextDown == Float.greatestFiniteMagnitude). It is the same range as this:

let anyNonNegativeFiniteValue = Float.random(in: 0 ... .greatestFiniteMagnitude)

(which runs without any runtime error.)
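
Those relationships are easy to verify:

// The facts the argument relies on:
assert(Float.infinity.nextDown == Float.greatestFiniteMagnitude)
assert(Float.greatestFiniteMagnitude.nextUp == .infinity)
assert((0 ..< Float.infinity).contains(.greatestFiniteMagnitude))
assert(!(0 ..< Float.infinity).contains(.infinity))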


So (@scanon and @Nevin), am I correct in assuming that an improved
Float.random(in:using:)
should accept and behave identically for e.g.:

let anyNonNegativeFiniteValue1 = Float.random(in: 0 ..< .infinity)
let anyNonNegativeFiniteValue2 = Float.random(in: 0 ... .greatestFiniteMagnitude)
let anyFiniteValue1 = Float.random(in: -.greatestFiniteMagnitude ..< .infinity)
let anyFiniteValue2 = Float.random(in: -.greatestFiniteMagnitude ... .greatestFiniteMagnitude)

?

3 Likes

Another thing regarding this: I'm actually not sure I understand why 1 should have twice the probability of being chosen when we draw from [1, 2) as when we draw from [1, 2].
Or put differently, why should the basin of each Float value change between (A):

   •     •     •     •     •     •     •     •
⎣____⎠⎣____⎠⎣____⎠
   |     |     |
   |     |     '--- v.nextUp
   |     '----------v
   '----------------v.nextDown

and (B):

   •     •     •     •     •     •     •     •
   ⎣____⎠⎣____⎠⎣____⎠
   |     |     |
   |     |     '--- v.nextUp
   |     '----------v
   '----------------v.nextDown

as we change between
(A) drawing from a closed interval
and
(B) drawing from a half open interval.


Here's how I think about it:

The boundaries of the interval we are drawing from can only be specified using representable Float values, so drawing from
x ..< y should essentially be equivalent to drawing from
x ... y.nextDown
and vice versa; drawing from
x ... y should essentially be equivalent to drawing from
x ..< y.nextUp
so in the following code example (assuming the improved Float.random(in:using:)), the precondition should be satisfied:

var (prngA, prngB) = (MyPrng(seed: 1234), MyPrng(seed: 1234))
for _ in 0 ..< 10_000_000 {
    let rA = Float.random(in: 5 ..< 6, using: &prngA)
    let rB = Float.random(in: 5 ... 6.nextDown, using: &prngB)
    precondition(rA == rB)
}

which means the ClosedRange<Float>-variant could be implemented by just letting it call the Range<Float> variant with upperBound.nextUp.
(remember that Float.infinity.nextDown == Float.greatestFiniteMagnitude and Float.greatestFiniteMagnitude.nextUp == Float.infinity, and thus .infinity would be a valid upper bound only for the Range<Float>-variant.)
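
For illustration, the mapping I have in mind (a hypothetical helper, just to show the transformation):

// Maps x ... y to the equivalent half-open range x ..< y.nextUp.
func halfOpenEquivalent(of range: ClosedRange<Float>) -> Range<Float> {
    return range.lowerBound ..< range.upperBound.nextUp
}

let r = halfOpenEquivalent(of: 0 ... .greatestFiniteMagnitude)
print(r) // 0.0..<inf — .infinity only ever appears as an (excluded) upper bound.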

I guess I'm missing something?

1 Like

Excellent thoughts Jens, I’m glad you brought this up. I’ve actually been working on an implementation, and the first draft is done and working. Once I finish adding documentation, I’m going to post it and ask for feedback (especially from Steve).

And yes, I had many of the same thoughts as you regarding the implementation details. I intend to start an Evolution Discussion thread to ask for input.

My first draft only handles Range, not ClosedRange, though as you note the latter could be built from the former if we choose compatible semantics. It currently behaves as “Pick a real number in the range and round down to the next representable value”. And it does indeed work with an upper bound of infinity (treating it as one ulp beyond greatestFiniteMagnitude).

As for small ranges, the test case that really made me think was a range just 2 ulps wide, entirely within a single binade, such as 1.5.nextDown ..< 1.5.nextUp. The current documentation and implementation in the Standard Library make the middle value (1.5) occur twice as often as the lower bound, i.e. probabilities of [1/3, 2/3]. (And for a closed range, the middle value also occurs twice as often as the upper bound, i.e. [1/4, 1/2, 1/4].)
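
This is easy to check empirically against the current Standard Library; a quick run of something like the following should show roughly that 1:2 split:

// Current stdlib behavior on a 2-ulp-wide half-open range: per the description
// above, expect the lower bound about half as often as the middle value.
var counts: [Float: Int] = [:]
for _ in 0 ..< 300_000 {
    counts[Float.random(in: Float(1.5).nextDown ..< Float(1.5).nextUp), default: 0] += 1
}
print(counts.sorted { $0.key < $1.key })
// Expected shape (approximately): [1.5.nextDown: 100_000, 1.5: 200_000]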

I think that is unintuitive and generally not what people want or expect. There are 3 “simple” alternatives we could choose:

  1. p(x) is proportional to x.ulp
  2. p(x) is proportional to x.nextUp - x
  3. p(x) is proportional to (x.nextUp - x.nextDown) / 2

For the vast majority of numbers, these are all equivalent. The differences only show up at the boundaries of binades, specifically for non-zero values with 0 significand.

Now, #1 has the advantage that all values with the same raw exponent, have the same probability of occurring. It seems conceptually the simplest to explain.

#2 is the “obvious” answer for half-open ranges, namely choosing a real number at random then rounding down to a representable value. However it makes negative numbers with 0 significand half as likely as their positive counterparts.

And #3 embodies the spirit of “round to nearest”, while avoiding edge effects by treating each bound as if its whole “basin of attraction” were included. However, it makes all non-zero numbers with 0 significand 75% as likely as the other numbers with the same raw exponent.
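
To see where the three weightings differ, evaluate them at a binade boundary and compare with an interior value:

let x: Float = 2                     // a binade boundary
let w1 = x.ulp                       // #1: 2^-22
let w2 = x.nextUp - x                // #2: 2^-22 (same as #1 at +2)
let w3 = (x.nextUp - x.nextDown) / 2 // #3: 1.5 * 2^-23, i.e. 75% of w1
print(w1, w2, w3)

let y: Float = -2
print(y.nextUp - y)                  // #2 at -2: 2^-23, half of its value at +2
// At an interior value such as 3.0, all three weights equal Float(3).ulp == 2^-22.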

In every case, we could treat infinity as one ulp beyond greatestFiniteMagnitude, and also never produce negative zero.

1 Like

0 ..< .infinity is an infinite range. The uniform distribution in question is a uniform distribution of real numbers, not a uniform distribution of IEEE-754 values. If you want 0 ... .greatestFiniteMagnitude, why wouldn't you say that?

EDIT: for a finite example, consider choosing a random uniformly distributed decimal number in 0..<1 with one digit of precision, vs. 0...0.9. There are a lot of complications to consider, but you wouldn't expect the latter range to have nearly as many 0.9s in it as the former range.
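
The analogy can even be simulated directly:

// Rounding *down* from 0..<1 to one decimal digit gives every digit a full-width
// bin, while rounding to *nearest* from 0...0.9 gives the endpoint digits (0 and 9)
// only half-width bins:
var roundDown = [Int](repeating: 0, count: 10)
var roundNearest = [Int](repeating: 0, count: 10)
for _ in 0 ..< 90_000 {
    roundDown[Int(Double.random(in: 0 ..< 1) * 10)] += 1
    roundNearest[Int((Double.random(in: 0 ... 0.9) * 10).rounded())] += 1
}
print(roundDown)    // every bin ≈ 9_000
print(roundNearest) // bins 0 and 9 ≈ 5_000, the rest ≈ 10_000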

3 Likes

…because I haven’t finished implementing the closed-range version of random. :-)