Improving Float.random(in:using:)

Jens · March 4, 2020, 10:20pm

I'm probably missing something, but for any Float values x and y, I think the following precondition holds:

let a = x < y
let b = x <= y.nextDown
precondition(a == b)

But anyway, I'm willing to stop bothering everyone with my novice questions and speculations and leave this to the experts : ).

xwu · March 4, 2020, 10:42pm

Jens:

I'm probably missing something, but for any Float values x and y , I think the following precondition holds:
let a = x < y
let b = x <= y.nextDown
precondition(a == b)
But I'm willing to stop bothering everyone with my novice questions and speculations and leave this to the experts : ).

Let's step back a minute and consider the distinction again discussed by @scanon:

scanon:

You guys are collectively dancing around an important semantic distinction in "what does it mean to select a random floating-point number?"

It can mean "select a random real number according to a specified distribution, then round that real number to a representable floating-point value according to a specified rounding rule."

Or it can mean "select a random floating-point value according to a specified discrete distribution."

Both of these are perfectly legitimate viewpoints, but the choice of viewpoint (and the choice of rounding rule in the first case) lead to different results.

If you adopt the first viewpoint, random(in: 0 ..< .infinity) is obviously ill-formed. If you adopt the second viewpoint, you can make it well-formed if you squint hard enough.

My opinion is that the first viewpoint is the sound and useful one. It matches the semantics of other IEEE 754 operations much more closely.

Years ago (had I more time, I would link to the post after trawling the search function), I asked the core team a question: is it the intention of Swift's floating-point types to model the reals, or to model the discrete and finite subset of the reals that are representable?

The answer given was that, indeed, what we are modeling is the reals. That is to say, the semantics of operations defined on these types will, whenever possible, be such that they give the closest representable value (by some rounding rule) to a real number computed with infinite precision. (Consider, for instance, the semantics of fused multiply-add.)

Now, of course, this modeling is by definition imperfect, since of course we have only finite precision and need to perform this work within a reasonable amount of time. But, imperfect or no, this is what we're modeling.

Consider the case of (1.0 as Float)...2.0. It is true that there are a finite number of values that are exactly representable within this range, but it models an interval containing uncountably many real numbers. It is not possible to write for i in (1.0 as Float)...2.0 { ... }; that is to say, we do not treat this as a "countable range." (Indeed, before conditional conformances, we had CountableRange as a separate type from Range.) Put another way, semantically, (1.0 as Float)...2.0 is not the same as (1.0 as Float)..<(2.0 as Float).nextUp because there are infinitely many real numbers contained in the latter but not the former; that they aren't representable is beside the point.

Jens · March 5, 2020, 12:08am

I get everything you’re saying except I would think both contained infinitely many real numbers. Why wouldn’t the closed range (or any interval that is not empty or degenerate) contain infinitely many real numbers, ie be uncountable?

Nevin · March 5, 2020, 12:28am

Parse it the other way Jens. He’s talking about the set of real numbers which are both inside the second range and outside the first range.

Jens · March 5, 2020, 12:36am

Oh, but in that case, isn’t the set of real numbers that are in [1,2] but not in [1,2), ie their difference, the degenerate set of the single number 2?

I should probably just go to sleep now.

EDIT: yup, I missed the .nextUp, sorry for the noise.

sveinhal · March 5, 2020, 6:50am

There are infinitely many reals in .greatestFiniteMagnitude..<.infinity

Although they are all rounded to .greatestFiniteMagnitude, there should be a lot more of them in to choose from.

However, .greatestFiniteMagnitude...(.greatestFiniteMagnitude) contains only one real.

Non?

Jens · March 5, 2020, 7:58am

Yes, and I think I see now what @scanon meant by

there is no such thing as a uniform distribution on [a, ∞). It's ill-formed

That is, even though the interval [a, ∞) is meaningful and denotes the set of all real numbers greater than or equal to a, it is not meaningful to talk about a uniform distribution on that interval.

Yes. That is a great way to show why the two are different (which I finally see now ). That is:
a ... a is clearly not the same as a ..< a.nextUp, because the former represents an interval that contains only a single real number, a, while the latter represents an interval that contains a and infinitely more real numbers. The only thing that is "the same" about them is that both happen to contain only one (and the same) number, a, that is exactly representable as a Float value, a.

And, for any

let r1 = a ... b
let r2 = a ..< b.nextUp

where a is less than b and both are finite Float values:

Both corresponding intervals [a, b] and [a, b.nextUp) contains infinitely many real numbers.
But [a, b.nextUp) contains infinitely many real numbers that [a, b] does not contain, namely those in the open interval (b, b.nextUp).
Both happen to contain exactly the same set of representable Float values.

Jens · March 5, 2020, 12:05pm

I and possibly other non-expert readers of this thread could probably use some illustrations.

Let's use a hypothetical tiny binary floating point type Float5.

Detailed description here.

1 sign bit, 2 exponent bits, 2 significand bits
exponent bias 1

0_00_00                        = 0.0
0_00_01 = 2**(0) * (0 + 1/4.0) = 0.25   least nonzero magnitude
0_00_10 = 2**(0) * (0 + 2/4.0) = 0.5
0_00_11 = 2**(0) * (0 + 3/4.0) = 0.75
0_01_00 = 2**(0) * (1 + 0/4.0) = 1      least normal magnitude
0_01_01 = 2**(0) * (1 + 1/4.0) = 1.25
0_01_10 = 2**(0) * (1 + 2/4.0) = 1.5
0_01_11 = 2**(0) * (1 + 3/4.0) = 1.75
0_10_00 = 2**(1) * (1 + 0/4.0) = 2.0
0_10_01 = 2**(1) * (1 + 1/4.0) = 2.5
0_10_10 = 2**(1) * (1 + 2/4.0) = 3.0
0_10_11 = 2**(1) * (1 + 3/4.0) = 3.5    geatest finite magnitude
0_11_00                        = inf
0_11_01                        = snan
0_11_10                        = nan
0_11_11                        = nan
1_00_00                        = -0.0
1_00_01                        = -0.25
1_00_10                        = -0.5
1_00_11                        = -0.75
1_01_00                        = -1
1_01_01                        = -1.25
1_01_10                        = -1.5
1_01_11                        = -1.75
1_10_00                        = -2.0
1_10_01                        = -2.5
1_10_10                        = -3.0
1_10_11                        = -3.5
1_11_00                        = -inf
1_11_01                        = -snan
1_11_10                        = -nan
1_11_11                        = -nan

Not counting zero twice, it can represent 23 different finite values, the smallest nonzero being
0.25 (0b0_00_01) and the largest being 3.5 (0b0_10_11):

float5

Zooming in, and showing three different ranges:

let r1 = Float5(1) ..< Float5(2)
let r2 = Float5(1) ... Float5(2)
let r3 = Float5(1) ..< Float5(2).nextUp

as well as the "funnels" in which each real number that is drawn from any range will fall and be guided (rounded) to the closest representable Float5 value:

float5_2

Now, one can ask questions like:

What should happen with real values that fall exactly on the point where two funnels meet, ie half-way between each representable Float5 value?
(The standard way seems to be to choose the value with the most trailing zeroes in its bit pattern (ie tie to even?).)
- Does it make sense to let the rounding behavior on the negative side be a mirror-image of the positive side or not?
What should be valid and invalid ranges for (Float.random(in: using:)?
Does the funnels (as shown) / rounding even make sense for half-open ranges, for example r3 = 1 ..< 2.nextUp? I mean, what should happen with all the real numbers that are funneled to 2.5 (which the range excludes)?
What expectations should we have regarding the probabilities in various examples such as this:
```
let x: Float5 = someFloat5
let a = Float5.random(in: x ..< x.nextUp.nextUp.nextUp)
let b = Float5.random(in: a.upperBound ..< a.upperBound.nextUp.nextUp.nextUp)
let c = Float5.random(in: b.upperBound ..< b.upperBound.nextUp.nextUp.nextUp)

let abc = Float5.random(in: x ..< c.upperBound)
```
For example: I would expect that the probability of abc being some value v, is exactly the same probability as a, b or c being v.
Also, assuming all the three possible values of a have the same .ulp, should they have the same probability of being chosen? If so, does that also apply to a closed range that spans three possible representable values sharing the same ulp?

For half open ranges, rounding towards negative infinity as shown below seems to make more sense (assuming half open ranges makes sense at all in this context):

Jens · March 5, 2020, 2:00pm

Btw @Nevin: I noticed in your old code, that you use the following to check that the bounds are binades:

Assuming it is still relevant to your current implementation, and you haven't realized it yourself already: Note that the above precondition will not be satisfied for all binades, because there are subnormal binades, all of which has an exponent bit pattern of 0 and a significand bit pattern with a single bit set.

So AFAICS, to check that x > 0 is a binade / power of two, you should use eg x == x.binade instead.

Nevin · March 5, 2020, 3:24pm

I specifically care about numbers whose raw significand is identically 0.

Nevin · March 5, 2020, 3:49pm

Okay, here’s my first draft of an implementation: FloatingPointRandom.swift

It ended up being quite a bit longer than I anticipated, but I believe it works. I’ve only done basic sanity checks, but so far so good.

It’s written entirely in terms of basic integer operations (no multiplication or division), and at least conceptually I think the idea is reasonably straightforward. Hopefully someone can find a way to simplify the implementation (or a cleaner strategy overall).

• • •

For small ranges (basically, when the number of raw binades in the range is less than the number of spare bits in the RawSignificand, plus some edge cases), distance is counted in terms of the smallest ulp in the range, and a position chosen directly then converted back to a floating-point number. In this case, only a single random RawSignificand is needed.

For large ranges, the bounds are expanded to be equal, opposite, and have 0 significand (eg. ±2^k). That expanded range is divided into 2⁶⁴ equal sections, one of the sections that overlaps the original range is chosen, and then a value is selected within that section. If it’s inside the original range we’re done, otherwise try again.

For small types such as Float and Double, where the significandBitCount is noticeably less than 64 (realistically, below 60), there is a high likelihood that the entire section rounds to the same representable value, in which case we’re done and only a single random Int64 was needed.

Also, even in the large-range + large-type case where the range crosses 0, there are still only 2 sections (those with 0 as a bound) that require additional work beyond choosing a RawSignificand. All the other 2⁶⁴ - 2 sections fall within a single raw binade.

• • •

As an example, with Double (which has 52 significand bits), choosing a random number in -1.0 ..< 1.0 will use just a single random Int64 most of the time. Only 1 in 2048 times (when the result is within ±1/2048 of 0), does it need any additional work at all. And again the vast majority of the time, that extra work is just choosing a random RawSignificand (aka. UInt64.)

Even in a “maximally annoying” case like (-1.0 / 2048) ..< 0.5.nextUp, there is still only about a 1 in 512 chance of needing more than just the initial random Int64.

Jens · March 5, 2020, 8:52pm

Interesting approach!

I've done some quick testing and think I've found an issue: For some closed ranges it will never return the value of the upper bound.

And, of course (because of how the implementation works), this means that the corresponding open range has the corresponding issue.

It will behave as expected for eg Float(0.03125) ... Float(0.0625), but for eg
Float(0) ... Float(0.0625) it will never return 0.0625, even though that value (being a binade) should be twice as common as the values preceding it.

Here's a demonstration program:

func test() {
    var prng = WyRand()
    let range = Float(0.0) ... Float(0.0625)
    print("range.lowerBound.bitPattern:", range.lowerBound.bitPattern)
    print("range.upperBound.bitPattern:", range.upperBound.bitPattern)
    var hist = [UInt32 : UInt64]()
    var minValue = Float.infinity
    var maxValue = -Float.infinity
    for _ in 0 ..< 1024*1024*512 {
        let f = Float.uniformRandom(in: range, using: &prng)
        precondition(f >= range.lowerBound)
        precondition(f <= range.upperBound)
        minValue = min(minValue, f)
        maxValue = max(maxValue, f)
        // Store only the greatest 8 possible values in the histogram:
        if f.bitPattern > range.upperBound.bitPattern - 8 {
            hist[f.bitPattern, default: 0] &+= 1
        }
    }
    print("minValue:", minValue)
    print("maxValue:", maxValue)
    for key in hist.keys.sorted() {
        let v = hist[key]!
        print("bin for bit pattern \(key) has \(v) samples")
    }
}
test()

It will (after some time) print something like:

range.lowerBound.bitPattern: 0
range.upperBound.bitPattern: 1031798784
minValue: 6.0394516e-11
maxValue: 0.062499996
bin for bit pattern 1031798777 has 38 samples
bin for bit pattern 1031798778 has 33 samples
bin for bit pattern 1031798779 has 41 samples
bin for bit pattern 1031798780 has 30 samples
bin for bit pattern 1031798781 has 29 samples
bin for bit pattern 1031798782 has 27 samples
bin for bit pattern 1031798783 has 31 samples

Note that no samples at all ended up in the bin for Float(0.0625).bitPattern == 1031798784, although there should be on average 64.

(In this test 1024*1024*512 values are returned, so on average the number 0.0625 should be returned:
(Float(0.0625).ulp*1024*1024*512)/0.0625 == 64 times.
and the values preceding it should be returned:
(Float(0.0625).nextDown.ulp*1024*1024*512)/Float(0.0625).nextDown == 32 times.)

I've run this program many times, and used the system generator too, and the value 0.0625 has never been returned.

And I first noticed this issue while using my Float8 type for the testing. Using that type I can perform tests like this much faster and more exhaustively, although I don't fully trust it yet. But as demonstrated here, the issue seems to exist with Float too.

The test uses

the WyRand generator

/// The wyrand generator, translated from:
/// https://github.com/wangyi-fudan/wyhash
struct WyRand : RandomNumberGenerator {

    typealias State = UInt64

    private (set) var state: State

    init(seed: UInt64? = nil) {
        self.state = seed ?? UInt64.random(in: .min ... .max)
    }
    init(state: State) {
        // Every UInt64 value is a valid WeRand state, so no need for this
        // initializer to be failable, it will satisfy the requirement anyway.
        self.state = state
    }

    mutating func next() -> UInt64 {
        state &+= 0xa0761d6478bd642f
        let mul = state.multipliedFullWidth(by: state ^ 0xe7037ed1a0b428db)
        return mul.high ^ mul.low
    }
}

because we'd have to wait many times longer for the system generator.

Nevin · March 5, 2020, 9:02pm

Thanks for the report, I’ll take a closer look at my fenceposts and see what I can find.

Jens · March 5, 2020, 9:39pm

        // Find section numbers
        let low = a.integerPosition(maxExponent: e)
-       let high = b.nextDown.integerPosition(maxExponent: e)
+       let high = b.integerPosition(maxExponent: e)

Maybe?

Nevin · March 5, 2020, 9:43pm

I think you’re right, let me double-check that it won’t mess anything up further down.

Nevin · March 5, 2020, 10:06pm

Okay, the issue is when b is exactly the lower bound of a section, in which case we want to subtract 1. To detect that…I guess this works:

let high = max(b.nextDown.integerPosition(maxExponent: e), b.integerPosition(maxExponent: e) &- 1)

I guess technically we don’t need to do anything more complicated than what you wrote, but I’m trying to avoid unnecessarily landing in a section that doesn’t intersect the target range.

Jens · March 5, 2020, 10:28pm

OK, I've added that fix to my copy and done some more testing.

I think there are two more issues, for ranges that span from neg to pos:

It returns both -0 and +0, as if they were separate values, leading to zero being twice as probable as it should be.
It will never return the lowest value(s).

Unless there's something wrong with my Float8 implementation.

I have this program:

func test2() {
    var prng = WyRand()
    let range = -Float8(0.25) ... Float8(0.25)
    var hist = [UInt8 : UInt64]()
    for _ in 0 ..< 1024*1024*16 {
        let f = Float8.uniformRandom(in: range, using: &prng)
        precondition(f >= range.lowerBound)
        precondition(f <= range.upperBound)
        hist[f.bitPattern, default: 0] &+= 1
    }
    for (v, c) in hist
        .map( { (k, v) in (Float8(bitPattern: k), v) })
        .sorted(by: { $0.0 < $1.0 })
    {
        print(v.leftPadded(to: 12), c.leftPadded(to: 12))
    }
}

which prints:

    -0.21875       508187
   -0.203125       509107
     -0.1875       508170
   -0.171875       508190
    -0.15625       509122
   -0.140625       506923
      -0.125       508131
  -0.1171875       254522
   -0.109375       254622
  -0.1015625       253875
    -0.09375       254150
  -0.0859375       254467
   -0.078125       255003
  -0.0703125       254489
     -0.0625       253645
 -0.05859375       126968
  -0.0546875       126530
 -0.05078125       127365
   -0.046875       127228
 -0.04296875       126859
  -0.0390625       127214
 -0.03515625       126985
    -0.03125       127250
-0.029296875        63996
 -0.02734375        63541
-0.025390625        63285
  -0.0234375        64074
-0.021484375        63689
 -0.01953125        63317
-0.017578125        63752
   -0.015625        63303
-0.013671875        63612
 -0.01171875        63768
-0.009765625        63651
  -0.0078125        63658
-0.005859375        63417
 -0.00390625        63388
-0.001953125        63283
         0.0        63822
        -0.0        63690
 0.001953125        63852
  0.00390625        63022
 0.005859375        63883
   0.0078125        63613
 0.009765625        63336
  0.01171875        63880
 0.013671875        63492
    0.015625        63690
 0.017578125        63412
  0.01953125        63818
 0.021484375        63175
   0.0234375        64183
 0.025390625        63378
  0.02734375        63788
 0.029296875        63606
     0.03125       127193
  0.03515625       126783
   0.0390625       126845
  0.04296875       127229
    0.046875       127361
  0.05078125       127118
   0.0546875       126800
  0.05859375       127299
      0.0625       254559
   0.0703125       254697
    0.078125       253568
   0.0859375       254446
     0.09375       253203
   0.1015625       253773
    0.109375       254352
   0.1171875       254890
       0.125       508204
    0.140625       507725
     0.15625       509499
    0.171875       509270
      0.1875       508547
    0.203125       508257
     0.21875       508844
    0.234375       507503
        0.25      1014875

Note that the expected first two values -0.25 and -0.234375 are missing, and that both -0 and 0 are there as separate values.

Nevin · March 5, 2020, 10:35pm

Thanks for the report, can you check which codepath it’s following, small range or large range? (ie. Does it enter the if let x = smallRangeUniformRandom block?)

Jens · March 5, 2020, 11:06pm

No, the test will not trap with the following change:

        // If the size of the range, counted in terms of the smallest ulp in the
        // range, fits in a RawSignificand, then we can choose a value directly.
        if let x = smallRangeUniformRandom(in: range, using: &generator) {
            fatalError() // <---
            return x
        }

FWIW, I've now "verified" that the same two issues exist with Float too (and not just with Float8).

And here's the Float8 implementation which should be all that's needed to compile and run the test.

EDIT: I'm not sure what's with this post (it's showing as half-transparent for me). I've now deleted and reposted it.

Nevin · March 5, 2020, 11:10pm

Thanks, looks like there’s a fencepost issue in lowerBound(ofSection:), I’ll see if I can get it straightened out.