Improving Float.random(in:using:)

That’s a question of what semantics we desire.

Imagine a true uniform distribution on the closed interval [1, 2].

Let’s choose a real number uniformly at random from that distribution, call it x.

Now, if we round x to the nearest representable Float, each representable float (call it f) strictly inside the interval has a “basin” of width Float.ulpOfOne centered on it: if x falls in that basin, then x rounds to f.

But the values 1 and 2 each only have half of their basins within the interval, so they are half as likely to be generated as the floats between them.

Conversely, if we were drawing from the half-open interval [1, 2) and rounding down, then each float’s basin would be [f, f.nextUp), and they would all be equally likely (with 2 impossible).
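
To make the two basins concrete, here's a quick sketch (the helper names are illustrative, not stdlib API):

// Round-to-nearest: an interior float f owns [f - f.ulp/2, f + f.ulp/2).
func nearestBasinWidth(_ f: Float) -> Float {
    (f.nextUp - f.nextDown) / 2
}
// Round-down: every float f owns [f, f.nextUp).
func roundDownBasinWidth(_ f: Float) -> Float {
    f.nextUp - f
}

print(nearestBasinWidth(1.5) == .ulpOfOne)    // true: one full ulp wide
print(roundDownBasinWidth(1.0) == .ulpOfOne)  // true
// The endpoints 1 and 2 keep only half of their round-to-nearest basins
// inside [1, 2], so they come up half as often as the interior values.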

1 Like

Thanks, that cleared it up for me (I thought it had to do with the different distance between represented values). I guess it also makes sense when we have a range that, e.g., spans symmetrically over negative and positive values. We can then set the sign bit (depending on a random bit), and the -0s and 0s won't add up to make zero twice as probable as it should be.

Yes, and I think the half open range feels like a simpler base case to build a more general version on.

This also implies that the following should compile and be able to return any non-negative finite Float value:

let anyNonNegativeFiniteValue = Float.random(in: 0 ..< .infinity)

but the current implementation results in this (IMO invalid) runtime error:

Fatal error: There is no uniform distribution on an infinite range

That range is not infinite; it is the finite range of all values greater than or equal to zero and less than infinity (recall that Float.infinity.nextDown == Float.greatestFiniteMagnitude). It is the same range as this:

let anyNonNegativeFiniteValue = Float.random(in: 0 ... .greatestFiniteMagnitude)

(which runs without any runtime error).
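
For reference, a quick check confirms the identities this argument relies on:

print(Float.infinity.nextDown == Float.greatestFiniteMagnitude)  // true
print(Float.greatestFiniteMagnitude.nextUp == Float.infinity)    // true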


So (@scanon and @Nevin), am I correct in assuming that an improved
Float.random(in:using:)
should accept, and behave identically for, e.g.:

let anyNonNegativeFiniteValue1 = Float.random(in: 0 ..< .infinity)
let anyNonNegativeFiniteValue2 = Float.random(in: 0 ... .greatestFiniteMagnitude)
let anyFiniteValue1 = Float.random(in: -.greatestFiniteMagnitude ..< .infinity)
let anyFiniteValue2 = Float.random(in: -.greatestFiniteMagnitude ... .greatestFiniteMagnitude)

?

3 Likes

Another thing regarding this: I'm actually not sure I understand why 1 should have twice the probability of being chosen when we draw from [1, 2) as when we draw from [1, 2].
Or put differently, why should the basin of each Float value change between (A):

   •     •     •     •     •     •     •     •
⎣____⎠⎣____⎠⎣____⎠
   |     |     |
   |     |     '--- v.nextUp
   |     '----------v
   '----------------v.nextDown

and (B):

   •     •     •     •     •     •     •     •
   ⎣____⎠⎣____⎠⎣____⎠
   |     |     |
   |     |     '--- v.nextUp
   |     '----------v
   '----------------v.nextDown

as we change between
(A) drawing from a closed interval
and
(B) drawing from a half open interval.


Here's how I think about it:

The boundaries of the interval we are drawing from can only be specified using representable Float values, so drawing from
x ..< y should essentially be equivalent to drawing from
x ... y.nextDown
and vice versa; drawing from
x ... y should essentially be equivalent to drawing from
x ..< y.nextUp
so in the following code example (assuming the improved Float.random(in:using:)), the precondition should be satisfied:

// MyPrng here stands for any seedable RandomNumberGenerator.
var prngA = MyPrng(seed: 1234)
var prngB = MyPrng(seed: 1234)
for _ in 0 ..< 10_000_000 {
    let rA = Float.random(in: 5 ..< 6, using: &prngA)
    let rB = Float.random(in: 5 ... (6 as Float).nextDown, using: &prngB)
    precondition(rA == rB)
}

which means the ClosedRange<Float>-variant could be implemented by just letting it call the Range<Float> variant with upperBound.nextUp.
(remember that Float.infinity.nextDown == Float.greatestFiniteMagnitude and Float.greatestFiniteMagnitude.nextUp == Float.infinity, and thus .infinity would be a valid upper bound only for the Range<Float>-variant.)
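
A minimal sketch of that delegation (improvedRandom is a hypothetical name; the Range<Float> variant is assumed to exist with the semantics discussed above):

// Hypothetical: delegate the closed-range case to an assumed improved
// half-open-range implementation by widening the upper bound one ulp.
func improvedRandom<G: RandomNumberGenerator>(
    in range: ClosedRange<Float>, using generator: inout G
) -> Float {
    // greatestFiniteMagnitude.nextUp == .infinity, so 0 ... .greatestFiniteMagnitude
    // becomes 0 ..< .infinity here, which the half-open variant is assumed to accept.
    improvedRandom(in: range.lowerBound ..< range.upperBound.nextUp,
                   using: &generator)
}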

I guess I'm missing something?

1 Like

Excellent thoughts Jens, I’m glad you brought this up. I’ve actually been working on an implementation, and the first draft is done and working. Once I finish adding documentation, I’m going to post it and ask for feedback (especially from Steve).

And yes, I had many of the same thoughts as you regarding the implementation details. I intend to start an Evolution Discussion thread to ask for input.

My first draft only handles Range, not ClosedRange, though as you note the latter could be built from the former if we choose compatible semantics. It currently behaves as, “Pick a real number in the range and round down to the next representable value”. And it does indeed work with an upper bound of infinity (treating it as one ulp beyond greatestFiniteMagnitude).

As for small ranges, the test case that really made me think was a range just 2 ulps wide, entirely within a single binade, such as 1.5.nextDown ..< 1.5.nextUp. The current documentation and implementation in the Standard Library make the middle value (1.5) occur twice as often as the lower bound, i.e. probabilities of [1/3, 2/3]. (And for a closed range, the middle value also occurs twice as often as the upper bound, i.e. [1/4, 1/2, 1/4].)
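
That behavior is easy to check empirically against the current Standard Library (a quick sketch; counts are approximate):

let lo = Float(1.5).nextDown
let hi = Float(1.5).nextUp
var counts: [Float: Int] = [:]
for _ in 0 ..< 1_000_000 {
    counts[Float.random(in: lo ..< hi), default: 0] += 1
}
// Per the analysis above, 1.5 should occur roughly twice as often as
// 1.5.nextDown, i.e. approximately [1/3, 2/3].
print(counts)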

I think that is unintuitive and generally not what people want or expect. There are 3 “simple” alternatives we could choose:

  1. p(x) is proportional to x.ulp
  2. p(x) is proportional to x.nextUp - x
  3. p(x) is proportional to (x.nextUp - x.nextDown) / 2

For the vast majority of numbers, these are all equivalent. The differences only show up at the boundaries of binades, specifically for non-zero values with 0 significand.

Now, #1 has the advantage that all values with the same raw exponent have the same probability of occurring. It seems conceptually the simplest to explain.

#2 is the “obvious” answer for half-open ranges, namely choosing a real number at random and then rounding down to a representable value. However, it makes negative numbers with 0 significand half as likely as their positive counterparts.

And #3 embodies the spirit of “round to nearest”, while avoiding edge effects by treating each bound as if its whole “basin of attraction” were included. However, it makes all non-zero numbers with 0 significand 75% as likely as the other numbers with the same raw exponent.
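
To see those edge effects concretely, here is a quick sketch evaluating the three candidate weights at the binade boundary ±2 (numbered to match the options above; printed values are approximate):

func weights(_ x: Float) -> (Float, Float, Float) {
    (x.ulp,                          // #1
     x.nextUp - x,                   // #2
     (x.nextUp - x.nextDown) / 2)    // #3
}
print(weights(2))   // (2.3841858e-07, 2.3841858e-07, 1.7881393e-07)
print(weights(-2))  // (2.3841858e-07, 1.1920929e-07, 1.7881393e-07)
// #1 gives ±2 the same weight as the rest of their binade; #2 gives -2
// half the weight of +2; #3 gives ±2 three quarters of the weight of
// their binade neighbors.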

In every case, we could treat infinity as one ulp beyond greatestFiniteMagnitude, and also never produce negative zero.

1 Like

0 ..< .infinity is an infinite range. The uniform distribution in question is a uniform distribution of real numbers, not a uniform distribution of IEEE-754 values. If you want 0 ... .greatestFiniteMagnitude, why wouldn't you say that?

EDIT: for a finite example, consider choosing a random uniformly distributed decimal number in 0..<1 with one digit of precision, vs. 0...0.9. There are a lot of complications to consider, but you wouldn't expect the latter range to have nearly as many 0.9s in it as the former range.
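
One way to make the arithmetic explicit (round-down for the half-open range, round-to-nearest for the closed one): drawing from 0..<1 gives 0.9 the basin [0.9, 1.0) and probability 1/10, while drawing from 0...0.9 gives it only the basin [0.85, 0.9] and probability 0.05/0.9 ≈ 1/18, nearly half as often.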

3 Likes

…because I haven’t finished implementing the closed-range version of random. :-)

It is a range specifying all real numbers greater than or equal to Float.zero and less than Float.infinity (not including Float.infinity, but including Float.infinity.nextDown == Float.greatestFiniteMagnitude). I.e., any representable Float value that is contained in that range is finite.

I would expect (an improved)
Float.random(in: 1 ..< 2, using: &prng)
to return any representable value in that range (including 1 and excluding 2), and given the exact same prng value (same state), I would expect the exact same result from:
Float.random(in: 1 ... 2.nextDown, using: &prng)
it follows from this that:
Float.random(in: 0 ..< .infinity, using: &prng)
should return any representable non-negative finite Float value (excluding inf), and be equivalent to:
Float.random(in: 0 ... .infinity.nextDown, using: &prng)
which is the same as:
Float.random(in: 0 ... .greatestFiniteMagnitude, using: &prng).

You guys are collectively dancing around an important semantic distinction in "what does it mean to select a random floating-point number?"

  • It can mean "select a random real number according to a specified distribution, then round that real number to a representable floating-point value according to a specified rounding rule."

  • Or it can mean "select a random floating-point value according to a specified discrete distribution."

Both of these are perfectly legitimate viewpoints, but the choice of viewpoint (and the choice of rounding rule in the first case) lead to different results.

If you adopt the first viewpoint, random(in: 0 ..< .infinity) is obviously ill-formed. If you adopt the second viewpoint, you can make it well-formed if you squint hard enough.

My opinion is that the first viewpoint is the sound and useful one. It matches the semantics of other IEEE 754 operations much more closely. Also, if you only have the first one, you can simulate the second behavior pretty easily (either by choosing the right rounding rule, or by using a distribution on bit patterns). Going the other way is more difficult and error prone.
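
As an illustration of the bit-pattern route (a sketch only, restricted to non-negative bounds, where Float bit patterns increase monotonically with value):

// Sketch: a discrete uniform over the representable Floats in [a, b),
// valid only for 0 <= a < b with b finite.
// (The -0.0 edge case is ignored for brevity.)
func discreteUniform<G: RandomNumberGenerator>(
    in range: Range<Float>, using generator: inout G
) -> Float {
    precondition(range.lowerBound >= 0 && range.upperBound.isFinite)
    return Float(bitPattern: UInt32.random(
        in: range.lowerBound.bitPattern ..< range.upperBound.bitPattern,
        using: &generator))
}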

The only gotcha with the first viewpoint is what happens at the end-points of the interval, but this is mostly a problem of API design and documentation, not numerics.

Ideally, we'll eventually have a more flexible API that lets you specify a rounding rule, but in the meantime for half-open intervals, the most explainable behavior is to select a random real number and round it down to a representable value, and for closed intervals the most explainable behavior is probably to select a random real number in [a, b.nextUp) and follow the first rule, with a little special handling when b is greatestFiniteMagnitude. There are a few other defensible options in the closed-range case, however.

Also note that this is a very subtle issue that no language has reached a truly satisfactory solution for yet (heck, most languages will happily return b, or even b.nextUp, when you ask for a random number in [a, b)).
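
As a quick demonstration of that classic failure mode, naive scaling can land exactly on the excluded bound:

// Naive a + (b - a) * u with u < 1 can still round up to b:
let a: Float = 1
let b: Float = 2
let u = Float(1).nextDown        // the largest Float strictly below 1
print(a + (b - a) * u == b)      // true: the sum rounds (ties-to-even) to 2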

8 Likes

If we consider the bounds of the range to be exactly the representable values provided, and round to nearest, then the behavior for small ranges (like the 2-ulp-wide example I gave earlier) is probably not ideal.

For example, with that viewpoint, choosing a random number from either [x-d, x) or [x, x+d) with 50% probability each gives a different distribution than choosing at random from [x-d, x+d).

If we consider the range to instead be slightly expanded so it includes the full basin of attraction for its bounds, then the behavior might better match expectations.

1 Like

For some reason, I'd prefer the default to be this:

  • It can mean "select a random real number according to a specified distribution, then round that real number to a representable floating-point value according to a specified rounding rule."

with the rounding rule being towards negative infinity.

Assuming this view, I have a hard time seeing how Float.zero ..< .infinity could be interpreted as containing any infinite values, and why
Float.random(in: 0 ..< .infinity)
should not be equivalent to
Float.random(in: 0 ... .greatestFiniteMagnitude),
just like
Float.random(in: 1 ..< 2)
should be equivalent to
Float.random(in: 1 ... 2.nextDown).

I think you added this in an edit while I was replying. This is essentially the strategy I was pursuing.

Because there is no such thing as a uniform distribution on [a, ∞). It's ill-formed.

Tyler Campbell also did a nice writeup on [0,1] specifically, based on a conversation with me, Alex Barnett, and Gerald Sussman a few years ago. My thinking has changed a little bit since then, but it's still one of the better write-ups.

3 Likes

Thanks for the link, I’ll have to give it an in-depth read, but at a glance it appears to take a very different approach than I’m using.

My version almost always uses either just one random RawSignificand (for small ranges), or that plus one random UInt64 (for large ranges). In rare cases near the ends of the range it needs to try again, but that should happen only a negligible fraction of the time.

Oh, well, I guess my mental model of Float.infinity is "the value after the greatest finite value" and not the actual mathematical ∞.

Same thing with Range<Float>: I don't see or use it as if it were an actual mathematical half-open interval of real numbers.

For example, I'd consider the following code to be perfectly well-formed:

if (Float.zero ..< Float.infinity).contains(x) { ... }

although I'd prefer

if x >= 0 && x.isFinite { ... }

One place where that mental model falls down for me: if BinaryFloatingPoint had a comparison function that worked across different floating-point types (like the one we have for BinaryInteger, totally implementable), Double.greatestFiniteMagnitude < Float.infinity should be true, but Double.greatestFiniteMagnitude <= Float.greatestFiniteMagnitude is definitely false.

EDIT: note that this applies to finite numbers as well: (1.0 as Float) < (1.0 as Double).nextUp < (1.0 as Float).nextUp.

Your example is well-formed for me too; it just doesn't represent the same range as 0 ... .greatestFiniteMagnitude.

My mental model doesn't place .infinity "close to" its preceding value, or even at, e.g., .greatestFiniteMagnitude + .greatestFiniteMagnitude.ulp, but rather infinitely far away from it, so the following is just as expected:

Double.greatestFiniteMagnitude < Double(Float.infinity) // true
Double.greatestFiniteMagnitude <= Double(Float.greatestFiniteMagnitude) // false

In what way? I.e., in what way do a and b not represent the same range here:

let a = Float.zero ..< Float.infinity
let b = Float.zero ... Float.greatestFiniteMagnitude

?
I mean, for example, a.contains(x) == b.contains(x) will hold for any x.
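
(A spot check over a few interesting values, for what it's worth:)

for x: Float in [0, 1, .greatestFiniteMagnitude, .infinity, -.infinity, .nan] {
    assert(a.contains(x) == b.contains(x))  // never fails
}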

I really don't see what is meant to be surprising or mental-model-breaking here: the distance between representable values around Float(1) is simply much greater than the distance between representable values around Double(1), i.e.

print(Float.ulpOfOne) // 1.1920929e-07
print(Double.ulpOfOne) // 2.220446049250313e-16

because Float has fewer bits in its significand than Double to spread over, e.g., [1, 2].

I'm not sure I can argue this point successfully, but having the same observable behavior does not make two things the same. Notification.Name and NSImage.Name have the same (instance) API, but that doesn't make them the same. And the very fact that we're considering making Float.random(in:using:) behave differently across the two types is evidence for this.

To your second comment, I consider the fundamental axiom of Range.contains(_:) to be lowerBound <= x < upperBound and ClosedRange.contains(_:) to be lowerBound <= x <= upperBound. So if there's behavior that differs between x < y and x <= y.nextDown, I would consider that to affect the interpretation of Range and ClosedRange, even if you can't actually get at that behavior with contains(_:).

1 Like