Creating an array of random numbers in Swift is slow

If we’re discussing Swift’s floating point random-in-range methods, then I would like to point out that the original proposal SE–202 did not fully specify the semantics, but the author said in the review thread that the intended behavior was to function as if a real number were chosen uniformly in the range, then rounded to a representable value.

This is further supported by the accompanying documentation comment in the standard library:

However those semantics were never implemented. The implementation instead selects among equally-spaced values, which leads to a few problems such as:

[SR-8798] BinaryFloatingPoint.random(in:) crashes on some valid ranges
[SR-12765] BinaryFloatingPoint.random(in:) cannot produce all values in range

Some years back I drafted a PR to address these issues and implement the intended uniform-real-then-round semantics. However that PR remains unmerged, perhaps in part because my implementation was rather complicated. I expect it should be possible to simplify the code, but I haven’t gone back to try.

Also worth mentioning that changes like this were expressly foreseen and expected, as the very same doc comment states:

3 Likes

Interesting challenge. Is the following algorithm conceptually correct to achieve the desired goal (obviously quite slow):

func random(from: Double, to: Double, step: Int = 1) -> Double {
    print("step #\(step), choose random from: \(from) to: \(to), delta: \(to - from)")
    if from == to { return from }
    let middle = (from + to)/2
    return if Bool.random() {
        random(from: from, to: middle, step: step + 1)
    } else {
        random(from: middle, to: to, step: step + 1)
    }
}

let result = random(from: 0, to: 1)
print("result: \(result)")

Example output:

step #1, choose random from: 0.0 to: 1.0, delta: 1.0
step #2, choose random from: 0.5 to: 1.0, delta: 0.5
step #3, choose random from: 0.75 to: 1.0, delta: 0.25
step #4, choose random from: 0.75 to: 0.875, delta: 0.125
step #5, choose random from: 0.75 to: 0.8125, delta: 0.0625
step #6, choose random from: 0.78125 to: 0.8125, delta: 0.03125
step #7, choose random from: 0.78125 to: 0.796875, delta: 0.015625
step #8, choose random from: 0.7890625 to: 0.796875, delta: 0.0078125
step #9, choose random from: 0.7890625 to: 0.79296875, delta: 0.00390625
step #10, choose random from: 0.791015625 to: 0.79296875, delta: 0.001953125
step #11, choose random from: 0.7919921875 to: 0.79296875, delta: 0.0009765625
step #12, choose random from: 0.79248046875 to: 0.79296875, delta: 0.00048828125
step #13, choose random from: 0.79248046875 to: 0.792724609375, delta: 0.000244140625
step #14, choose random from: 0.7926025390625 to: 0.792724609375, delta: 0.0001220703125
step #15, choose random from: 0.7926025390625 to: 0.79266357421875, delta: 6.103515625e-05
step #16, choose random from: 0.792633056640625 to: 0.79266357421875, delta: 3.0517578125e-05
step #17, choose random from: 0.792633056640625 to: 0.7926483154296875, delta: 1.52587890625e-05
step #18, choose random from: 0.7926406860351562 to: 0.7926483154296875, delta: 7.62939453125e-06
step #19, choose random from: 0.7926445007324219 to: 0.7926483154296875, delta: 3.814697265625e-06
step #20, choose random from: 0.7926464080810547 to: 0.7926483154296875, delta: 1.9073486328125e-06
step #21, choose random from: 0.7926473617553711 to: 0.7926483154296875, delta: 9.5367431640625e-07
step #22, choose random from: 0.7926473617553711 to: 0.7926478385925293, delta: 4.76837158203125e-07
step #23, choose random from: 0.7926476001739502 to: 0.7926478385925293, delta: 2.384185791015625e-07
step #24, choose random from: 0.7926477193832397 to: 0.7926478385925293, delta: 1.1920928955078125e-07
step #25, choose random from: 0.7926477789878845 to: 0.7926478385925293, delta: 5.960464477539063e-08
step #26, choose random from: 0.7926477789878845 to: 0.7926478087902069, delta: 2.9802322387695312e-08
step #27, choose random from: 0.7926477789878845 to: 0.7926477938890457, delta: 1.4901161193847656e-08
step #28, choose random from: 0.7926477864384651 to: 0.7926477938890457, delta: 7.450580596923828e-09
step #29, choose random from: 0.7926477901637554 to: 0.7926477938890457, delta: 3.725290298461914e-09
step #30, choose random from: 0.7926477920264006 to: 0.7926477938890457, delta: 1.862645149230957e-09
step #31, choose random from: 0.7926477929577231 to: 0.7926477938890457, delta: 9.313225746154785e-10
step #32, choose random from: 0.7926477929577231 to: 0.7926477934233844, delta: 4.656612873077393e-10
step #33, choose random from: 0.7926477931905538 to: 0.7926477934233844, delta: 2.3283064365386963e-10
step #34, choose random from: 0.7926477933069691 to: 0.7926477934233844, delta: 1.1641532182693481e-10
step #35, choose random from: 0.7926477933651768 to: 0.7926477934233844, delta: 5.820766091346741e-11
step #36, choose random from: 0.7926477933651768 to: 0.7926477933942806, delta: 2.9103830456733704e-11
step #37, choose random from: 0.7926477933797287 to: 0.7926477933942806, delta: 1.4551915228366852e-11
step #38, choose random from: 0.7926477933797287 to: 0.7926477933870046, delta: 7.275957614183426e-12
step #39, choose random from: 0.7926477933797287 to: 0.7926477933833667, delta: 3.637978807091713e-12
step #40, choose random from: 0.7926477933815477 to: 0.7926477933833667, delta: 1.8189894035458565e-12
step #41, choose random from: 0.7926477933824572 to: 0.7926477933833667, delta: 9.094947017729282e-13
step #42, choose random from: 0.7926477933824572 to: 0.7926477933829119, delta: 4.547473508864641e-13
step #43, choose random from: 0.7926477933824572 to: 0.7926477933826845, delta: 2.2737367544323206e-13
step #44, choose random from: 0.7926477933825709 to: 0.7926477933826845, delta: 1.1368683772161603e-13
step #45, choose random from: 0.7926477933825709 to: 0.7926477933826277, delta: 5.684341886080802e-14
step #46, choose random from: 0.7926477933825709 to: 0.7926477933825993, delta: 2.842170943040401e-14
step #47, choose random from: 0.7926477933825709 to: 0.7926477933825851, delta: 1.4210854715202004e-14
step #48, choose random from: 0.7926477933825709 to: 0.792647793382578, delta: 7.105427357601002e-15
step #49, choose random from: 0.7926477933825709 to: 0.7926477933825744, delta: 3.552713678800501e-15
step #50, choose random from: 0.7926477933825726 to: 0.7926477933825744, delta: 1.7763568394002505e-15
step #51, choose random from: 0.7926477933825726 to: 0.7926477933825735, delta: 8.881784197001252e-16
step #52, choose random from: 0.7926477933825731 to: 0.7926477933825735, delta: 4.440892098500626e-16
step #53, choose random from: 0.7926477933825731 to: 0.7926477933825733, delta: 2.220446049250313e-16
step #54, choose random from: 0.7926477933825732 to: 0.7926477933825733, delta: 1.1102230246251565e-16
step #55, choose random from: 0.7926477933825733 to: 0.7926477933825733, delta: 0.0
result: 0.7926477933825733
1 Like

Since this line is subject to the vagaries of floating-point rounding, it will not achieve uniformity in practice because the two sub-intervals may have slightly different lengths.

2 Likes

I'm a bit surprised to learn there aren't any PRNGs other than SystemRandomNumberGenerator readily available in the standard library nor any well known Swift packages (like Swift Numerics). Or am I just failing to find them?

If so, isn't this a very low hanging fruit to improve performance in Swift programs that don't require a cryptographically secure PRNG? Just replacing the system PRNG for a Mersenne Twister implementation I found off the internet grants me a ~12x speed increase in throughput (after initialization...), and it isn't a particularly fast algorithm.

I know there are Swift packages out there implementing popular PRNG algorithms, but verifying that PRNG output has good quality output is not super easy, so it's not quite the thing where I'd trust a random (ha!) library author to write a bug-free implementation.

Most of the fast modern PRNGs are pretty simple to review. It shouldn't take too much effort to show that an implementation of Xoshiro256++ is identical to the official C implementation, for instance.

@inlinable @_transparent
public func rotl(_ x: UInt64, _ k: Int) -> UInt64 {
  (x &<< k) | (x &>> (64 &- k))
}

@frozen
public struct Xoshiro256PlusPlus: RandomNumberGenerator {
  @usableFromInline
  var s: (UInt64, UInt64, UInt64, UInt64)
  @inlinable
  public mutating func next() -> UInt64 {
    let result = rotl(s.0 &+ s.3, 23) &+ s.0

    let t = s.1 << 17

    s.2 ^= s.0
    s.3 ^= s.1
    s.1 ^= s.2
    s.0 ^= s.3

    s.2 ^= t

    s.3 = rotl(s.3, 45)

    return result
  }
  @inlinable
  public init?(seed: (UInt64, UInt64, UInt64, UInt64)) {
    guard seed != (0, 0, 0, 0) else { return nil }
    s = seed
  }
  @inlinable
  public init(from rng: inout some RandomNumberGenerator) {
    var seed: (UInt64, UInt64, UInt64, UInt64)
    repeat {
      seed = (rng.next(), rng.next(), rng.next(), rng.next())
    } while seed == (0, 0, 0, 0)
    s == seed
  }
}

Original implementation (source):

static inline uint64_t rotl(const uint64_t x, int k) {
	return (x << k) | (x >> (64 - k));
}


static uint64_t s[4];

uint64_t next(void) {
	const uint64_t result = rotl(s[0] + s[3], 23) + s[0];

	const uint64_t t = s[1] << 17;

	s[2] ^= s[0];
	s[3] ^= s[1];
	s[1] ^= s[2];
	s[0] ^= s[3];

	s[2] ^= t;

	s[3] = rotl(s[3], 45);

	return result;
}

EDIT: Fixed typo, that's what I get for coding this from memory.

This reminded me of xkcd/2501 a lot :grinning_face_with_smiling_eyes:

Yes, if you know what you're looking for, this is a relatively minor nuisance. I think it took me about ten minutes from reading the thread to having a benchmark running with Mersenne Twister as the PRNG (I chose Mersenne Twister because that's what NumPy uses, or at least used to), based on a Swift 4 implementation I found in the first few Google search results.

But there is a big jump from realizing that Swift is slow here due to using a cryptographically secure PRNG to being able to compare a Xoshiro256++ Swift implementation against the official C implementation, or even knowing what Xoshiro is or how it compares to other PRNGs.

1 Like

Thank you everyone for your comments. There is definitely a lot that can be done in this area, especially for Swift. But now I have a better understanding of what Double.random is doing and why it can be slow compared to other random number generators. I wrote a post that benchmarks some of the methods discussed here and I compared the elapsed times for generating a random array of 100,000,00 elements. The table below summarizes the benchmark results. The Time column is the mean elapsed time in milliseconds. The Speedup column is how much faster WyRand uniform is compared to the other methods. The wyrand method was about 50x faster than the default Swift random. If you don't need cryptographically secure random numbers then using a custom random generator can definitely provide better performance.

Method Filename Time (ms) Speedup
Swift random rand.swift 3318 50.5x
LAPACK dlarnv lapack.swift 407 6.2x
Accelerate drand48 rand48.swift 185 2.8x
WyRand random wyrand.swift 81 1.2x
WyRand uniform wyuniform.swift 65 -
2 Likes

This part of the blog post is not quite correct. Specifically, the provided code can produce a value of 1.

Since Double has a 52-bit significand, its representable values in [263, 264) are spaced apart by 211 = 2048. Thus with the default rounding rule, any value in [264-1024, 264) will round up to 264 when converted to Double.

In particular, when r takes any of the 1024 integer values in (UInt64.max - 1023) ... UInt64.max, the resulting value of uniform will be exactly 1.

If you want to achieve the described semantics, then you need the conversion to Double to use either “round down” or “round toward zero”.

2 Likes

The usual technique to do "quick-n-dirty but unbiased and without the endpoint" 0..<1 sampling is (I'm writing this in the browser window and not testing it, so, ymmv, but conceptually this is right):

func double01(_ randomBits: UInt64) -> Double {
    let masked = randomBits & 0xf_ffff_ffff_ffff
    return Double(bitPattern: 1.0.bitPattern | masked) - 1
}

This will only produce dyadic fractions of the form n/2⁵², but it is uniform over those values, which makes it suitable for most casual use and quite fast.

10 Likes

The example below does indeed give a result of 1.0 so you're right about the code producing a value of 1.

let a = UInt64.max
let b = 18446744073709551616.0
let c = Double(a) / b
// c is 1.0

Using the nextUp property gives me the desired result as shown below. Is this a reasonable solution?

let a = UInt64.max
let b = Double(UInt64.max).nextUp
let c = Double(a) / b
// c is 0.99999999999999978

I tried your fancy function with the array example that I've been using in this thread for testing. I get the same performance compared to the following:

let uniform = Double(r) / Double(UInt64.max).nextUp

Have you tried compiling and running your example without @frozen and @inlinable? Does it make much of a performance difference?

That tells you that the bottleneck for your generator is either the throughput of your random bitstream generation, or the cost of writing the results to memory in your array, rather than the transformation that maps your random bits to doubles.

Performance aside, your proposed Double(r) / Double(UInt64.max).nextUp will never produce 1.0, but it does introduce a very slight bias (essentially any method that involves division by a non-power-of-two without rejection will have bias). For example, there are 2047 inputs r that produce the result 0.500000000014551915228366851806640625, but 2049 inputs r that produce the next larger double. The main advantage of the "usual" one I mentioned, and the reason people use it, is that it is dead simple to prove that it is free of bias.

4 Likes

Can you please explain these two lines in detail? I don't fully understand what is going on here but the 1.0.bitPattern looks like you get to a range of [1.0, 2.0) then minus 1 to get [0, 1).

let masked = randomBits & 0xf_ffff_ffff_ffff
return Double(bitPattern: 1.0.bitPattern | masked) - 1

Also, are there some references you can recommend I read to understand techniques like this?

The wyhash repo contains a wy2u01 function which is shown below.

// Convert any 64 bit pseudo random numbers to uniform distribution [0, 1)
// It can be combined with wyrand, wyhash64 or wyhash
static inline double wy2u01(uint64_t r) {
    const double _wynorm = 1.0 / (1ull<<52);
    return (r>>12) * _wynorm;
}

The Swift version is below. Is this equivalent to the function you posted?

func wy2u01(r: UInt64) -> Double {
    let _wynorm = 1.0 / Double(1 << 52)
    return Double(r >> 12) * _wynorm
}
2 Likes

Yes, that’s equivalent*. (You can write _wynorm as the literal 0x1.0p-52, by the way.)

(*) it uses the high-order 52 bits of the source, instead of the low-order 52 bits, so it won’t produce the same results from a deterministic bit stream source, but they have identical distributions.

3 Likes

That's the gist of it. 1.0.bitPattern is 0x3ff0_0000_0000_0000, so 1.0.bitPattern | masked is a uniform random integer between 0x3ff0_0000_0000_0000 (a.k.a 1.0) and 0x3fff_ffff_ffff_ffff (aka 2.0.nextDown). Subtracting 1 translates the interval to [0, 1), and is always exact by Sterbenz' lemma,¹ so it cannot introduce bias due to rounding. Thus, this gives us a uniform unbiased sample on the dyadic rationals 0/2⁵² ... (2⁵²-1)/2⁵².


¹ roughly, if a and b are positive and a/2 <= b <= 2a, then a - b is computed exactly in binary floating-point.

9 Likes

I'm curious about generating unit doubles.

https://prng.di.unimi.it recommends the C99 code(x >> 11) * 0x1.0p-53 (with x being a uint64_t). I translated it to Swift as

extension RandomNumberGenerator {
	@inlinable public mutating func unitDouble() -> Double {
		let x = next()
		return Double(x >> 11) * 0x1.0p-53
	}
}

Would one approach be preferable over the other in Swift?

That's also unbiased and does not generate 1.0, so will be fine for most casual use.

1 Like

There's a function/intrinstic that does what exactly you want: count the leading zeros of the random stream. The number of zeros then dictates the mantissa of the result.

That enables the "takes variable number of bits" route to generate every and all double numbers in [0,1). The gist is simple: count leading zeros, either until you hit a 1 or reach subnormal; then pick the next 52 bits as the significand. However, it'd be quite complicated if the goal were to request no more double/quad words than necessary from the RNG - as you can't return the leftover to it later!