Improving Float.random(in:using:)

I ran a quick performance test, which shows that the new version is indeed faster.

Test code
extension BinaryFloatingPoint where
    RawSignificand: FixedWidthInteger,
    RawExponent: FixedWidthInteger
{
    static func test(range: ClosedRange<Self>) {
        var generator = WyRand(state: 1234567890123)
        var cs = RawSignificand(0)
        let t0 = DispatchTime.now().uptimeNanoseconds
        for _ in 0 ..< 1024*1024*8 {
            let f = uniformRandom(in: range, using: &generator)
            cs ^= f.significandBitPattern
            cs ^= RawSignificand(f.exponentBitPattern)
        }
        let t1 = DispatchTime.now().uptimeNanoseconds
        print(" ", Double(t1-t0)/1e9, "seconds (checksum: \(cs))")
    }
    static func test() {
        print(self)
        test(range: -greatestFiniteMagnitude ... greatestFiniteMagnitude)
        test(range: 0 ... greatestFiniteMagnitude)
        test(range: -greatestFiniteMagnitude ... 0)
        test(range: Self(0) ... Self(1024).nextDown)
    }
}
Float32.test()
Float64.test()
Float80.test()
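
(The test code above assumes the WyRand generator defined upthread. For anyone who wants to run it standalone, here's a typical Swift port of wyrand, with the constants from the reference wyhash implementation; it may differ from the exact version used upthread.)

```swift
// A typical Swift port of wyrand, reproduced here only so the test code
// above is self-contained; presumably equivalent to the WyRand used upthread.
struct WyRand: RandomNumberGenerator {
    var state: UInt64
    mutating func next() -> UInt64 {
        state &+= 0xa0761d6478bd642f
        // 64x64 -> 128-bit multiply, folded by xoring the halves ("mum").
        let product = state.multipliedFullWidth(by: state ^ 0xe7037ed1a0b428db)
        return product.high ^ product.low
    }
}
```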

Test results

Latest version:

Float
  0.257602232 seconds (checksum: 138254)
  0.241238388 seconds (checksum: 3957719)
  0.262408154 seconds (checksum: 3631585)
  0.241221168 seconds (checksum: 3957703)
Double
  0.283161892 seconds (checksum: 2981475560360739)
  0.257376181 seconds (checksum: 315125084029635)
  0.272264235 seconds (checksum: 759283211863005)
  0.254052645 seconds (checksum: 315125084030247)
Float80
  1.322677751 seconds (checksum: 1370150508309074436)
  1.298046256 seconds (checksum: 8536050357597743157)
  9.555791118 seconds (checksum: 4117734076763979444)
  1.287667337 seconds (checksum: 8536050357597755341)

Previous version:

Float
  0.290868295 seconds (checksum: 138254)
  0.268503795 seconds (checksum: 3957719)
  0.29267472 seconds (checksum: 3631585)
  0.27050553 seconds (checksum: 3957703)
Double
  0.339597436 seconds (checksum: 2981475560361275)
  0.324039793 seconds (checksum: 315125084030973)
  0.338665427 seconds (checksum: 759283211851814)
  0.317372234 seconds (checksum: 315125084028953)
Float80
  1.635097467 seconds (checksum: 1370150508312549190)
  1.630609227 seconds (checksum: 8536050357585630166)
  9.768561795 seconds (checksum: 4117734076772366346)
  1.600339114 seconds (checksum: 8536050357585620014)

Any idea why the third range (-greatestFiniteMagnitude ... 0) is so slow for Float80?
(I didn't notice this before but that range is slow for both the latest and the previous version.)


Another possible optimization (which I guess you might've already thought about): I think it could be about 4 times faster for ranges that span whole "raw exponent binades" exactly, by using the method described in the paper, which we tried upthread (i.e. choosing the raw exponent by counting consecutive trailing/leading zero bits, then filling the raw significand with random bits).
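
For reference, that method could be sketched roughly like this. This is only a minimal sketch for Float in [0, 1), not the generic BinaryFloatingPoint version, and it rounds the astronomically unlikely deep-subnormal tail to zero; uniformUnitFloat and SplitMix64 are names I made up for the illustration:

```swift
// Sketch of the binade method: choose the binade geometrically by
// counting leading zero bits of the random stream, then fill the
// significand with fresh random bits.
func uniformUnitFloat<G: RandomNumberGenerator>(using generator: inout G) -> Float {
    // Each additional leading zero halves the probability, matching the
    // 2^-(k+1) total measure of the binade [2^-(k+1), 2^-k).
    var exponent = 0
    while true {
        let bits = generator.next()
        if bits != 0 {
            exponent += bits.leadingZeroBitCount
            break
        }
        exponent += 64
        // Below Float's subnormal range; round to 0 (simplification).
        if exponent > 149 { return 0 }
    }
    // 23 fresh random bits for the significand; magnitude is in [1, 2).
    let significandBits = UInt32(truncatingIfNeeded: generator.next()) & 0x7F_FFFF
    let magnitude = Float(significandBits) * .ulpOfOne + 1
    // Scale into the chosen binade [2^-(exponent+1), 2^-exponent).
    return Float(sign: .plus, exponent: -(exponent + 1), significand: magnitude)
}

// Minimal deterministic generator, just so the sketch is self-contained.
struct SplitMix64: RandomNumberGenerator {
    var state: UInt64
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}
```

The point is that a whole-binade range needs no rejection step at all: one geometric draw for the exponent plus one draw for the significand bits, both cheap.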