When is the implementation of SE-0202: Random Unification allowed to change?

(EDIT: I have changed the title of this thread from "Will the Random API’s method for generating values of less than 8 bytes stay the same?" to "When is the implementation of SE-0202: Random Unification allowed to change?", since I agree with this post.)

As an example, if I ask for eight random bytes, via eight successive calls to a generator's next() or to UInt8.random(in:using:), then the generator will actually have to produce 64 bytes (eight UInt64 values), and 56 of those 64 bytes will be thrown away.

That is, the current implementation generates random UInt8, UInt16 and UInt32 values by calling the UInt64-returning next() method once for each requested value, using only the lower 1, 2 or 4 bytes (respectively) and throwing away the remaining 7, 6 or 4 bytes.
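One way to picture this behavior (a sketch, not the actual stdlib source): each narrower random value consumes one full next() call and keeps only the low-order bytes. The counting generator below is purely illustrative.

```swift
// A deterministic "generator" whose next() returns 0, 1, 2, ... so that
// it is easy to see how many next() calls each request consumes.
struct CountingGenerator: RandomNumberGenerator {
    var state: UInt64 = 0
    mutating func next() -> UInt64 {
        defer { state &+= 1 }
        return state
    }
}

var g = CountingGenerator()
// Each truncation models one narrower value: one full next() call is
// consumed per value, and the upper bytes are discarded.
let a = UInt8(truncatingIfNeeded: g.next())   // uses 1 of 8 bytes
let b = UInt16(truncatingIfNeeded: g.next())  // uses 2 of 8 bytes
let c = UInt32(truncatingIfNeeded: g.next())  // uses 4 of 8 bytes
print(a, b, c) // 0 1 2 — three next() calls for three values
```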

The same is true when asking for two Float values (32 bits each, 64 bits in total): the generator currently has to produce two 64-bit values rather than just one.

Is this guaranteed to stay the same or should we be prepared to handle a change?

In cases where we use a custom pseudo random number generator and we want a specific repeatable result given a certain seed, this is important to know.

The _fill(bytes:) requirement was added by @lorentey in 12a2b32 and 54b3b8b.

Pending a proposal amendment by @Alejandro, custom generators could implement _fill(bytes:) by using a _UIntBuffer (or similar) to store unused bytes. But the default implementation would remain unchanged.
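The buffering idea could look something like the following hypothetical sketch (the wrapper, its names, and its structure are illustrative assumptions, not the proposed stdlib design): unused bytes of each next() call are stored and handed out on subsequent narrower requests.

```swift
// A deterministic base generator whose successive next() calls pack the
// byte sequence 0, 1, 2, ... little-endian, so buffering is easy to observe.
struct FixedGenerator: RandomNumberGenerator {
    var value: UInt64 = 0x0706050403020100
    mutating func next() -> UInt64 {
        defer { value &+= 0x0808080808080808 }
        return value
    }
}

// Hypothetical buffering wrapper: stores the unused bytes of each
// next() call instead of throwing them away.
struct BufferingGenerator<Base: RandomNumberGenerator>: RandomNumberGenerator {
    var base: Base
    private var buffer: UInt64 = 0
    private var bytesAvailable = 0

    init(base: Base) { self.base = base }

    mutating func next() -> UInt64 {
        // Full-width requests bypass the buffer in this sketch.
        return base.next()
    }

    mutating func nextByte() -> UInt8 {
        if bytesAvailable == 0 {
            buffer = base.next()
            bytesAvailable = 8
        }
        defer {
            buffer >>= 8
            bytesAvailable -= 1
        }
        return UInt8(truncatingIfNeeded: buffer)
    }
}

var g = BufferingGenerator(base: FixedGenerator())
let bytes = (0 ..< 8).map { _ in g.nextByte() }
print(bytes) // [0, 1, 2, 3, 4, 5, 6, 7] — one next() call served all eight bytes
```

With buffering, eight byte requests consume a single next() call, whereas the current implementation consumes eight, which is exactly why a change would alter the values seen for a given seed.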

In any case, I think the results will be repeatable, given the same seed and sequence of API calls (even when 56 of those 64 bytes are thrown away).

I mean that if, in the future, four sequentially requested bytes — from a specifically seeded PRNG sequence — come from byte offsets 0, 1, 2, 3 instead of (as is currently the case) byte offsets 0, 8, 16, 24, then the result will have changed.

That is, throwing away bytes (as in the current implementation) will of course give a different result (for the same code, same seed, etc) than keeping and using all bytes. The generator would have to generate only a single UInt64 value (compared to the current 8) for eight requested bytes, if no bytes were thrown away.

Edit: Oh, I missed this:

: ) Thanks!

This could be a more general question.

When is the implementation of SE-0202: Random Unification allowed to change?

  • whenever it needs to?
  • major Swift versions only?

For example, I think @scanon is planning to reimplement the BinaryFloatingPoint.random(in:using:) methods.

Double.random(in: -.greatestFiniteMagnitude ... +.greatestFiniteMagnitude)
// $R0: Double = +Inf

Double.random(in: -.greatestFiniteMagnitude ..< +.infinity)
// $R1: Double = +Inf

If this is fixed in Swift 4.2.1, then results from Swift 4.2.0 might not be repeatable.


In the case where you use a custom PRNG and a seed, the relevant question is how that PRNG is implemented, right?

I think we have to let it change when it needs to change. The vast majority of users who ask for a random number need it to be actually random far more urgently than they need it to reproduce the same bugs as the previous point release. If someone really needs to preserve the bugs in an old implementation more than they need the bugs fixed, we're open source—they can always look up the old, broken implementation and rename it to oldBrokenRandom(in:using:).

(Edit: Yes, assuming that the PRNG implementation can provide custom implementations of the current _fill(bytes:) method and of any similar methods that transform the generator's output and might be subject to change. This is not currently the case, but as @benrimmington noted, it might become possible in the future for _fill(bytes:); and there are also others, like BinaryFloatingPoint.random(in:using:).)

No, I'm of course assuming that the PRNG we are talking about is correctly implemented, i.e. that its output is repeatable and completely determined by a given seed, as is true of any PRNG. I have already explained what I mean above, but here is an example program to make it clearer:

// For the purpose of this demonstration, this will just generate a sequence
// of bytes in increasing order (wrapping back to zero after 255). Not very
// random, I know, but it's only to make it easier to see my point, it could
// of course be any PRNG.
struct MyPrng: RandomNumberGenerator {
    var currentByteValue = 0 as UInt8
    mutating func next() -> UInt64 {
        var ui64 = 0 as UInt64
        for i in 0 ..< 8 {
            ui64 |= UInt64(truncatingIfNeeded: currentByteValue) &<< (i * 8)
            currentByteValue &+= 1
        }
        return ui64
    }
}

var prng = MyPrng()

let demoActualBytesGenerated = false

if demoActualBytesGenerated {
    // This will print the first eight bytes actually generated by this prng:
    var firstUInt64 = prng.next()
    withUnsafeBytes(of: &firstUInt64) { byteBuf in
        let bytes = byteBuf.map { $0 }
        print(bytes) // [0, 1, 2, 3, 4, 5, 6, 7]
    }
} else {
    // If the implementation used every generated byte, the following would
    // print [0, 1, 2, 3, 4, 5, 6, 7], but since the current implementation
    // generates a new UInt64 (8 bytes) for each requested byte, using only
    // the first and throwing away the remaining 7 bytes, it will print
    // [0, 8, 16, 24, 32, 40, 48, 56]:
    var bytes = [UInt8]()
    for _ in 0 ..< 8 {
        bytes.append(UInt8.random(in: 0 ... UInt8.max, using: &prng))
    }
    print(bytes) // [0, 8, 16, 24, 32, 40, 48, 56]
}

So, the point is that if the Random API implementation should change (so that it used every byte instead of throwing some away), then (using the exact same PRNG, seeded with the exact same seed) I would suddenly get [0, 1, 2, 3, 4, 5, 6, 7] instead of the [0, 8, 16, 24, 32, 40, 48, 56] which I get with the current implementation.

A practical example where this is relevant:

Someone could have written a game with procedurally generated levels, so that each level is described using only a PRNG seed value (the developer has examined millions of levels/seeds, and selected some good ones, and arranged them in order of increasing difficulty). Now, if the Random API implementation changed, the levels would be completely different, even though the code of the game (including the seeds for the levels) has not changed.
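As a hypothetical sketch of that game scenario (the PRNG, function names and level encoding below are all made up for illustration): a level is fully determined by a seed, but only as long as the stdlib's derivation of narrower values from next() stays fixed.

```swift
// Illustrative seeded PRNG using the well-known SplitMix64 mixing
// function; any deterministic, seedable generator would do.
struct LevelPrng: RandomNumberGenerator {
    var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Hypothetical level generator: each tile is derived via the stdlib API,
// so the resulting level depends on how random(in:using:) consumes the
// generator's raw output — not just on the seed.
func makeLevel(seed: UInt64, tiles: Int) -> [UInt8] {
    var prng = LevelPrng(seed: seed)
    return (0 ..< tiles).map { _ in UInt8.random(in: 0 ... 3, using: &prng) }
}

// Same seed, same Swift version: identical level.
let level = makeLevel(seed: 12345, tiles: 8)
// But if a future Swift changed how random(in:using:) draws from next(),
// the same seed would describe a different level.
```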

This is of course only an issue if the game uses methods like:

UIntX.random(in:using:)
Float.random(in:using:)

Ie, any method that ends up calling:

extension RandomNumberGenerator {
    public mutating func _fill(bytes buffer: UnsafeMutableRawBufferPointer) {
        // ...
    }
}

Or some similar Random API which transforms the generated raw bits into some value of some type, and whose implementation might change.

But since avoiding all such methods essentially means avoiding the whole Random API, I think this is a very relevant question:

Note that I'm all for changing stuff whenever it needs to change, as long as people know that this is the case (and that, e.g., their PRNG-based procedurally generated levels might change with any new version of Swift). Knowing this, it would probably be wise for them to roll their own small, separate random API in order to have full control over these things.

The general advice would be: If you depend on repeatable results using your PRNG, don't implement it as part of Swift's Random API, roll your own on the side.


That sort of defeats the purpose of having the standard API though.

I think there's been a lot of misinterpretation of the original post here, or I'm misinterpreting the responses. If I may rephrase it, it's asking if the implementation might change so that, even though the sequence of UInt64 values returned from the PRNG is the same, the sequence of UInt8 (and similar) generated from this UInt64 sequence is different.

The concern implied in the post is that there might be some sort of buffering added, so that the unused bits could be returned in future calls (e.g. generating one UInt64 value then returning it byte-by-byte for the next 8 calls for a random UInt8). But an even simpler concern would be that the way truncation is done might change, e.g. from returning the least significant byte to returning the most significant byte. Without a documented guarantee in this area, you can't rely on reproducing anything other than the UInt64 sequence for a given seed, making this API very tricky to use correctly with PRNGs, as @Jens notes.
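The truncation concern can be made concrete with a sketch (neither choice below is claimed to be what the stdlib does in any particular version; the point is that the choice is undocumented):

```swift
// The same raw UInt64 output yields different UInt8 values depending on
// which byte an implementation keeps. Without a documented guarantee,
// code cannot rely on either choice across Swift versions.
let raw: UInt64 = 0x1122334455667788

let leastSignificant = UInt8(truncatingIfNeeded: raw)       // keeps 0x88
let mostSignificant  = UInt8(truncatingIfNeeded: raw >> 56) // keeps 0x11

print(leastSignificant, mostSignificant) // 136 17
```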


That's exactly what I mean. Thanks for helping me rephrase it!

Once again, I'd like to note that I am in favor of changing the Random API "whenever it needs to", at least until it's more mature. I only think that this has to be decided and communicated in such a way that everyone using the API knows it.