Introduction
I propose adding a new method to SystemRandomNumberGenerator: randomBytes(count:), which will enable generating an arbitrary number of random bytes in a single call. This is a minor quality-of-life pitch that can provide a small but meaningful performance improvement to randomness-heavy code.
Motivation
SE-0202 introduced the RandomNumberGenerator protocol and provided a single standard library conforming type: SystemRandomNumberGenerator. This SystemRandomNumberGenerator implementation was deliberately designed to be cryptographically secure, allowing it to be used as a source of randomness when working in cryptographically sensitive contexts.
While this is a great first step, the implementation as defined has one major deficiency when used in cryptographic contexts: the only mechanism to obtain random numbers caps the maximum amount of randomness that can be extracted in one call at 64 bits. This is because the entire RandomNumberGenerator protocol is:
public protocol RandomNumberGenerator {
    mutating func next() -> UInt64
}
Unfortunately, it is vanishingly rare that 64 bits of randomness are sufficient in a cryptographic context. When generating random keys or random secrets, the absolute minimum number of bits necessary in almost any case is 128, and frequently 256 bits are needed. This necessitates multiple calls to next() in order to get the full quantity of random bytes. As those bytes are often required to be in a form that can be passed to a C library or converted to a large integer type, it is also quite common to need to pass these integers through a raw pointer type, requiring awkward pointer management.
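To make the awkwardness concrete, here is a minimal sketch of what collecting 256 bits looks like today. The variable names are mine and purely illustrative; only next() and the standard Array pointer APIs are used.

```swift
// Sketch of the status quo: 256 bits of key material via next().
var systemGenerator = SystemRandomNumberGenerator()

// Four separate calls to next() are needed, 64 bits at a time.
let words: [UInt64] = (0..<4).map { _ in systemGenerator.next() }

// Reinterpret the four words as 32 raw bytes. This is the pointer
// management step that a byte-oriented API would remove.
let keyBytes: [UInt8] = words.withUnsafeBytes { Array($0) }
```

On the platforms where each call to next() is backed by a syscall, this is four kernel round trips for one key.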
In addition to the above minor awkwardness, there is a performance cost incurred here on many platforms. SystemRandomNumberGenerator uses:
- arc4random_buf on Apple platforms
- getrandom when available on Linux
- getentropy on a grab bag of platforms
- /dev/urandom on both Linux and the grab bag in cases when getrandom or getentropy aren't available.
On Apple platforms the extra calls matter less, as arc4random_buf is provided by libc. However, getrandom and getentropy are syscalls, necessitating a userspace->kernel context switch for each call. /dev/urandom requires an amortised one syscall per call (the open is amortised across the runtime of the program). Any path that requires syscalls may naturally need more of them if a call is interrupted by a signal.
As cryptographic code is already computationally fairly expensive, requiring multiple syscalls in this hot CPU path is less than ideal. Given that these lower level APIs are quite capable of returning us more than 8 bytes at a time (and in fact the underlying Swift implementation is written for an arbitrary quantity of randomness), I propose we provide an alternative path when more randomness is required.
Proposed Solution
Extend SystemRandomNumberGenerator with a new function.
extension SystemRandomNumberGenerator {
    /// Provides `count` random bytes from the system
    /// random number generator.
    func randomBytes(count: Int) -> [UInt8]
}
This function would return an Array containing count random bytes. In principle count is unlimited, though in practice the maximum value is constrained by the various system APIs. A reasonable upper bound for a single call is UInt32.max, as that covers the platforms with the most severe restrictions (Windows can only do 2^32 bits in one go). This proposal would suggest baking that limitation into the API documentation, as once you're asking for 2^32 bytes of randomness the cost of the syscall starts being less than the cost of shuffling memory around.
The implementation of this function is trivial, as the currently existing swift_stdlib_random function already supports this use-case.
Effect on ABI stability
None.
Alternatives Considered
Extending RandomNumberGenerator
In principle providing a whole buffer full of randomness is useful in many other contexts than cryptographic code. This is particularly true when trying to do things with a fast, non-CS PRNG, e.g. for simulations. This would be a motivation to extend the entire RandomNumberGenerator protocol.
However, extending protocols is tricky, and I don't fully understand the ABI implications here. We could reduce the API cost to nothing by providing a default implementation in terms of next(), of course, but I am unsure of the effects on ABI stability. In order to reduce the scope of this proposal, therefore, I have decided not to propose extending the entire protocol. If the community believes we both can and should do that, I believe it's a fairly straightforward extension of this pitch.
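For the sake of discussion, a default implementation in terms of next() might look roughly like the sketch below. This is an illustration only, not a proposed design: the method lives in a plain protocol extension (so it is not a customisation point), and SplitMix64 is just a stand-in for a fast, non-CS PRNG that I have included to show the method working with a generator other than the system one.

```swift
extension RandomNumberGenerator {
    // Hypothetical default: builds `count` bytes from repeated next() calls.
    mutating func randomBytes(count: Int) -> [UInt8] {
        precondition(count >= 0, "count must be non-negative")
        var bytes = [UInt8]()
        bytes.reserveCapacity(count)
        while bytes.count < count {
            var word = next()
            withUnsafeBytes(of: &word) { raw in
                // Take only as many bytes as are still needed.
                bytes.append(contentsOf: raw.prefix(count - bytes.count))
            }
        }
        return bytes
    }
}

// Stand-in non-cryptographic generator (SplitMix64), for illustration.
struct SplitMix64: RandomNumberGenerator {
    var state: UInt64
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

var simulationGenerator = SplitMix64(state: 42)
let simulationBytes = simulationGenerator.randomBytes(count: 20)
```

A real protocol requirement would let SystemRandomNumberGenerator override this with a single call into the system API, which is where the ABI question above comes in.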
Writing bytes into buffers
One potential cost that can still be incurred here is the cost of a memory copy. It would be nice to be able to ask the system random number generator to write the random bytes into a user-provided buffer of bytes. This improves the performance story further when those random bytes are going to be manipulated by some other library.
However, as SE-0256 (MutableContiguousCollection) was rejected, there is no clear abstraction for what the type would need to be. In principle RangeReplaceableCollection is the right choice, but that is not substantially cheaper than users writing the code themselves as it will still necessitate a memory copy.
The only alternative, without reintroducing SE-0256, would be for this API to accept an UnsafeMutableBufferPointer. The performance gain from this is probably not worth creating such a prominent unsafe API, so until or unless something like SE-0256 lands, or a compelling performance case is made for adding this unsafe API, I elected to hold off for now.
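For reference, the unsafe variant I am holding off on could be shaped something like the following sketch. The name fill(_:) is hypothetical, and the body goes through next() only so that the example is self-contained; a real implementation would presumably hand the buffer straight to the system API instead.

```swift
extension RandomNumberGenerator {
    // Hypothetical: writes random bytes directly into caller-owned memory.
    mutating func fill(_ buffer: UnsafeMutableBufferPointer<UInt8>) {
        var offset = 0
        while offset < buffer.count {
            var word = next()
            // The final chunk may be shorter than 8 bytes.
            let chunk = Swift.min(MemoryLayout<UInt64>.size, buffer.count - offset)
            withUnsafeBytes(of: &word) { raw in
                for i in 0..<chunk {
                    buffer[offset + i] = raw[i]
                }
            }
            offset += chunk
        }
    }
}

// Usage: fill an existing array in place, with no intermediate array.
var secret = [UInt8](repeating: 0, count: 32)
var fillGenerator = SystemRandomNumberGenerator()
secret.withUnsafeMutableBufferPointer { fillGenerator.fill($0) }
```

This avoids the extra copy, but every caller has to go through withUnsafeMutableBufferPointer, which is exactly the kind of prominent unsafe surface I would rather not add without a compelling case.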