Random Data(): UInt8.random or SecRandomCopyBytes

tl;dr I want Data.random(length:) but it isn't in Foundation. Looking around most of the solutions use SecRandomCopyBytes but that seems unSwifty.

What do you think of this pure Swift implementation?

extension Data {
    /// Returns cryptographically secure random data.
    ///
    /// - Parameter length: Length of the data in bytes.
    /// - Returns: Generated data of the specified length.
    static func random(length: Int) -> Data {
        return Data((0 ..< length).map { _ in UInt8.random(in: UInt8.min ... UInt8.max) })
    }
}

For anyone interested, I tracked down SecRandomCopyBytes to here SecFramework.c:

int SecRandomCopyBytes(SecRandomRef rnd, size_t count, uint8_t *bytes) {
    if (rnd != kSecRandomDefault)
        return errSecParam;
    pthread_once(&kSecDevRandomOpen, SecDevRandomOpen);
    if (kSecRandomFD < 0)
        return -1;
    while (count) {
        ssize_t bytes_read = read(kSecRandomFD, bytes, count);
        if (bytes_read == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (bytes_read == 0) {
            return -1;
        }
        bytes += bytes_read;
        count -= bytes_read;
    }
    return 0;
}

UInt8.random(in:) uses the SystemRandomNumberGenerator which isn’t guaranteed to be cryptographically secure. So your implementation is fine, but the documentation is wrong – the function cannot be known to return cryptographically secure data. If you need that, use SecRandomCopyBytes instead.
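For reference, calling SecRandomCopyBytes directly, as suggested above, might look roughly like the sketch below. The names Data.secureRandom(length:) and SecureRandomError are made up for this example and are not Foundation or Security API. The sketch assumes that throwing is an acceptable way to surface a failure, and it simply reports "unavailable" on platforms without the Security framework.

```swift
import Foundation
#if canImport(Security)
import Security
#endif

/// Hypothetical error type for this sketch (not part of any framework).
enum SecureRandomError: Error {
    case generatorFailed(status: Int32)
    case unavailable
}

extension Data {
    /// Hypothetical helper: fills a buffer with SecRandomCopyBytes in one call.
    static func secureRandom(length: Int) throws -> Data {
        precondition(length >= 0, "length must be non-negative")
        guard length > 0 else { return Data() }
        #if canImport(Security)
        var bytes = [UInt8](repeating: 0, count: length)
        // One bulk call instead of one generator call per byte.
        let status = SecRandomCopyBytes(kSecRandomDefault, length, &bytes)
        guard status == errSecSuccess else {
            throw SecureRandomError.generatorFailed(status: status)
        }
        return Data(bytes)
        #else
        // Assumption: no Security framework means no SecRandomCopyBytes.
        throw SecureRandomError.unavailable
        #endif
    }
}
```

This is also the one variant where a throwing signature earns its keep, since SecRandomCopyBytes reports a status code.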

From the SystemRandomNumberGenerator page (SystemRandomNumberGenerator | Apple Developer Documentation):

[...] uses a cryptographically secure algorithm whenever possible.

Eep!

Could something like the following be a good way to bridge the gap for now?

#if canImport(Security)
import Foundation
import Security
struct SecRandomNumberGenerator: RandomNumberGenerator {
    func next() -> UInt64 {
        let size = MemoryLayout<UInt64>.size
        var data = Data(count: size)
        return data.withUnsafeMutableBytes {
            guard 0 == SecRandomCopyBytes(kSecRandomDefault, size, $0.baseAddress!) else { fatalError() }
            return $0.load(as: UInt64.self)
        }
    }
}
#endif

The Data extension would then be:

extension Data {
    /// Returns cryptographically secure random data.
    ///
    /// - Parameter length: Length of the data in bytes.
    /// - Returns: Generated data of the specified length.
    static func random(length: Int) -> Data {
        var randomNumberGenerator = SecRandomNumberGenerator()
        return Data((0 ..< length).map { _ in UInt8.random(in: UInt8.min ... UInt8.max, using: &randomNumberGenerator) })
    }
}

You can also skip the intermediate creation of a Data instance and write random bytes into a UInt64 directly:

struct SecRandomNumberGenerator: RandomNumberGenerator {
    func next() -> UInt64 {
        var bytes: UInt64 = 0
        let result = withUnsafeMutableBytes(of: &bytes, { buffer in
            SecRandomCopyBytes(kSecRandomDefault, buffer.count, buffer.baseAddress!)
        })
        
        guard result == errSecSuccess else {
            // Figure out how you'd prefer to deal with this.
            fatalError()
        }
        
        return bytes
    }
}

In general, if you require cryptographically secure random data, Apple platforms have a few options:

  • The getentropy(2) syscall which grabs random bytes directly from the kernel CSPRNG
  • Reading bytes from /dev/random, which is seeded directly by the same kernel CSPRNG
  • arc4random(3)/arc4random_buf (sufficient to import Darwin/import CoreFoundation/import Foundation)
  • CCRandomGenerateBytes() (import CommonCrypto)
  • SecRandomCopyBytes() (import Security/import CoreFoundation/import Foundation)

From the getentropy(2) man page:

However, it should be noted that getentropy() is primarily intended for use in the construction and seeding of userspace PRNGs like arc4random(3) or CC_crypto(3). Clients who simply require random data should use arc4random(3), CCRandomGenerateBytes() from CC_crypto(3), or SecRandomCopyBytes() from the Security framework instead of getentropy() or random(4)

The latter 3 approaches are recommended. And although this man page doesn't call it out directly, the man page for arc4random(3) states that

These functions use a cryptographic pseudo-random number generator to generate high quality random bytes very quickly.

In essence, arc4random_buf, CCRandomGenerateBytes and SecRandomCopyBytes will all look pretty identical to use, so it's up to you.
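To illustrate how similar these calls look, here is a rough sketch of the arc4random_buf variant. The function name arc4RandomBytes(count:) is made up for this example. The non-Darwin branch is only an assumption so that the snippet compiles elsewhere; it falls back to the system generator rather than to arc4random_buf.

```swift
import Foundation

/// Hypothetical helper: fills `count` bytes in a single call.
func arc4RandomBytes(count: Int) -> [UInt8] {
    precondition(count >= 0, "count must be non-negative")
    var bytes = [UInt8](repeating: 0, count: count)
    #if canImport(Darwin)
    // arc4random_buf returns void, so there is no status to check.
    arc4random_buf(&bytes, count)
    #else
    // Assumption for this sketch: use the system generator off Apple platforms.
    var generator = SystemRandomNumberGenerator()
    for index in bytes.indices {
        bytes[index] = UInt8.random(in: .min ... .max, using: &generator)
    }
    #endif
    return bytes
}

let token = arc4RandomBytes(count: 16)
print(token.count)
```

The practical difference from SecRandomCopyBytes and CCRandomGenerateBytes is that those two return a status the caller is expected to check.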

Of course, all of this is non-portable, so if you only care about Apple platforms anyway, you can stick to SystemRandomNumberGenerator, which is guaranteed to produce cryptographically secure random data via arc4random_buf(3).

Is that really advisable? The standard library isn’t compiled with your binary on Apple platforms, so technically there’s nothing stopping the implementation of SystemRandomNumberGenerator from changing under your feet, is there? (From what I could see the implementation isn’t always inlined.)

It depends entirely on the specifics of your need for cryptographically-secure randomness.

  • If you're worried that the stdlib will change its implementation in the future to an implementation which is not cryptographically secure: this guarantee isn't something that can reasonably be retracted. See, for instance, the stdlib hesitancy to mark its implementation of sort() as a stable sort:

    /// The sorting algorithm is not guaranteed to be stable. A stable sort
    /// preserves the relative order of elements that compare equal.

    In practice, the stdlib sorting algorithm in use (TimSort) has been a stable one for years now, and there's been much discussion in the past about making this guarantee — but, once this guarantee is made, you can't take it back, because clients will rely on this fact. (And, it's entirely possible that clients already implicitly rely on this fact, even if explicitly documented otherwise, because it's been true in practice for so long.)

    OTOH, the cryptographic security of SystemRandomNumberGenerator is documented and advertised, so going back on this API promise would be a pretty big deal. Could it, in theory? Sure, it's possible, but a situation that would warrant needing to change SystemRandomNumberGenerator from an already highly-performant cryptographically secure implementation to one that isn't cryptographically secure seems exceptional.

  • If you're worried about an attack vector that replaces the stdlib underneath you via dynamic linking or similar such that you can no longer rely on the results of SystemRandomNumberGenerator being cryptographically secure, then yes, you may want to rely on your own random generator. (Though in this case, you may have other concerns too.)

Either way, if you're really concerned, it doesn't take much to roll your own generator; there just shouldn't be a need to.

This sentence from the SystemRandomNumberGenerator documentation is concerning:

[...] uses a cryptographically secure algorithm whenever possible.

It's just not guaranteed :frowning:

I think "whenever possible" is a per-platform guarantee, not a "sometimes yes sometimes no". I think the section below that is relevant:

Platform Implementation of SystemRandomNumberGenerator

While the system generator is automatically seeded and thread-safe on every platform, the cryptographic quality of the stream of random data produced by the generator may vary. For more detail, see the documentation for the APIs used by each platform.

  • Apple platforms use arc4random_buf(3).
  • Linux platforms use getrandom(2) when available; otherwise, they read from /dev/urandom.
  • Windows uses BCryptGenRandom.

  • arc4random_buf(3) is cryptographically secure
  • BCryptGenRandom is cryptographically secure, as best I can tell from the documentation
  • getrandom(2) is cryptographically secure
    • If getrandom(2) is unavailable, the generator reads from /dev/urandom, which on Linux is supposed to produce cryptographically-secure data

Random number generation is much more complicated on Linux than on macOS and Windows, with a lot of back-and-forth historically between various APIs. If a Linux distro neither offers getrandom(2) nor produces cryptographically-secure data from /dev/urandom, then yes, SystemRandomNumberGenerator might not produce cryptographically-secure results. In practice, on most well-behaved platforms which can guarantee a CSPRNG, a CSPRNG will be used; the promise here wavers only for Linux, for historical reasons.

However, it depends entirely on your needs. If you're targeting Apple platforms exclusively, then my above explanation applies: given that this generator is publicly-documented to use an already performant CSPRNG, it seems exceptional to need to change to a different generator which isn't cryptographically secure; but, if this definition doesn't meet your needs, then by all means, roll your own.

I think I’ve always interpreted the wording as less normative and more descriptive (“here’s what the implementation does today on these platforms, but that’s subject to change”). But you’re probably right that in practice, the implementation is extremely unlikely to change, Hyrum’s law and all.

This is a little bit misleading; SecRandomCopyBytes is platform-specific, and SRNG is guaranteed to be a cryptographically secure PRNG on the platforms that have SecRandomCopyBytes (and other existing targets as well; it would only not be expected to be a CSPRNG on a platform where there is no system-provided CSPRNG available).

This policy is emphatically not subject to change.

I should note, however, that there are large performance advantages to using SecRandomCopyBytes or arc4random_buf instead of repeatedly calling random(in: .min ... .max); this is something that I expect we will fix by adding a bulk-random API on SRNG and a fill-array method, but I haven't had a chance to walk that through evolution yet.
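Until such a bulk API exists, one way to cut down on generator calls is to draw 64-bit words and split each into bytes. The sketch below is only an illustration of that idea: Data.random(length:using:) is a made-up name, not the API being proposed, and no speedup has been measured here.

```swift
import Foundation

extension Data {
    /// Hypothetical helper: one generator call per eight bytes
    /// instead of one call per byte.
    static func random<G: RandomNumberGenerator>(length: Int, using generator: inout G) -> Data {
        precondition(length >= 0, "length must be non-negative")
        var bytes = [UInt8]()
        bytes.reserveCapacity(length)
        while bytes.count < length {
            // Each draw yields eight bytes; the last draw may be used only partly.
            var word: UInt64 = generator.next()
            let needed = Swift.min(8, length - bytes.count)
            for _ in 0 ..< needed {
                bytes.append(UInt8(truncatingIfNeeded: word))
                word >>= 8
            }
        }
        return Data(bytes)
    }
}

var systemGenerator = SystemRandomNumberGenerator()
let nonce = Data.random(length: 12, using: &systemGenerator)
print(nonce.count)
```

A direct call to SecRandomCopyBytes or arc4random_buf still fills the whole buffer in one go, so this only narrows the gap for code that wants to stay generic over RandomNumberGenerator.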

That is definitely a stronger guarantee than how I read the docs! :+1:

Has there been any talk of adding a SecureRandomNumberGenerator with a failable initializer that returns nil on those platforms where there is no system CSPRNG?

@Karoy_Lorentey and I have discussed this a little bit. One problem is that when you use SRNG, you generally do not explicitly call the initializer; e.g. most uses look like: Int8.random(in: 0 ... 10), where there is no option to fall back on something else. I think that this API design problem is solvable, but will require some thought.
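To make the idea concrete, a failable wrapper could be sketched along these lines. SecureRandomNumberGenerator is purely hypothetical and does not exist in the standard library. The platform check is an assumption based on the platforms the documentation lists, and on Linux it inherits whatever getrandom(2) or /dev/urandom provides.

```swift
/// Hypothetical type; nothing like this exists in the standard library.
struct SecureRandomNumberGenerator: RandomNumberGenerator {
    private var base = SystemRandomNumberGenerator()

    /// Returns nil on platforms where this sketch does not know of a system CSPRNG.
    init?() {
        #if canImport(Darwin) || os(Linux) || os(Windows)
        // Assumption: these are the platforms documented as using
        // arc4random_buf, getrandom or /dev/urandom, and BCryptGenRandom.
        #else
        return nil
        #endif
    }

    mutating func next() -> UInt64 {
        return base.next()
    }
}

// The caller has to opt in explicitly, which is the design problem described above.
if var generator = SecureRandomNumberGenerator() {
    let roll = Int8.random(in: 0 ... 10, using: &generator)
    print(roll)
} else {
    print("no system CSPRNG available")
}
```

Call sites that use the implicit form, Int8.random(in: 0 ... 10), never see the initializer, so they would not benefit from this without further API changes.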

I did consider doing the following but decided I didn't want to write a library:

extension Array where Element: FixedWidthInteger {
    public static func random<T>(count: Int, in range: Range<Element>, using generator: inout T) -> [Element] where T : RandomNumberGenerator
    ...
}

It would be a nice addition to the standard library, but not my repo.
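For what it's worth, a minimal body for that signature could look like the sketch below. It is just one possible implementation, calling Element.random(in:using:) once per element, so it does not address the bulk-performance point raised earlier.

```swift
extension Array where Element: FixedWidthInteger {
    /// Sketch only: draws each element individually from `range`.
    /// As with Element.random(in:using:), `range` must not be empty.
    static func random<T>(count: Int, in range: Range<Element>, using generator: inout T) -> [Element] where T: RandomNumberGenerator {
        precondition(count >= 0, "count must be non-negative")
        var result = [Element]()
        result.reserveCapacity(count)
        for _ in 0 ..< count {
            result.append(Element.random(in: range, using: &generator))
        }
        return result
    }
}

var diceGenerator = SystemRandomNumberGenerator()
let rolls = [UInt8].random(count: 5, in: 1 ..< 7, using: &diceGenerator)
print(rolls)
```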

For anyone interested, I tracked down SecRandomCopyBytes to here …

FYI, that’s a really old implementation. It switched to Common Crypto a while back [1]. Here’s a link to the macOS 12.2 aligned version.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

[1] The furthest back I can easily check is 10.13, and it uses Common Crypto there.

I understand that arc4random_buf and CCRandomGenerateBytes call into the same corecrypto function, however, they don't handle that function's return value the same way.

arc4random_buf ignores the return value from corecrypto, while CCRandomGenerateBytes maps it to either kCCSuccess or kCCRNGFailure.

Is it safe for arc4random_buf to ignore corecrypto's return value?

@eskimo @scanon @itaiferber

Having not looked at the implementation at all — arc4random_buf has a void return type, and the arc4random functions are documented as

RETURN VALUES
     These functions are always successful, and no return value is reserved to indicate an error.

I'm not sure what, if anything, arc4random_buf would really be able to do on failure.