SE-0202: Random Unification

davedelong · April 13, 2018, 2:42am

Hooray!

benrimmington · April 13, 2018, 5:31am

Generators as reference vs value types

The problem with using value types is that it's easy to forget the inout modifier. Hopefully, this would be discovered during testing. But my preferences are:

RandomNumberGenerator should be a class protocol, or
Random (the default source) should be an open class.

If you need to copy or fork a generator, this could be achieved by making it Codable to hide the details. For example, two of the generators in GameplayKit use UInt64 seeds, but its ARC4 generator uses a Data seed.

(There's already a GameplayKit overlay where RandomNumberGenerator conformances could be added).

(Joe mentioned thread safety, but how does C++ ensure this with its <random> APIs? Swift is currently using a shared std::mt19937 value which isn't thread-safe AFAIK).

Static versus free functions

I like Brent's idea of making random(in:using:) a free function.

The random() function in <stdlib.h> is already unavailable in Swift, with the error message: "Use arc4random instead".

Jens · April 13, 2018, 7:25am

Another area which I think needs further discussion (unless I'm missing something) is that the proposal doesn't mention any protocol requirements concerning the state or seeding of PRNGs, ie:

// The proposal currently only has:
protocol RandomNumberGenerator : AnyObject {
    func next() -> UInt64
}

// But I think it should also have (or at least mention) something like:
protocol PseudoRandomNumberGenerator : RandomNumberGenerator {
    associatedtype State
    init?(state: State) // <-- Failable since eg Xoroshiro128+ can't have a state which is everywhere zero.
    init(seed: UInt64) // <-- All PRNG types must be seedable by an UInt64, and each unique UInt64 value must result in a unique state ... effectively putting a min 64 bit state requirement on supported PRNGs... this requires more discussion ...
}

While I agree that the default generator should be cryptographically secure (and a non-seedable RNG), I think many use cases will need seedable, stateful PRNGs, and I don't see the point of leaving the API for their state and seeding unrestricted and undiscussed.

If the idea is to separate the PRNG-part into an additional proposal, I think it would be better to discuss it now and include it in this proposal, in order to ensure that the RNG-part (having otherwise been designed in isolation) doesn't end up negatively impacting the design of the PRNG-part.

jamieb · April 13, 2018, 8:14am

I disagree with random() being a free function:

random(in: 0 ..< 10)
// vs
random(0 ..< 10)
// vs
Int.random(in: 0 ..< 10)

The last one just feels more swifty, and is more readable. As for discoverbaility, I do not think that would be an issue with having it as a static function on T; many common extensions to add random functionality add static methods to types, because this is nicer, and more readable syntax.

Not to mention, this naming would be inconsistent if a user implemented random functionality on their own types:

random(in: 0 ..< 10)
Data.random(byteCount: 10)
Rectangle.random()

One doesn’t fit in...

As for the “more natural to swift” argument; I completely disagree that it is more natural. The general sentiment around swift users is that the bottom one is more swifty...

[Removed the poll that was linked here, since it was unintentionally biased, and there were not that many replies]

+1 to renaming to .randomElement, no opposition of that.

Joe_Groff · April 13, 2018, 3:53pm

If the requirements on the protocol that generate values are mutating, you would almost definitely notice this at compile time, since you wouldn't be able to get random numbers from a non-inout argument.

That's an implementation detail used for the standard library's own purposes, and AIUI would not be used to implement the proposed API. If anything, the standard library could be updated to use its own randomness API once it has one instead of relying on C++ code.

Alejandro · April 13, 2018, 4:02pm

The correct option right now is to require generators be classes. However, this feels like a band-aid for the short term. Joe makes some great points in that the correct option for the long term is value type generators. I think developers who need to reach for RandomNumberGenerator would hopefully have knowledge over the difference between reference and value types. Building for the correct option now seems infinitely better over band-aiding the short term problem and living with it. Even if the APIs are based around value type generators, that shouldn't stop developers from wanting to implement RandomNumberGenerator for a class. I'm a big +1 to take inout generators instead of constraining them to classes now.

nnnnnnnn · April 13, 2018, 4:20pm

The current design has a thread-safe global default generator. Can we safely pass that as inout from multiple threads simultaneously? Or would just the fact of taking "write" ownership of that resource be an exclusivity violation? (Thread safety is managed inside the next() method, at the point of state mutation.)

Joe_Groff · April 13, 2018, 4:26pm

The global generator is effectively stateless, so you can always pass an independent "copy" of it into a generic function.

nnnnnnnn · April 13, 2018, 5:20pm

So the code for that would be something like this? A bit less convenient than the current way of calling, but hopefully this will mostly happen inside generator-parameter-less overloads. Not the end of the world.

var generator = Random.default
let myValue = random(1...10, using: &generator)

I'm nervous about the prospect of bugs caused by accidentally treading over the same section of random data when people are using PRNGs. But in running some tests to quantify the cost of indirection, it does look like we'd be limiting ourselves if we require generators to be classes:

arc4random (2 calls)             1.1841
arc4random_buf                   1.9345
Xorshift128Plus (struct)         0.1827
Xorshift128Plus (class)          0.8127
ThreadLocking<Xorshift128Plus>   1.3473
LinearCongruential (struct)      0.1819
LinearCongruential (class)       0.5484
ThreadLocking<LinearCongruenti   1.0569

Caveat: I may have built this test poorly. You can see the source of it here: randomTestingVC.swift · GitHub

Joe_Groff · April 13, 2018, 5:25pm

You could also implement Random.default as a computed property that always vends a fresh generator (since all such values are equivalent):

extension Random {
  var `default`: DefaultRandomNumberGenerator {
    get { return DefaultRandomNumberGenerator() }
    set { /*discard*/ } 
  }
}

Alejandro · April 13, 2018, 6:13pm

When I initially pitched this idea, my (very) early design made use of free functions. I remember during the pitch phase that many of the initial feedback was against free functions. Many people pitched their design ideas for getting a random number, some including (0 ..< 10).random(). I took this feedback and pitched Int.random(in: 0 ..< 10). From there, we discussed if this was necessary if we had (0 ..< 10).random(). Ultimately, I think we came into agreement that Collection.random() should return optional and that the static functions were justified to provide the non-failable API. I would have hoped that discussion over this topic would have happened during pitch phase so that we could have really ironed out the details, but from what I've heard during pitch phase and the review, many support the static functions. I disagree over the discoverability argument because based off a lot of feedback from pitch and review phase, many are reaching for this static function that doesn't exist.

This may be true on the Darwin platform, but for Linux this is not the case.

import Glibc // Or Foundation
let num = random()

Sure, this requires an import of Glibc (or Foundation) in order to reveal random(), but I would hate to see someone assume that is part of the family of functions that Swift is aiming to provide. The only real argument I see for the free functions is the term-of-art justification. While I can not disagree that this is precedent in many other languages, I can disagree that this is not the correct design for Swift. As I've seen in the pitch phase and most of the review phase, many are reaching for this static function. One of the goals of this proposal was to introduce a Swifty way of getting random numbers. I've seen many complaints that arc4random() or random() aren't completely Swifty. They don't feel like a part of the language even though they may be precedent in others.

I want to take time to discuss Bool.random() for a moment. The goal of this static method was to provide clarity at the call site. Many of the alternatives pitched Int.random(in: 0 ... 1) == 0 or even [true, false].randomElement()! are not initially clear to beginners what their purpose is. I believe for the experienced developer, this might come faster than others, but for those just beginning to program, these alternatives can be a head scratcher. Bool.random() is both clear at what it does, and feels Swifty. If the direction is to move to free functions, this imbalance of getting random numbers and random booleans feels incorrect. One could argue that the solution here is to add another free function, randomBool(). Sure, we could keep using this design for the most fundamental types in the stdlib, but this can very quickly set a precedent on the rest of the language. I would hate to see something like an import to Foundation quickly reveal many global random functions because they wanted to follow the precedent set by the free functions in the stdlib.

So yes, I still agree the static functions are the best spelling for Swift. I think they're just as discoverable as the free functions, more Swifty than the free functions, and sets a better precedent for the rest of the language than free functions.

I agree randomElement improves readability and reveals clarity at the call site.

I disagree with having some platforms be a secure RNG and others that can't provide this be an insecure RNG. I agree with others that the best solution here is to mark platforms that we know can't provide a secure number as unavailable.

jarod · April 13, 2018, 6:16pm

Just want to +1 the static version of random instead of the free function. A free function feels to me like an imported C API, while the static version feels like it follows typical Swift conventions. If we do go with the free function, I'd at least like to keep the in label so that it reads well.

pvieito · April 14, 2018, 2:33pm

Generators as reference vs value types

I think we should require Generators to be classes. It simplifies calling the functions and I don't see any clear advantage of using struct Generators.

Static versus free functions

I have mixed feeling about this one.

Type.random(in:) feels more swifty for me.
random(in: Range<T>) seems more easy to find for all types.

In any case I would maintain the argument label.

Naming of Collection.random

I agree that we should name it Collection.randomElement and it should return an optional value.

benrimmington · April 15, 2018, 5:49am

@Joe_Groff, would the move-only approach cause any problems for large generators?

e.g. sizeof(std::knuth_b) is 1032 bytes
e.g. sizeof(std::mt19937) is 2504 bytes

jamieb · April 15, 2018, 11:25am

Surely we should be naming new API's in accordance to their readability and swiftness, rather than discoverability? I didn't understand why using a free function because it may be easier to find is even a valid point from the team...

Panajev · April 15, 2018, 11:28am

I swear if I hear Swifty/Swiftiness one more time... lol.

I know it is wordier, but “consistent with the language direction” sounds much less like a “(no)True Scotsman” argument to me ;).

https://en.wikipedia.org/wiki/No_true_Scotsman

jamieb · April 15, 2018, 11:33am

Fair enough ¯\_(ツ)_/¯.

It's the same thing to me.

Joe_Groff · April 16, 2018, 4:10pm

That's a valid concern. While we've been making improvements to the compiler to reduce the number of implicit copies it performs on values, it's still far from ideal. I think that the fact that a value type random number generator's primary APIs would be mutating, and therefore RNGs are likely to be primarily passed around inout, would mitigate that concern somewhat, since passing around an inout argument is far less likely to induce implicit copies. Absent move-only value enforcement, large stateful RNGs could be COW-boxed to reduce the cost of such implicit copies as well.

Jon_Hull · April 17, 2018, 2:59am

It sounds to me like requiring it to be a class prevents a bunch of problems. I'm not sure about move-only though.

I am strongly in favor of static functions because they are more extensible. Types (e.g. data) which want to add their own random methods can mirror the basic .random(in: ) form. We really don't want things like colors polluting the main free function namespace.

Also, users will have learned to look for static functions named random when looking on additional types. I feel like that is an easier to learn pattern overall. If some are free functions and some are static functions, that will obviously be a point of confusion. If they are all free functions, it will pollute the namespace.

I get that free functions work well for the basics, but we also need to consider how people will build upon the system. This proposal is meant to be the foundation of our random system, to be built on by third parties.

felix91gr · April 17, 2018, 4:46am

I agree. Static functions are better for this use case.