SE-0202: Random Unification

Is there any reason why stateless generators need to become classes? Random?

1 Like

The stateless generators should largely be the default ones inside the standard library. The syntactical overhead of making the stateless RNG's into classes should be made up for by the ease of use elsewhere. Do remember that they would have still been at least structs otherwise, they wouldn't have been free functions.

Wow. Well said, I was going to argue in favor of a PRNG but you convinced me. They should probably be left for external libraries to provide.

1 Like

What is your evaluation of the proposal?

9/10. I think it’s a step in the right direction!

Is the problem being addressed significant enough to warrant a change to Swift?

Yes.

Does this proposal fit well with the feel and direction of Swift?

I can’t say. I don’t have enough API experience with Swift. But crypto randomness is something that fits Swift’s correctness philosophy

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I mostly care about three things here: security, thread-safety and uniformity of implementation. This has two of those points. I think it fares well against other libraries on the same level.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Quick reading.


Fwiw, I’d like to see this implemented directly in the Stdlib. I’d like to see something as important as Randomness to be provided directly from Swift. I don’t like the idea of having the properties of the Random Engine change with the platform it’s run in.

Swift is very fast. Granted, not as fast as it could be with ownership semantics, but that’s coming sooner or later. We could replace the C random libraries with Swift libraries, and go beyond in features and correctness.

But that’s probably better left for a future proposal.

Implementing a whole Random suite would take a lot of work. On the other hand, this proposal is great because it fixes most of the current shortcomings that we need fixed today.

A Swift-based engine can wait. We need to take this step first, and I’m happy with that.

1 Like

I have another question: why is next(upperBound:) implemented using modulo instead of multipliedFullWidth(by: upperBound).high? Is there some issue with using a full width multiplication?

1 Like

This justification should be in the proposal honestly. This is great motivation as to why we cannot go 'all in' on ranges.

I am definitely in support of the proposal.

I believe the Swift standard library should offer the base protocols that have the ability to cover most use cases of randomness when extended. I also think that there should be a single RandomNumberGenerator provided for people to use as the default. This generator should not be able to seeded. Extra generators and distributions etcetera should be provided in a third-party module that may eventually become a sanctioned first-party non-standard library.

Luckily, my beliefs line up with the proposal as written, and I therefore have no blocking objections.

The only thing I would say is that Random.default is not a good name for the standard library provided generator. I think it should be DefaultRandomNumberGenerator or SystemRandomNumberGenerator or similar, with the property called shared.

5 Likes

That sort of implementation detail is not actually relevant to Swift Evolution. The proposal does not specify the algorithm, so it can always be updated in the future if improvements are found.

…that said, I just looked at the implementation for next(upperBound:) and I’m pretty sure it does not work as intended.

The current code
public func next<T: FixedWidthInteger & UnsignedInteger>(upperBound: T) -> T {
  let range = T.max % upperBound
  var random: T = 0
  
  repeat {
    random = self.next()
  } while random < range
  
  return random % upperBound
}

The line “let range = T.max % upperBound” really ought to be something like:

let temp = T.max % upperBound + 1
let range = (temp == upperBound) ? 0 : temp

I was only asking because I always use multipliedFullWidth in mine, and I was worried I was missing a drawback. It was a bit off-topic though, so sorry about that.

I like it overall.

Here are a few things I would like to see added:

  • An additional protocol for seedable/repeatable PRNGs that adds the ability to seed, store and restore a generator's state. That way we can build on them even if we don't know which one is being used.
  • More definition on the expected memory model. e.g. Are the using: parameters inout?
  • The variants of next defined in the protocol itself so they can be overridden by a particular implementation (From the comments, I believe this was the intent)
  • collection.randomElement() would be slightly clearer than collection.random()
  • Bool.random(oneIn: 1_000_000) //Returns true one out of the given UInt times.

Yes, definitely

Yes, the additions are minimal, but provide a good base to build on.

I would really like to see a protocol for seeded/repeatable generators as well, since those are the ones I will use most often, and it would be nice to interoperate with other's code in this area rather than reinventing the wheel for each PRNG...

I have built several random facilities in Swift myself, and this doesn't have as much power, but I am hoping it gives enough standardization that I can utilize the work of others and vice versa.

I like how GameKit allows interchangeable pieces of randomness, and I would like to see that in Swift in general.

I read and participated in the full discussion, and have built lots of randomness code for Swift (mainly for graphical effects which require a repeatable source).

4 Likes

I'm hugely +1 for this proposal. We definitely need to do something like it, and this design seems quite reasonable (though some quibbling over details below).

I'm glad that this isn't trying to solve all the world's RNG API needs, e.g. distributions. Let's get the basics right, then consider building on top of it.

I read the proposal in depth, but did not follow the extensive discussion threads, nor have I read the extensive posts on the review thread.

Questions/suggestions:

// The following are examples on how to get full width integers
let randomInt = Int.random(in: .min ... .max)

shouldn't we provide Int.random() that does that? Later: ah, I see it in alternatives considered. I don't find the argument there strongly motivated. Yes, people could use the wrong thing if they are familiar with C APIs, but the range based API you provide is a lot more convenient than what C provides, and will lead people towards doing the right thing. I don't think the goal of avoiding misuse is significant enough to prevent providing the obvious API. If we don't include random(), it will be a FAQ somewhere because people will keep asking for it and wondering why it isn't included.

The collection API addition seems like it should be named randomElement(...), not random(...) because the other random(...) methods return an instance of Self - in this case, a user could reasonably expect it to return a random collection. I see this in alternatives considered and don't agree with the rationale. This seems like a very confusing name to me, this is unlike "first" which can only plausibly mean one thing.

Other random comments (bad pun :roll_eyes: ):

  • I agree that random functionality shouldn't expose fallibility to the user just to support the weird case where /dev/urandom can't be accessed.

  • "Why not make the default rng non-secure?" - I agree with the rationale, but shouldn't we provide a "Random.insecure" implementation that provides a faster implementation that isn't worried about crypto security? I agree with safe default, but that doesn't mean we should underserve the performance sensitive case.

Overall, thank you Alejandro so much for pushing this forward. I know that this has been an epicly long quest, and I'm super thankful that you're championing the effort!!

-Chris

6 Likes

That’s great feedback - can you please make a comment to that effect on the implementation?

I'm concerned that this is the only reply in the thread to Jens' post. The substance of his posting indicates that the proposal doesn't address important parts of the problem or ensure that they can be addressed in future. I haven't yet evaluated the proposal to see if I agree with Jens' assessment, but if he's wrong I'd like to see a vigorous rebuttal here.

1 Like

Some playground experiments suggest Jens is quite correct:

/// From the proposal implementation
func algo0(_ bitPattern: UInt64) -> Double {
    return Double(
        sign: .plus,
        exponentBitPattern: (1 as Double).exponentBitPattern,
        significandBitPattern: bitPattern
    ) - 1
}

/// Jens’s proposed alternative
func algo1(_ bitPattern: UInt64) -> Double {
    return Double(bitPattern) * Double.ulpOfOne / 2
}

// algo0 generates the number in the interval 1..<2, which loses
// one bit of significand precision. When we subtract 1 to move it
// to the interval 0..<1, we can see the significand stepping by 2
// for values slightly below 1.0 (which will have the lowest
// precision of any in the output range):

algo0((1 << 52) - 1).significandBitPattern  // 4503599627370494
algo0((1 << 52) - 2).significandBitPattern  // 4503599627370492

// algo1 claims to cover twice as many possible values, and indeed,
// we can see the resulting bit pattern stepping by 1 just below 1.0:

algo1((1 << 53) - 1).significandBitPattern  // 4503599627370495
algo1((1 << 53) - 2).significandBitPattern  // 4503599627370494

// We can verify that algo0 is doubling the precision of algo1 by
// checking that one step below 1.0 for algo0 is two steps for algo1:

algo0((1 << 52) - 1) == algo1((1 << 53) - 2)  // true

// ...and for conclusive evidence that algo1 is as dense as any
// algorithm that generates evenly spaced numbers can possiby be:

algo1((1 << 53) - 1) == 1.0.nextDown  // true

However, this an implementation detail; I don’t see that it fundamentally undermines the proposal’s design in any way.

4 Likes

I believe I have an updated implementation utilizing your method. I will hopefully push the changes tomorrow. If you have any more concerns, please post a comment on the implementation. Feedback greatly appreciated!

1 Like

I wasn't writing explicitly in reply to Jens, but the playground I posted is meant to demonstrate how different kinds of generators and distributions can be built on top of the functionality in this proposal. In particular, it includes several distributions modeled as infinite sequences, including a Gaussian/normal distribution, a couple different uniform distributions, and a discrete distribution that (I think) really shows off the composability of the proposed APIs tied to collection types.

While this proposal doesn't attempt to go as far with its random capabilities as the C++ STL, in what it does tackle it is far easier to use and it lays the foundation for different kinds of distributions in third-party libraries or for standard library adoption with future proposals.

1 Like

Quick review manager note: I just want to reinforce this point. If there is a concern about the candidate implementation, but not one that alters anything about the proposed API, we should discuss that on a separate thread on the standard library developer forum rather than in the review of the API.

Of course, if you think the implementation issue is caused by the API itself, that's appropriate for the review discussion.

2 Likes

As we're approaching the end of the review period, I wanted to try and focus the discussion in the last few days.

There's been a lot of great posts about whether the standard library should include other features, such as different random number distributions, or other random sources such as a seeded generator. These features may well be suitable as additions to the standard library, and can be considered as future proposals building on this one if it is accepted, but shouldn't be considered requirements for accepting/rejecting this proposal itself.

What is important is that there is confidence that the API as proposed can accommodate them. Any feedback on ways in which the APIs could prove problematic for integrating different kinds of distributions or generators would be very much appreciated. In assessing this, I'd urge everyone to take a look at Nate's playground demonstrating some of the ways this can be done with this API. In hindsight, it seems like the proposal could have included some of this material, and maybe we can retrospectively integrate some of it, but I don't think we need to make accepting the proposal contingent on that.

There are also some other topics that have been brought up, and for which additional justifications for or against what's proposed would be beneficial in finalizing the review:

  • the naming of Collection.random() versus randomElement() (see the rationale here)
  • whether that method should return an optional vs trap on empty, and relatedly whether, if it trapped, the static methods like Int.random(in:) would still be necessary.

(note, this is a call for rationales for a particular choice, rather than just a vote for a particular style)

Thanks to everyone for participating so far,
Ben

6 Likes

I'm working on an example implementation with a free Random(in:using:) function and an AnyObject constraint on RandomNumber generator, just for comparison purposes.

A nice side effect of the AnyObject constraint is that none of the functions need to be generic over the generator, and I can include a default value in all of them (which I'm assuming didn't work otherwise, since the playground implemented the default value with a manual override).

Hopefully I'm able to provide my justification as best as possible here for the points you mention.

I'm certainly open to the idea (not attached to either spelling). We must also understand that this rename also affects the range based API. (0 ..< 10).randomElement(). Spellings like this, in my opinion, gives the static T.random(in:) even more justification as randomElement() is semantically tied to collections. While Ranges can conditionally conform to become collections, I fear this spelling could hurt the meaning of what it means to get a random number from a range. I agree, this spelling makes a lot of sense for collections, but probably only collections.

I agree with where this line of thinking is coming from, but to me, it seems reasonable that a user should expect that there would a difference in the API when one is a static method vs an instance method.

Again, I don't have feelings towards one spelling or the other. I'm simply providing my rationale for naming the method the name it has now.

In regards to the second point, I think many others have already given great rationale for why this should return an optional. Requiring that this method trap would, in my opinion, cause more harm than good. Literal arrays are the exception in that you know for sure that the array is not empty, but it's very easy to not know what the arrays contents are at runtime. This results in this sort of boilerplate code where you're always checking if the array is empty before calling the random method. I feel Donald summarizes this perfectly.

Not only that rationale, but because there's already a precedent set by other facilities such as min() and max(). I keep bringing these back up because I believe a user could ask such questions like, "Why does min() and max() return optionals, but random() doesn't?" I'm a huge fan of consistency, and aligning random() with these other APIs seems to fit rather nicely.

In the case we do make this trap, is the static methods like Int.random(in:) still necessary? Yes.

If the naming of this method was randomElement(), one could make the argument that floating point ranges need to be consistent (huge fan of btw) with ranges that conditionally conform to collection. To me, it doesn't make sense to get a random element from a range that isn't a collection.
(1.0 ..< 10.0).randomElement() doesn't make sense because there is no element to select from. This justifies the static T.random(in:) methods which doesn't cause this confusion.

Now, if the naming of this method were named random() however, in the alternatives considered I wrote, Having a range as the primary spelling makes it fairly simple to get a random number from. Ranges are also very desirable because it doesn't encourage modulo bias. I still stand by this, but Nate brings up a great point that this design simply doesn't hold for other types that want to incorporate random functionality. Data.random(byteCount:using:) simply cannot be expressed so simply like a range. It's my mistake to not have mentioned this somewhere in the proposal, but I'm glad we're getting a chance to discuss it now.

I apologize if the proposal doesn't go so in depth in terms of my justification, but like I said, really glad that we get to discuss it now.

1 Like

I agree that the protocol should be constrained to class types, but having the generator not be generic would prevent the optimizer from specializing and inlining it, and I'd say efficiency is very important here.


And more generally, I still think that there hasn’t been nearly enough (hardly any?) actual practical usage/testing/experimentation using the proposed API in order to know how, or even whether or not, it can be successfully extended with custom (seedable) PRNGs, and eg Gaussian distributions etc, and IMHO it sounds like a lot of people here are OK with just assuming that this API is a good starting point and that everything will just work out even if we haven't actually tried to do anything real with it (ie try to use it for some little game with procedurally generated graphics and levels, or a little raytracer (which could use 2d gaussian values for multii-sampling) or something).

I haven’t looked into the playground of @nnnnnnnn which seems to be the only effort that has been made in this direction. But for example, it's impossible to measure efficiency in a playground settings.

My guess/feeling is that I will have to continue using my own little Random library (that I have been using, testing and modifying for a couple of years) even If this proposal is accepted. My requirements are PRNG, and the generators and the stuff based on them (conversions into various floating point ranges, other distributions, etc) have to be as fast as is possible, and I've sometimes needed different generators for different scenarious, sometimes trading speed for quality, perhaps a generator with a state that is of suitable size for a particular task, etc. I have never needed a non-seedable cryptographically secure RNG, but I understand that there might be others that have, and I understand the rationale of having it be the default. My use cases have been graphics/audio/signal processing (which would include games with procedurally generated graphics, levels and stuff like that).

I don’t have the time to dig deeper in this (trying out and/or providing example code etc) but maybe these negative ramblings of mine can provide some sort of non-negative value anyway.

1 Like