[Proposal] Random Unification

A few quick thoughts:

I know that there's been some discussion that `(1...10).random` is the best
spelling, but I'd like to push back on that suggestion. When I want a
random number, I tend to think of the type I want first ("I want a random
integer") and then a range ("I want a random integer between a and b"), not
the other way around. My intuition is that `Int.random(in:)` will be more
discoverable, both on that basis and because it is more similar to other
languages' syntax (`Math.random` in JavaScript and `randint` in NumPy, for
example). It also has the advantage that the type is explicit, which I
think is particularly useful in this case because the value itself is,
well, random.
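
To make the two spellings concrete, here is a minimal sketch. It assumes a Swift 4.2-or-later toolchain, where the standard library offers `Int.random(in:)` and `randomElement()`; the `(1...10).random` property spelling discussed in this thread is only approximated here by `randomElement()`.

```swift
// Type-first spelling: the result type is visible at the call site.
let roll = Int.random(in: 1...10)

// Range-first spelling: the type is inferred from the range's bounds.
// `randomElement()` returns an optional because a collection can be empty.
let pick = (1...10).randomElement()

print(roll, pick as Any)
```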

I would also argue that `random` is most appropriately a method and not a
property; there's no hard and fast rule for this, but the fact that the
result is stochastic suggests (to me) that it's not a "property" of the
range (or, for that matter, of the type).

I would reiterate here my qualms about `Source` being the term used for a
generator. These types are not a _source_ of entropy but rather a
_consumer_ of entropy.

`UnsafeRandomSource` needs to be renamed; "unsafe" has a specific meaning
in Swift--that is, memory safety, and this is not it. Moreover, it's
questionable whether this protocol is useful in any sense. What useful
generic algorithms can one write with such a protocol?

`XoroshiroRandom` cannot be seeded by any `Numeric` value; depending on the
specific algorithm it needs a seed of a specific bit width. If you default
the shared instance to being seeded with an `Int` then you will have to
have distinct implementations for 32-bit and 64-bit platforms. This is
unadvisable. On that note, your `UnsafeRandomSource` needs to have an
associated type and not a generic `<T : Numeric>` for the seed.
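
A rough sketch of what an associated seed type could look like. The protocol name `SeedableGenerator` is hypothetical and not taken from the proposal, and SplitMix64 is swapped in for Xoroshiro purely for brevity (it needs exactly 64 bits of seed, whereas xoroshiro128+ needs 128 bits of state).

```swift
// Hypothetical protocol: each generator declares the exact seed type it needs,
// instead of accepting any `Numeric` value.
protocol SeedableGenerator {
    associatedtype Seed
    init(seed: Seed)
    mutating func next() -> UInt64
}

// SplitMix64 takes exactly 64 bits of seed on every platform,
// so there is no 32-bit versus 64-bit `Int` split.
struct SplitMix64: SeedableGenerator {
    private var state: UInt64

    init(seed: UInt64) {
        state = seed
    }

    mutating func next() -> UInt64 {
        state &+= 0x9E37_79B9_7F4A_7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58_476D_1CE4_E5B9
        z = (z ^ (z >> 27)) &* 0x94D0_49BB_1331_11EB
        return z ^ (z >> 31)
    }
}

// Two generators built from the same seed produce the same first output.
var a = SplitMix64(seed: 42)
var b = SplitMix64(seed: 42)
let sameSeedSameOutput = a.next() == b.next()
```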

The default random number generator should be cryptographically secure;
however, it's not clear to me that it should be device random.

I agree with others that alternative random number generators other than
the default RNG (and, if not default, possibly also the device RNG) should
be accommodated by the protocol hierarchy but not necessarily supplied in
the stdlib.

The term `Randomizable` means something specific which is not how it's used
in your proposed protocol.

There's still the open question, not answered, about how requesting an
instance of the hardware RNG behaves when there's insufficient or no
entropy. Does it return nil, throw, trap, or wait? The proposed API does
not clarify this point, although based on the method signature it cannot
return nil or throw. Trapping might be acceptable but I'd be interested to
hear your take as to why it is preferable.
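
To illustrate why the signature matters, here is a sketch of the first three options side by side. Every name below is hypothetical, and `readDevice(entropyAvailable:)` is only a stand-in for a real device read; blocking is not shown because it would look identical to the trapping signature.

```swift
struct EntropyUnavailable: Error {}

// Hypothetical stand-in for reading the device RNG; `nil` models "no entropy".
func readDevice(entropyAvailable: Bool) -> UInt64? {
    return entropyAvailable ? 0xDEAD_BEEF : nil
}

// (a) Optional result: the caller must unwrap.
func nextOrNil(entropyAvailable: Bool) -> UInt64? {
    return readDevice(entropyAvailable: entropyAvailable)
}

// (b) Throwing: the caller must use `try`.
func nextOrThrow(entropyAvailable: Bool) throws -> UInt64 {
    guard let value = readDevice(entropyAvailable: entropyAvailable) else {
        throw EntropyUnavailable()
    }
    return value
}

// (c) Trapping: apart from blocking, the only failure mode that fits a
// plain `-> UInt64` signature.
func nextOrTrap(entropyAvailable: Bool) -> UInt64 {
    guard let value = readDevice(entropyAvailable: entropyAvailable) else {
        fatalError("insufficient entropy")
    }
    return value
}
```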


On Sun, Nov 5, 2017 at 4:43 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:

For the proof of concept, I had accidentally deleted that one. I have a
more up-to-date one which was discussed a few weeks later.
Swift Random Unification Design · GitHub

- Alejandro

On Nov 5, 2017, 4:37 PM -0600, Jonathan Hull <jhull@gbis.com>, wrote:

Is there a link to the writeup? The one in the quote 404s.

Thanks,
Jon

On Nov 5, 2017, at 2:10 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:

Hello once again Swift evolution community. I have taken the time to write
up the proposal for this thread, and have provided an implementation for it
as well. I hope to once again get good feedback on the overall proposal.

- Alejandro

On Sep 8, 2017, 11:52 AM -0500, Alejandro Alonso via swift-evolution <swift-evolution@swift.org>, wrote:

Hello swift evolution, I would like to propose a unified approach to
`random()` in Swift. I have a simple implementation here:
https://gist.github.com/Azoy/5d294148c8b97d20b96ee64f434bb4f5. This
implementation is a simple wrapper over existing random functions so
existing code bases will not be affected. Also, this approach introduces a
new random feature for Linux users that gives them access to upper bounds,
as well as a lower bound for both Glibc and Darwin users. This change would
be implemented within Foundation.

I believe this simple change could have a very positive impact on new
developers learning Swift and experienced developers being able to write
single random declarations.

I’d like to hear about your ideas on this proposal, or any implementation
changes if need be.

- Alejandro

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


The proposal and implementation have the current updated API. The link I sent Jon was the one I brought up a few weeks ago which is outdated now. The proposal answers all of your questions. As for `.random` being a function, some would argue that it behaves in the same way as `.first` and `.last` which are properties.

- Alejandro


On Nov 5, 2017, 6:07 PM -0600, Xiaodi Wu <xiaodi.wu@gmail.com>, wrote:
A few quick thoughts: [...]

Just skimmed through the updated proposal and am weighing in with my naïve
opinions:

   - I’m still highly skeptical of a static “T.random” API. I’ve yet to
   see a convincing example where it’d be useful to pick a value from the
   range of all possible values. The only truly useful one I’ve seen is “pick
   a random bool”, which could easily be done via “[true, false].random()”.

   - I much prefer the GameplayKit API[0], which breaks the idea of
   randomness up into 2 separate concepts:
      - A “Source” → Where the random numbers come from
      - A “Distribution” → Initialized with a source, it makes sure the
      produced numbers exhibit a specific distribution over multiple samplings,
      i.e., a uniform distribution vs a Gaussian distribution, or something like “I
      want to pick a card from a deck but bias choices towards Spades or Aces”.
      I’m also reminded of the anecdote of how iTunes had to modify their
      “playlist shuffle” algorithm to be less random[1], because the true
      randomness would do weird things that made it seem not random. Spotify had
      the same problem and solution[2].
      - Breaking things up like this would also make it easier to test
      randomness (by using a replay-able source) while still following the
      parameters of your app (that it has a bell-curve distribution of
      probabilities, for example)

   - I’d still really really really like to see how this could be done
   as two separate things:
      - A minimal implementation in the Standard Library (like, defining
      the base Source and Distribution protocols, with a single default
      implementation of each)
      - A proposal for a more complete “non-standard library” where the
      larger array of functionality would be contained. For example, IMO I don’t
      think the shuffling stuff needs to be in the standard library. This is also
      where all the cryptographically secure stuff (that your typical app
      developer does not need) would live.

   - The “random” element of a collection/range should be “func
   random() → Element?”, not “var random: Element?”. Property values shouldn't
   change between accesses. Ditto the static “Randomizable.random” property.

   - What do you think about actively discouraging people from using the
   modulo operator to create a range? It could be done by having the RNGs
   return a “RandomValue<T>” type, then defining a mod operator that takes a
   RandomValue<T> and a T (?), and then giving it a deprecation warning +
   fixit. Not sure if that’d be worth the type overhead, but I’m very much in
   favor of encouraging people towards better practices.

   - I’m +1 on crashing if we can’t produce a random number.

   - What do you think about the philosophical difference of
   Type.random(using:) vs Type.init(randomSource:)?

Dave

[0]: GKRandom | Apple Developer Documentation
[1]: https://www.youtube.com/watch?v=lg188Ebas9E&feature=youtu.be&t=719
[2]: https://labs.spotify.com/2014/02/28/how-to-shuffle-songs/
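
For readers who have not used GameplayKit, the Source/Distribution split described above can be sketched roughly as follows. All protocol and type names here are hypothetical illustrations, not GameplayKit's or the proposal's API. The uniform distribution uses rejection sampling rather than a bare `%`, and the scripted source shows the "replay-able source" idea for testing.

```swift
// Hypothetical protocols mirroring the Source / Distribution split.
protocol RandomSource {
    mutating func next() -> UInt64
}

protocol RandomDistribution {
    associatedtype Value
    mutating func sample() -> Value
}

// A replayable source for tests: it cycles through a fixed script of values.
struct ScriptedSource: RandomSource {
    let script: [UInt64]
    var index = 0

    init(_ script: [UInt64]) {
        precondition(!script.isEmpty, "script must not be empty")
        self.script = script
    }

    mutating func next() -> UInt64 {
        defer { index = (index + 1) % script.count }
        return script[index]
    }
}

// Uniform integers in 0..<upperBound. Raw values at or above `limit` are
// rejected so that every residue is equally likely (no modulo bias).
struct UniformBelow<Source: RandomSource>: RandomDistribution {
    var source: Source
    let upperBound: UInt64

    mutating func sample() -> UInt64 {
        precondition(upperBound > 0, "upperBound must be positive")
        let limit = UInt64.max - UInt64.max % upperBound
        var raw = source.next()
        while raw >= limit {
            raw = source.next()
        }
        return raw % upperBound
    }
}

// With a scripted source the "random" rolls are fully reproducible.
var die = UniformBelow(source: ScriptedSource([7, 12, 5]), upperBound: 6)
let rolls = [die.sample(), die.sample(), die.sample()]
```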

I think these are some excellent points. Earlier, I think, others also
emphasized this idea of exploring what a really minimal implementation in
the standard library would look like, and I've been thinking about this
overnight.

There is much that is commendable about Alejandro's proposal, but I agree
that it contains more than needs to be in the standard library. Here's what I
think the shape of a minimal API would look like, which would
simultaneously enable others to implement their desired functionality as an
end user:

- We need very performant, but otherwise barebones, access to system
randomness so that it can be a building block for everything else. Because
this is so special in that it cannot be seeded or initialized, unlike other
RNGs, we don't need this to be a type, and it doesn't need to conform to a
`RandomNumberGenerator` protocol. It can be as straightforward as one or
both of:

-- A global `func random() -> UInt32`, which is essentially `arc4random` on
macOS/iOS and reads from an appropriate secure source on Linux and other
platforms. One pro of having such a method is that it's a drop-in
replacement for `arc4random()` that's _very_ convenient as a primitive to
build up other random operations; one con is that it encourages modulo
bias, although fortunately mostly only with UInt32.
-- An extension method on `UnsafeMutableRawBufferPointer` named `func
copyRandomBytes()`. This would look a lot like Apple's `SecRandomCopyBytes`
and BSD's `arc4random_buf`.

- Having established the primitive, then we can ask what is the minimum
_useful_ functionality for an end user. I think the answer is a very
judicious subset of the currently proposed functionality:

-- An extension method or property (`random()` or `random`) on
`RandomAccessCollection`.
-- An extension method or property (`random()` or `random`) on `Range where
Bound : SignedInteger`, `Range where Bound : UnsignedInteger`, `Range where
Bound : BinaryFloatingPoint`.

- One advantage of abandoning `Randomizable` and spellings like
`Int.random` and `Double.random` is limiting possible modulo bias in using
the former, and also eliminating the lack of semantic guarantees about the
implicit bounds (which for Int are Int.min...Int.max, but for Double must
be 0..<1). There can be no confusion about `(0.0..<1.0).random`.
- An advantage inherent to abandoning the `randomIn:` or `random(in:)`
spelling is that users must now explicitly address what to do about an
empty range, which in my view is better than an implicit trap that may or
may not happen deterministically during testing.
- A final advantage over the current proposal is that we collapse three
very similar spellings with subtle usage differences but entirely
overlapping functionality--`Int.random`, `Int.random(in:)`, and
`Range.random`--into one with well-defined semantics. A user learns once
and can have good understanding of others' code and write their own without
hidden pitfalls.

The end result of such a minimalist design is that we introduce no new
types, no new protocols, and only three or four new functions. They all
have very defined semantics, and together they enable both basic use and
provide the building blocks for advanced users to create their own PRNGs
that fit their advanced cryptographic or non-cryptographic needs.
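
A minimal sketch of what those few functions might look like, assuming a Swift 4.2-or-later toolchain and using `SystemRandomNumberGenerator` as a stand-in for "system randomness". The names are illustrative only: the global function is called `systemRandomUInt32()` here rather than `random()` to avoid clashing with the C library's `random()`, and only the integer `Range` case is shown.

```swift
// Stand-in for the proposed global `random() -> UInt32`.
func systemRandomUInt32() -> UInt32 {
    var generator = SystemRandomNumberGenerator()
    let value: UInt32 = generator.next()
    return value
}

extension UnsafeMutableRawBufferPointer {
    // Hypothetical `copyRandomBytes()`: fill the buffer from the system generator.
    func copyRandomBytes() {
        var generator = SystemRandomNumberGenerator()
        for offset in indices {
            self[offset] = UInt8.random(in: .min ... .max, using: &generator)
        }
    }
}

extension Range where Bound: FixedWidthInteger {
    // Hypothetical `random()`: an empty range yields `nil` instead of trapping,
    // so the caller has to decide what to do about it.
    func random() -> Bound? {
        return isEmpty ? nil : Bound.random(in: self)
    }
}

// Usage: fill a 16-byte buffer, then draw from a non-empty and an empty range.
let word = systemRandomUInt32()
var key = [UInt8](repeating: 0, count: 16)
key.withUnsafeMutableBytes { $0.copyRandomBytes() }
let dieRoll = (1..<7).random()
let nothing = (0..<0).random()
```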

Thoughts?


On Tue, Jan 2, 2018 at 10:19 AM, Dave DeLong via swift-evolution <swift-evolution@swift.org> wrote:

On Jan 2, 2018, at 1:35 AM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:

Hello swift evolution once again, I’ve been hard at work considering every
email and revising the proposal. I’ve made lots of changes and additions to
the proposal to discuss some problems we’ve had (T.random), and it now walks
through the detailed design. You can see the proposal here:
[Proposal] Random Unification by Azoy · Pull Request #760 · apple/swift-evolution · GitHub.

A big issue that lots of people pointed out was `T.random %`, with calls to
remove `T.random` completely from the API. To give a gist of why I continue
to support T.random:

1. Modulo bias misuse is only a problem for types that conform to
`BinaryInteger`. Why remove this functionality if only a portion of the
types can be misused? `Double.random % 10` is a good example: modulo isn’t
implemented for `Double`, so it produces the error “'%' is
unavailable: Use truncatingRemainder instead”.

2. `Int.random(in: Int.min ... Int.max)` doesn’t work. For developers who
actually rely on this functionality, the workaround that was discussed
earlier simply doesn’t work: the count of `Int.min ... Int.max` exceeds
`Int`’s numerical range. A working workaround would be something
along the lines of `Int(truncatingIfNeeded: Random.default.next(UInt.self))`,
which creates a pain point for those developers. As the goal of this
proposal is to remove pain points regarding random, this change does the
opposite.
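
Both points can be made concrete with a small sketch. The first half counts modulo bias exhaustively over `UInt8`, where the numbers are small enough to check by hand; the second half shows the shape of the `Int(truncatingIfNeeded:)` workaround, with `SystemRandomNumberGenerator` standing in for the proposal's `Random.default` (an assumption, since the proposal's API is not reproduced here).

```swift
// Exhaustively count how often each residue appears when every UInt8 value
// is reduced with `% 10`. 256 is not a multiple of 10, so the counts differ.
var counts = [Int](repeating: 0, count: 10)
for value in UInt8.min ... UInt8.max {
    counts[Int(value % 10)] += 1
}
// Residues 0 through 5 appear 26 times each; residues 6 through 9 appear 25 times.
print(counts)

// A full-width `Int` without ever forming `Int.min ... Int.max`:
// draw a full word of random bits and reinterpret it.
var generator = SystemRandomNumberGenerator()
let bits: UInt = generator.next()
let anyInt = Int(truncatingIfNeeded: bits)
```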

I’m interested to hear if any more discussion around this, or any other
issues, comes up.

- Alejandro


I'm not sure how much background you have on this thread, but the idea of sources and distributions was rejected months ago as almost always too cumbersome, given that people overwhelmingly want uniform random numbers.

I agree that random() is better as a method. I also think that the default Random implementation should be in a class, not a struct. If a generator has value semantics, I would expect that two copies would return an identical sequence of numbers.

I think that it'll be hard to make a RandomValue<T> that nicely converts to T. The best way to discourage modulo is probably to make T.random/T(randomSource:) as cumbersome as possible, and Range.random as nice as possible.
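
The value-semantics point can be shown with a toy generator. This is only a sketch: `LCG` is a hypothetical type using the multiplier and increment of Knuth's MMIX linear congruential generator, it is not suitable for anything security-related, and it assumes the `RandomNumberGenerator` protocol from Swift 4.2 or later.

```swift
// A deliberately simple generator with value semantics.
struct LCG: RandomNumberGenerator {
    var state: UInt64

    mutating func next() -> UInt64 {
        state = state &* 6364136223846793005 &+ 1442695040888963407
        return state
    }
}

var original = LCG(state: 2018)
var copy = original  // copying a struct copies its internal state

// Both copies now replay the same sequence, which is what one would expect
// from a value type and is surprising for a shared default generator.
let fromOriginal = (0..<3).map { _ in Int.random(in: 1...6, using: &original) }
let fromCopy = (0..<3).map { _ in Int.random(in: 1...6, using: &copy) }
```

A class-based generator would instead share one state between all references, so two "copies" would continue a single sequence.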


On Jan 2, 2018, at 11:19, Dave DeLong via swift-evolution <swift-evolution@swift.org> wrote:

Just skimmed through the updated proposal and am weighing in with my naïve opinions: [...]

Software engineers are *so* bad at this.


On Sep 9, 2017, at 12:03 PM, Taylor Swift via swift-evolution <swift-evolution@swift.org> wrote:

I would argue that anyone doing cryptography probably already knows how important RNG selection is and can be expected to look for a specialized cryptographically secure RNG. I doubt they would just use the default RNG without first checking the documentation.

--
Brent Royal-Gordon
Architechies

If that were the case, why are there so many security issues due to a poor choice of random source?


On Sep 9, 2017, at 21:03, Taylor Swift via swift-evolution <swift-evolution@swift.org> wrote:

On Fri, Sep 8, 2017 at 8:07 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Sep 8, 2017 at 7:50 PM, Stephen Canon <scanon@apple.com> wrote:

On Sep 8, 2017, at 8:09 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

This topic has been broached on Swift Evolution previously. It's interesting to me that Steve Canon is so certain that CSPRNGs are the way to go. I wasn't aware that hardware CSPRNGs have come such a long way and are so ubiquitous as to be feasible as a basis for Swift random numbers. If so, great.

Otherwise, if there is any way that a software, non-cryptographically secure PRNG is going to outperform a CSPRNG, then I think it's worthwhile to have a (carefully documented) choice between the two. I would imagine that for many uses, such as an animation in which you need a plausible source of noise to render a flame, whether that is cryptographically secure or not is absolutely irrelevant but performance may be key.

Let me be precise: it is absolutely possible to outperform CSPRNGs. They have simply become fast enough that the performance gap doesn’t matter for most uses (let’s say amortized ten cycles per byte or less; whatever you are going to do with the random bitstream will be much more expensive than getting the bits was).

That said, yes, there should definitely be other options. It should be possible for users to get reproducible results from a stdlib random interface run-to-run, and also across platforms. That alone requires that at least one other option for a generator be present. There may also be a place for a very high-throughput generator like xoroshiro128+.

All I’m really saying is that the *default* generator should be an OS-provided unseeded CSPRNG, and we should be very careful about documenting any generator options.

Agree on all points. Much like Swift's strings are Unicode-correct instead of the fastest possible way of slicing and dicing sequences of ASCII characters, Swift's default PRNG should be cryptographically secure.


I would argue that anyone doing cryptography probably already knows how important RNG selection is

* The distinction to be made here is CSPRNGs versus non-cryptographically secure PRNGs, where CSPRNG : PRNG. “Reproducible” is not the right word. Based on my understanding, some CSPRNGs can be “reproducible” if the seed is known; what makes it cryptographically secure is that observing its previous *outputs* does not provide information useful to predict future outputs. Along those lines, it may be important to securely delete the seed from memory as soon as possible; there is some way of doing so in C (it’s used in the ChaCha20 reference implementation) but I don’t believe there is any way of doing so in Swift.

It's possible to use a CSPRNG-grade algorithm and seed it once to get a reproducible sequence, but when you use it as a CSPRNG, you typically feed entropy back into it at nondeterministic points to ensure that even if you started with a bad seed, you'll eventually get to an alright state. Unless you keep track of when entropy was mixed in and what the values were, you'll never get a reproducible CSPRNG.

We would give developers a false sense of security if we provided them with CSPRNG-grade algorithms that we called CSPRNGs and that they could seed themselves. Just because it says "crypto-secure" in the name doesn't mean that it'll be crypto-secure if it's seeded with time(). Therefore, "reproducible" vs "non-reproducible" looks like a good distinction to me.

* On the issue of consuming entropy: a glaring underlying inconvenience in the API needs to be reckoned with. Sometimes, there simply isn’t enough entropy to generate another random number. If cryptographic security were not default, then it might be OK to fall back to some other method that produces a low-quality result. However, if we are to do the secure thing, we must decide whether the lack of entropy results in a call to a random method to (a) return nil; (b) throw; (c) fatalError; or (d) block. There is no way to paper over this problem; not enough entropy means you can’t get a random number when you want it. The debate over blocking versus non-blocking error, for example, held up the addition of getrandom() to Glibc for some time. In my proposed design, initializing a PRNG from the system’s secure stream of random bytes is failable; therefore, a user can choose how to handle the lack of entropy. However, it is desirable to have a thread-local CSPRNG that is used for calls, say, to Int.random(). It would be unfortunate if Int.random() itself was failable; however, that leads to an uncomfortable question: if there is insufficient entropy, should Int.random() block or fatalError? That seems pretty terrible too. However, one cannot simply write this off as an edge case: if this is to be a robust part of the standard library, it must do the “right” thing, particularly if Swift is to be a true systems programming language that must accommodate the case when a system is first booted and there is very little entropy.

That's not really the case anymore these days. You're probably thinking of /dev/urandom vs. /dev/random. On Darwin, they're the same thing and never run out (see man urandom). On Linux, the state of the art is that you leave /dev/random alone. Don't take it from me, Prof. Daniel J. Bernstein <https://en.wikipedia.org/wiki/Daniel_J._Bernstein> wrote this <https://www.mail-archive.com/cryptography@randombit.net/msg04763.html> a while ago:

Think about this for a moment: whoever wrote the /dev/random manual page seems to simultaneously believe that

   (1) we can't figure out how to deterministically expand one 256-bit /dev/random output into an endless stream of unpredictable keys (this is what we need from urandom), but

   (2) we _can_ figure out how to use a single key to safely encrypt many messages (this is what we need from SSL, PGP, etc.).

For a cryptographer this doesn't even pass the laugh test.

So that shouldn't be too concerning.

* What should the default CSPRNG be? There are good arguments for using a cryptographically secure device random. (In my proposed implementation, for device random, I use Security.framework on Apple platforms (because /dev/urandom is not guaranteed to be available due to the sandbox, IIUC). On Linux platforms, I would prefer to use getrandom() and avoid using file system APIs, but getrandom() is new and unsupported on some versions of Ubuntu that Swift supports. This is an issue in and of itself.) Now, a number of these facilities strictly limit or do not guarantee availability of more than a small number of random bytes at a time; they are recommended for seeding other PRNGs but *not* as a routine source of random numbers. Therefore, although device random should be available to users, it probably shouldn’t be the default for the Swift standard library as it could have negative consequences for the system as a whole. There follows the significant task of implementing a CSPRNG correctly and securely for the default PRNG.

Theo gave a talk a few years ago <https://www.youtube.com/watch?v=aWmLWx8ut20> on randomness and how these problems are approached in LibreSSL.

Félix

···

On Sept. 26, 2017, at 07:31, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

* The distinction to be made here is CSPRNGs versus non-cryptographically
secure PRNGs, where CSPRNG : PRNG. “Reproducible” is not the right word.
Based on my understanding, some CSPRNGs can be “reproducible” if the seed
is known; what makes it cryptographically secure is that observing its
previous *outputs* does not provide information useful to predict future
outputs. Along those lines, it may be important to securely delete the seed
from memory as soon as possible; there is some way of doing so in C (it’s
used in the ChaCha20 reference implementation) but I don’t believe any way
of doing so in Swift.

It's possible to use a CSPRNG-grade algorithm and seed it once to get a
reproducible sequence, but when you use it as a CSPRNG, you typically feed
entropy back into it at nondeterministic points to ensure that even if you
started with a bad seed, you'll eventually get to an alright state. Unless
you keep track of when entropy was mixed in and what the values were,
you'll never get a reproducible CSPRNG.

We would give developers a false sense of security if we provided them
with CSPRNG-grade algorithms that we called CSPRNGs and that they could
seed themselves. Just because it says "crypto-secure" in the name doesn't
mean that it'll be crypto-secure if it's seeded with time(). Therefore,
"reproducible" vs "non-reproducible" looks like a good distinction to me.

I disagree here, in two respects:

First, whether or not a particular PRNG is cryptographically secure is an
intrinsic property of the algorithm; whether it's "reproducible" or not is
determined by the published API. In other words, the distinction between
CSPRNG vs. non-CSPRNG is important to document because it's semantics that
cannot be deduced by the user otherwise, and it is an important one for
writing secure code because it tells you whether an attacker can predict
future outputs based only on observing past outputs. "Reproducible" in the
sense of seedable or not is trivially noted by inspection of the published
API, and it is rather immaterial to writing secure code. If your attacker
can observe your seeding once, chances are that they can observe your
reseeding too; then, they can use their own implementation of the PRNG
(whether CSPRNG or non-CSPRNG) and reproduce your pseudorandom sequence
whether or not Swift exposes any particular API.

Secondly, I see no reason to justify the notion that, simply because a PRNG
is cryptographically secure, we ought to hide the seeding initializer
(because one has to exist internally anyway) from the public. Obviously,
one use case for a deterministic PRNG is to get reproducible sequences of
random-appearing values; this can be useful whether the underlying
algorithm is cryptographically secure or not. There are innumerably many
ways to use data generated from a CSPRNG in non-cryptographically secure
ways and omitting or including a public seeding initializer does not change
that; in other words, using a deterministic seed for a CSPRNG would be a
bad idea in certain applications, but it's a deliberate act, and someone
who would mistakenly do that is clearly incapable of *using* the output
from the PRNG in a secure way either; put a third way, you would be hard
pressed to find a situation where it's true that "if only Swift had not
made the seeding initializer public, this author would have written secure
code, but instead the only security hole that existed in the code was
caused by the availability of a public seeding initializer mistakenly
used." The point of having both explicitly instantiable PRNGs and a layer
of simpler APIs like "Int.random()" is so that the less experienced user
can get the "right thing" by default, and the experienced user can
customize the behavior; any user that instantiates his or her own
ChaCha20Random instance is already calling for the power user interface; it
is reasonable to expose the underlying primitive operations (such as
seeding) so long as there are legitimate uses for it.
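
As a rough illustration of the "power user" layer described above, here is a minimal sketch of an explicitly instantiable generator with a public seeding initializer. Everything in it is an assumption made for illustration: the `PRNG` protocol name simply follows the naming suggested earlier in this thread, and SplitMix64 is swapped in as a short stand-in algorithm. It is not ChaCha20 and it is not cryptographically secure; the point is only the shape of the API.

```swift
// Hypothetical protocol; the name follows the "PRNG" suggestion in this thread.
protocol PRNG {
    mutating func next() -> UInt64
}

// SplitMix64 is used only because it is short. It is NOT cryptographically
// secure; a ChaCha20-based type would expose the same kind of initializer.
struct SplitMix64: PRNG {
    private var state: UInt64

    // Public seeding initializer: the deliberate, "power user" entry point.
    init(seed: UInt64) {
        state = seed
    }

    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Two generators built from the same seed produce the same sequence.
var a = SplitMix64(seed: 42)
var b = SplitMix64(seed: 42)
let fromA = (a.next(), a.next())
let fromB = (b.next(), b.next())
assert(fromA == fromB)
```

A caller who writes `SplitMix64(seed: 42)` has visibly opted into determinism, which is the argument being made: the initializer does not create a hazard that a simpler default such as `Int.random()` would otherwise have prevented.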

* On the issue of consuming entropy: a glaring underlying inconvenience in
the API needs to be reckoned with. Sometimes, there simply isn’t enough
entropy to generate another random number. If cryptographic security were
not default, then it might be OK to fall back to some other method that
produces a low-quality result. However, if we are to do the secure thing,
we must decide whether the lack of entropy results in a call to a random
method to (a) return nil; (b) throw; (c) fatalError; or (d) block. There is
no way to paper over this problem; not enough entropy means you can’t get a
random number when you want it. The debate over blocking versus
non-blocking error, for example, held up the addition of getrandom() to
Glibc for some time. In my proposed design, initializing a PRNG from the
system’s secure stream of random bytes is failable; therefore, a user can
choose how to handle the lack of entropy. However, it is desirable to have
a thread-local CSPRNG that is used for calls, say, to Int.random(). It
would be unfortunate if Int.random() itself was failable; however, that
leads to an uncomfortable question: if there is insufficient entropy,
should Int.random() block or fatalError? That seems pretty terrible too.
However, one cannot simply write this off as an edge case: if this is to be
a robust part of the standard library, it must do the “right” thing.
Particularly if Swift is to be a true systems programming language and it
must accommodate the case when a system is first booted and there is very
little entropy.

That's not really the case anymore these days. You're probably thinking of
/dev/urandom vs. /dev/random.

I'm also talking about getrandom() on Linux, getentropy() on BSD and Linux,
and SecRandomCopyBytes() on Apple platforms.

On Darwin, they're the same thing and never run out (see man urandom). On
Linux, the state of the art is that you leave /dev/random alone. Don't take
it from me, Prof. Daniel J. Bernstein
<https://en.wikipedia.org/wiki/Daniel_J._Bernstein> wrote this
<https://www.mail-archive.com/cryptography@randombit.net/msg04763.html> a
while ago:

Think about this for a moment: whoever wrote the /dev/random manual page
seems to simultaneously believe that

   (1) we can't figure out how to deterministically expand one 256-bit
/dev/random output into an endless stream of unpredictable keys (this is
what we need from urandom), but

   (2) we _can_ figure out how to use a single key to safely encrypt many
messages (this is what we need from SSL, PGP, etc.).

For a cryptographer this doesn't even pass the laugh test.

So that shouldn't be too concerning.

I'm fully aware of the myths surrounding /dev/urandom and /dev/random.
/dev/urandom might never run out, but it is also possible for it not to be
initialized at all, as in the case of some VM setups. In some older
versions of iOS, /dev/[u]random is reportedly sandboxed out. On systems
where it is available, it can also be deleted, since it is a file. The
point is, all of these scenarios cause an error during seeding of a CSPRNG.
The question is, how to proceed in the face of inability to access entropy.
We must do something, because in that case we cannot return a
cryptographically secure answer. Rare trapping on invocation of
Int.random() or permanently waiting for a never-to-be-initialized
/dev/urandom would be terrible to debug, but returning an optional or
throwing all the time would be verbose. How to design this API?
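
To make the design question concrete, here is a minimal sketch of how three of the four options look at the call site when seeding fails. All names are hypothetical, and the system facility (getrandom(), getentropy(), SecRandomCopyBytes(), or reading /dev/urandom) is modeled by a closure that returns nil when no entropy can be obtained. That modeling is an assumption made so the example stays self-contained; it is not how any of those calls actually report errors.

```swift
enum EntropyError: Error {
    case unavailable
}

// Hypothetical stand-in for the OS entropy facility.
// Returning nil models "could not obtain entropy".
typealias EntropySource = () -> UInt64?

// Placeholder for a CSPRNG seeded from the system; only seeding is shown.
struct SystemSeededGenerator {
    let seed: UInt64

    // (a) Failable: the caller receives nil and decides what to do.
    init?(entropy: EntropySource) {
        guard let value = entropy() else { return nil }
        seed = value
    }

    // (b) Throwing: the caller must write `try` and handle the error.
    init(throwingFrom entropy: EntropySource) throws {
        guard let value = entropy() else { throw EntropyError.unavailable }
        seed = value
    }

    // (c) Trapping: simplest call site, but the failure is fatal.
    init(trappingFrom entropy: EntropySource) {
        guard let value = entropy() else {
            fatalError("no entropy available")
        }
        seed = value
    }

    // (d) Blocking would loop or wait inside the initializer until entropy
    // arrives; it is omitted because it may never return.
}

let exhausted: EntropySource = { nil }
let healthy: EntropySource = { 0xDEADBEEF }

assert(SystemSeededGenerator(entropy: exhausted) == nil)
assert(SystemSeededGenerator(entropy: healthy)?.seed == 0xDEADBEEF)
```

Options (a) and (b) push the verbosity onto every caller, while (c) and (d) keep the call site clean at the cost of a failure that is hard to debug, which is exactly the trade-off described above.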

* What should the default CSPRNG be? There are good arguments for using a
cryptographically secure device random. (In my proposed implementation, for
device random, I use Security.framework on Apple platforms (because
/dev/urandom is not guaranteed to be available due to the sandbox, IIUC).
On Linux platforms, I would prefer to use getrandom() and avoid using file
system APIs, but getrandom() is new and unsupported on some versions of
Ubuntu that Swift supports. This is an issue in and of itself.) Now, a
number of these facilities strictly limit or do not guarantee availability
of more than a small number of random bytes at a time; they are recommended
for seeding other PRNGs but *not* as a routine source of random numbers.
Therefore, although device random should be available to users, it probably
shouldn’t be the default for the Swift standard library as it could have
negative consequences for the system as a whole. There follows the
significant task of implementing a CSPRNG correctly and securely for the
default PRNG.

Theo gave a talk a few years ago
<https://www.youtube.com/watch?v=aWmLWx8ut20> on randomness and how these
problems are approached in LibreSSL.

Certainly, we can learn a lot from those like Theo who've dealt with the
issue. I'm not in a position to watch the talk at the moment; can you
summarize what the tl;dr version of it is?

···

On Tue, Sep 26, 2017 at 11:26 AM, Félix Cloutier <felixcloutier@icloud.com> wrote:

On Sept. 26, 2017, at 07:31, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

Felix and Jonathan make some good points. Some general comments:

* I think, in general, this area needs a detailed review by those who are expert in the domain; especially if we are to assert that the design is cryptographically secure, we need to ensure that it is actually so. In other words, not just the algorithms that we intend to implement, but the implementations themselves. This is not at all trivial. We will also need to specify whether extension methods that generate certain distributions, etc., are guaranteed secure against side channel attacks where such implementations are known possible but more expensive.

* Without attempting to bikeshed, the use of the word “Source” is potentially confusing; it could suggest that the conforming type is a source of entropy when in fact it consumes entropy. In my proposed design, the protocol is simply named “PRNG” with an emphasis on the P (i.e. pseudorandom, not random).

This is a good point. My use of the word “source” in the code I shared (much) earlier is because I have a “Source” protocol which essentially represents an (effectively) infinite iterator/sequence. RandomSource is a specific sub-protocol I have which provides an infinite iterator/sequence for repeatably random numbers. There are other sources which take an array and cycle its indices, for example. I also have a constant source, which always returns the same thing. I use all of this to do cool generative graphical effects.

We would give developers a false sense of security if we provided them with CSPRNG-grade algorithms that we called CSPRNGs and that they could seed themselves. Just because it says "crypto-secure" in the name doesn't mean that it'll be crypto-secure if it's seeded with time(). Therefore, "reproducible" vs "non-reproducible" looks like a good distinction to me.

I disagree here, in two respects:

First, whether or not a particular PRNG is cryptographically secure is an intrinsic property of the algorithm; whether it's "reproducible" or not is determined by the published API. In other words, the distinction between CSPRNG vs. non-CSPRNG is important to document because it's semantics that cannot be deduced by the user otherwise, and it is an important one for writing secure code because it tells you whether an attacker can predict future outputs based only on observing past outputs. "Reproducible" in the sense of seedable or not is trivially noted by inspection of the published API, and it is rather immaterial to writing secure code. If your attacker can observe your seeding once, chances are that they can observe your reseeding too; then, they can use their own implementation of the PRNG (whether CSPRNG or non-CSPRNG) and reproduce your pseudorandom sequence whether or not Swift exposes any particular API.

Secondly, I see no reason to justify the notion that, simply because a PRNG is cryptographically secure, we ought to hide the seeding initializer (because one has to exist internally anyway) from the public.

To me, ReproducibleRandomSource has a semantic meaning (more than being a bag of methods). It has an init(seed:) method because you HAVE to be able to seed it for reproducibility (not because sources which have a seed would all be reproducible). The fact that RandomSource does not have that requirement doesn’t mean you can’t have a source which is seeded… it just allows for sources which aren’t. If something calls itself Reproducible, it should actually be reproducible, which includes being able to restore previous states (not just the starting seed).

* The distinction to be made here is CSPRNGs versus non-cryptographically secure PRNGs, where CSPRNG : PRNG. “Reproducible” is not the right word. Based on my understanding, some CSPRNGs can be “reproducible” if the seed is known; what makes it cryptographically secure is that observing its previous *outputs* does not provide information useful to predict future outputs. Along those lines, it may be important to securely delete the seed from memory as soon as possible; there is some way of doing so in C (it’s used in the ChaCha20 reference implementation) but I don’t believe any way of doing so in Swift.

Is CSPRNG vs PRNG really an important distinction for the API? It is an important distinction when choosing a source, of course, but should that be reflected in our protocols? We should definitely make sure that our API does not prevent secure/CSPRNGs from working (e.g. not requiring a seed in the base API).

Reproducible, on the other hand, is a completely different use-case, which requires a different API because it is used differently. We also need to make sure our protocol design does not preclude or make difficult these uses. (We just want to avoid people using them while thinking they are secure).

We should default to something reasonably secure, but also fast. People who are doing cryptography should be using a cryptography package… but that package should be able to plug seamlessly into whatever API we have created.

I am leaning towards a design with an abstract base protocol, and then sub-protocols for Reproducibility and Secureness.
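
A minimal sketch of what such a hierarchy could look like follows. The protocol names echo the ones used in this message, but the requirements are guesses made for illustration and are not taken from any posted code. The linear congruential generator is only a compact stand-in that makes the example runnable, and it is of course not secure.

```swift
// Abstract base: no seed requirement, so secure sources can conform.
protocol RandomSource {
    mutating func next() -> UInt64
}

// Reproducible sources must be seedable AND expose their state, so that a
// previous state (not just the starting seed) can be restored.
protocol ReproducibleRandomSource: RandomSource {
    associatedtype State
    init(seed: UInt64)
    var state: State { get set }
}

// Marker protocol: conformance is a semantic promise, not extra methods.
protocol SecureRandomSource: RandomSource {}

// Stand-in reproducible source: a 64-bit LCG (Knuth's MMIX constants).
struct LinearCongruential: ReproducibleRandomSource {
    var state: UInt64

    init(seed: UInt64) {
        state = seed
    }

    mutating func next() -> UInt64 {
        state = state &* 6364136223846793005 &+ 1442695040888963407
        return state
    }
}

var generator = LinearCongruential(seed: 7)
_ = generator.next()
let checkpoint = generator.state   // save a mid-stream state
let expected = generator.next()
generator.state = checkpoint       // rewind to the checkpoint
assert(generator.next() == expected)
```

Generic code that only needs random bits would constrain on `RandomSource`, while replay or testing tools would constrain on `ReproducibleRandomSource`, so the two uses cannot be confused by accident.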

* On the issue of consuming entropy: a glaring underlying inconvenience in the API needs to be reckoned with. Sometimes, there simply isn’t enough entropy to generate another random number. If cryptographic security were not default, then it might be OK to fall back to some other method that produces a low-quality result. However, if we are to do the secure thing, we must decide whether the lack of entropy results in a call to a random method to (a) return nil; (b) throw; (c) fatalError; or (d) block. There is no way to paper over this problem; not enough entropy means you can’t get a random number when you want it. The debate over blocking versus non-blocking error, for example, held up the addition of getrandom() to Glibc for some time. In my proposed design, initializing a PRNG from the system’s secure stream of random bytes is failable; therefore, a user can choose how to handle the lack of entropy. However, it is desirable to have a thread-local CSPRNG that is used for calls, say, to Int.random(). It would be unfortunate if Int.random() itself was failable; however, that leads to an uncomfortable question: if there is insufficient entropy, should Int.random() block or fatalError? That seems pretty terrible too. However, one cannot simply write this off as an edge case: if this is to be a robust part of the standard library, it must do the “right” thing. Particularly if Swift is to be a true systems programming language and it must accommodate the case when a system is first booted and there is very little entropy.

Agreed. I have an API (for non-cryptographic use) that has two methods of getting unique random objects. One is failable, and returns nil if the object can’t be shown to be unique. The other tries to get a unique object, but will give repeats if necessary… it always returns a thing.

We could do something similar with entropy. We could have either an optional or throwing function which returns our random number. A sub-protocol could define a version (with a different name) that is non-optional which will always return an answer, even at the risk of being less secure.
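
Roughly, that could be sketched as below. The protocol and method names are invented for this example, and the "entropy pool" is just an array so the behaviour can be shown without touching the operating system.

```swift
// Base protocol: the strict version may fail when entropy runs out.
protocol FallibleRandomSource {
    mutating func next() -> UInt64?
}

// Sub-protocol: a differently named method that always answers, even if
// the answer is of lower quality.
protocol BestEffortRandomSource: FallibleRandomSource {
    mutating func nextOrFallback() -> UInt64
}

struct DemoSource: BestEffortRandomSource {
    // Pretend entropy pool; an empty array means "exhausted".
    var pool: [UInt64]
    // State for the weak fallback (a plain LCG, NOT secure).
    var weakState: UInt64

    init(pool: [UInt64]) {
        self.pool = pool
        self.weakState = 1
    }

    mutating func next() -> UInt64? {
        return pool.popLast()
    }

    mutating func nextOrFallback() -> UInt64 {
        if let strong = next() {
            return strong
        }
        weakState = weakState &* 6364136223846793005 &+ 1442695040888963407
        return weakState
    }
}

var source = DemoSource(pool: [11])
assert(source.next() == 11)      // pool had one value
assert(source.next() == nil)     // strict API reports exhaustion
_ = source.nextOrFallback()      // lenient API still returns something
```

Because the lenient method has its own name, a reader of the calling code can see at a glance that reduced quality was accepted on purpose.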

Thanks,
Jon

···

On Sep 26, 2017, at 7:31 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

- I’d love to see several of the most common random kinds supported, and I agree it would be nice (but not required IMO) for the default to be cryptographically secure.

I would be very careful about choosing a "simple" solution. There is a long, sad history of languages trying to provide a "simple" random number generator and accidentally providing a powerful footgun instead. But:

- We should avoid the temptation to nuke this mosquito with a heavy handed solution designed to solve all of the world’s problems: For example, the C++ random number stuff is crazily over-general. The stdlib should aim to solve (e.g.) the top 3 most common cases, and let a more specialized external library solve the fully general problem (e.g. seed management, every distribution imaginable, etc).

That's not to say we need to have seven engines and twenty distributions like C++ does. The standard library is not a statistics package; it exists to provide basic abstractions and fundamental functionality. I don't think it should worry itself with distributions at all. I think it needs to provide:

  1. The abstraction used to plug in different random number generators (i.e. an RNG protocol of some kind).

  2. APIs on existing standard library types which perform basic randomness-related functions correctly—essentially, encapsulating Knuth. (Specifically, I think selecting a random element from a collection (which also covers generating a random integer in a range), shuffling a mutable collection, and generating a random float will do the trick.)

  3. A default RNG with a conservative design that will sometimes be too slow, but will never be insufficiently random.

If you want to pick elements with a Poisson distribution, go get a statistics framework; if you want repeatable random numbers for testing, use a seedable PRNG from XCTest or some other test tools package. These can leverage the standard library's RNG protocol to work with existing random number generators or random number consumers.
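
The three pieces listed above can be sketched in a few dozen lines. This is only an illustration under stated assumptions: the `RNG` protocol, `next(below:)`, `pick(using:)`, and `knuthShuffle(using:)` are names invented here, and SplitMix64 stands in for the "conservative default" purely to keep the example self-contained (it is not a secure generator).

```swift
// (1) The abstraction: a hypothetical generator protocol.
protocol RNG {
    mutating func next() -> UInt64
}

extension RNG {
    // Uniform integer in 0..<upperBound. Rejection sampling avoids the
    // modulo bias of a plain `next() % upperBound`.
    mutating func next(below upperBound: UInt64) -> UInt64 {
        precondition(upperBound > 0, "upperBound must be positive")
        // 2^64 mod upperBound: raw values below this are rejected so the
        // accepted range is an exact multiple of upperBound.
        let threshold = (0 &- upperBound) % upperBound
        while true {
            let raw = next()
            if raw >= threshold { return raw % upperBound }
        }
    }

    // Uniform Double in [0, 1), built from the top 53 bits of one output.
    mutating func nextUnitDouble() -> Double {
        return Double(next() >> 11) / 9007199254740992.0  // 2^53
    }
}

// (2) Basic operations on existing types.
extension Collection {
    // Random element; nil for an empty collection.
    func pick<G: RNG>(using rng: inout G) -> Element? {
        guard !isEmpty else { return nil }
        let offset = Int(rng.next(below: UInt64(count)))
        return self[index(startIndex, offsetBy: offset)]
    }
}

extension Array {
    // Fisher-Yates (Knuth) shuffle, in place.
    mutating func knuthShuffle<G: RNG>(using rng: inout G) {
        guard count > 1 else { return }
        for i in stride(from: count - 1, to: 0, by: -1) {
            let j = Int(rng.next(below: UInt64(i + 1)))
            swapAt(i, j)
        }
    }
}

// (3) Stand-in generator so the sketch runs; NOT a secure default.
struct SplitMix64: RNG {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

var rng = SplitMix64(seed: 1)
var deck = Array(1...10)
deck.knuthShuffle(using: &rng)
let card = deck.pick(using: &rng)
let unit = rng.nextUnitDouble()
```

Everything else (distributions, seed management, test doubles) can live outside the standard library and plug in by conforming to the one protocol.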

+1 to this general plan!

This pretty much exactly matches my preferences.

If random numbers go into the std lib, it should be possible to customize the source of randomness for speed or test reproducibility, but the API should default to something sensible without the user having to know it’s configurable. On Darwin that default should be based on arc4random(3). The std lib doesn’t need to provide other non-default random sources. Non-random sources for testing should be part of test frameworks and plug in easily.

The proposal should include shuffle and random element from collection, which are much-requested and not really the controversial part so won't hold up the overall progress of the proposal.

(and no need for distributions other than uniform IMO, :fr::fish: or otherwise)

···

On Sep 30, 2017, at 3:23 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:

On Sep 11, 2017, at 9:43 PM, Brent Royal-Gordon <brent@architechies.com> wrote:

On Sep 9, 2017, at 10:31 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:

-Chris

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

My comments are directed to the "more up-to-date" document that you just
linked to in your reply to Jon. Is that one outdated? If so, can you send a
link to the updated proposal and implementation for which you're soliciting
feedback?

···

On Sun, Nov 5, 2017 at 6:12 PM, Alejandro Alonso <aalonso128@outlook.com> wrote:

The proposal and implementation have the current updated API. The link I
sent Jon was the one I brought up a few weeks ago which is outdated now.
The proposal answers all of your questions. As for `.random` being a
function, some would argue that it behaves in the same way as `.first` and
`.last` which are properties.

- Alejandro

On Nov 5, 2017, 6:07 PM -0600, Xiaodi Wu <xiaodi.wu@gmail.com>, wrote:

A few quick thoughts:

I know that there's been some discussion that `(1...10).random` is the
best spelling, but I'd like to push back on that suggestion. When I want a
random number, I tend to think of the type I want first ("I want a random
integer") and then a range ("I want a random integer between a and b"), not
the other way around. My intuition is that `Int.random(in:)` will be more
discoverable, both on that basis and because it is more similar to other
languages' syntax (`Math.random` in JavaScript and `randint` in NumPy, for
example). It also has the advantage that the type is explicit, which I
think is particularly useful in this case because the value itself is,
well, random.

I would also argue that, `random` is most appropriately a method and not a
property; there's no hard and fast rule for this, but the fact that the
result is stochastic suggests (to me) that it's not a "property" of the
range (or, for that matter, of the type).

I would reiterate here my qualms about `Source` being the term used for a
generator. These types are not a _source_ of entropy but rather a
_consumer_ of entropy.

`UnsafeRandomSource` needs to be renamed; "unsafe" has a specific meaning
in Swift--that is, memory safety, and this is not it. Moreover, it's
questionable whether this protocol is useful in any sense. What useful
generic algorithms can one write with such a protocol?

`XoroshiroRandom` cannot be seeded by any `Numeric` value; depending on
the specific algorithm it needs a seed of a specific bit width. If you
default the shared instance to being seeded with an `Int` then you will
have to have distinct implementations for 32-bit and 64-bit platforms. This
is unadvisable. On that note, your `UnsafeRandomSource` needs to have an
associated type and not a generic `<T : Numeric>` for the seed.
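
To show what I mean by an associated type for the seed, here is a minimal sketch. The protocol name `SeedableGenerator` is hypothetical, and I am assuming that `XoroshiroRandom` wraps something like xoroshiro128+, whose state is exactly 128 bits; the rotation and shift constants (24, 16, 37) are the ones from the 2018 revision of that algorithm as I recall them.

```swift
// Hypothetical protocol: each generator declares the exact seed it needs.
protocol SeedableGenerator {
    associatedtype Seed
    init(seed: Seed)
    mutating func next() -> UInt64
}

// Rotate left by k bits (0 < k < 64).
func rotl(_ x: UInt64, _ k: UInt64) -> UInt64 {
    return (x << k) | (x >> (64 - k))
}

// xoroshiro128+ needs 128 bits of state, on every platform.
struct Xoroshiro128Plus: SeedableGenerator {
    typealias Seed = (UInt64, UInt64)

    private var s: (UInt64, UInt64)

    init(seed: Seed) {
        // The all-zero state is a fixed point and must be rejected.
        precondition(seed != (0, 0), "seed must not be all zeros")
        s = seed
    }

    mutating func next() -> UInt64 {
        let (s0, s1) = s
        let result = s0 &+ s1
        let t = s1 ^ s0
        s.0 = rotl(s0, 24) ^ t ^ (t << 16)
        s.1 = rotl(t, 37)
        return result
    }
}

var xoroshiro = Xoroshiro128Plus(seed: (1, 2))
let firstOutput = xoroshiro.next()   // s0 &+ s1 of the initial state
```

With this shape, the seed is a fixed pair of `UInt64` values rather than "any `Numeric`", so there is one implementation for both 32-bit and 64-bit platforms and the compiler rejects a seed of the wrong width.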

The default random number generator should be cryptographically secure;
however, it's not clear to me that it should be device random.

I agree with others that alternative random number generators other than
the default RNG (and, if not default, possibly also the device RNG) should
be accommodated by the protocol hierarchy but not necessarily supplied in
the stdlib.

The term `Randomizable` means something specific which is not how it's
used in your proposed protocol.

There's still the open question, not answered, about how requesting an
instance of the hardware RNG behaves when there's insufficient or no
entropy. Does it return nil, throw, trap, or wait? The proposed API does
not clarify this point, although based on the method signature it cannot
return nil or throw. Trapping might be acceptable but I'd be interested to
hear your take as to why it is preferable.

On Sun, Nov 5, 2017 at 4:43 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:

For the proof of concept, I had accidentally deleted that one. I have a
more up to date one which was discussed a few weeks later.
Swift Random Unification Design · GitHub

- Alejandro

On Nov 5, 2017, 4:37 PM -0600, Jonathan Hull <jhull@gbis.com>, wrote:

Is there a link to the writeup? The one in the quote 404s.

Thanks,
Jon

On Nov 5, 2017, at 2:10 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:

Hello once again Swift evolution community. I have taken the time to
write up the proposal for this thread, and have provided an implementation
for it as well. I hope to once again get good feedback on the overall
proposal.

- Alejandro

On Sep 8, 2017, 11:52 AM -0500, Alejandro Alonso via swift-evolution <swift-evolution@swift.org>, wrote:

Hello swift evolution, I would like to propose a unified approach to
`random()` in Swift. I have a simple implementation here
https://gist.github.com/Azoy/5d294148c8b97d20b96ee64f434bb4f5. This
implementation is a simple wrapper over existing random functions so
existing code bases will not be affected. Also, this approach introduces a
new random feature for Linux users that gives them access to upper bounds,
as well as a lower bound for both Glibc and Darwin users. This change would
be implemented within Foundation.

I believe this simple change could have a very positive impact on new
developers learning Swift and experienced developers being able to write
single random declarations.

I’d like to hear about your ideas on this proposal, or any implementation
changes if need be.

- Alejandro

[Proposal] Random Unification by Azoy · Pull Request #760 · apple/swift-evolution · GitHub is the current API and proposed solution.

- Alejandro

···

On Nov 5, 2017, 6:18 PM -0600, Xiaodi Wu <xiaodi.wu@gmail.com>, wrote:
My comments are directed to the "more up-to-date" document that you just linked to in your reply to Jon. Is that one outdated? If so, can you send a link to the updated proposal and implementation for which you're soliciting feedback?

On Sun, Nov 5, 2017 at 6:12 PM, Alejandro Alonso <aalonso128@outlook.com> wrote:
The proposal and implementation have the current updated API. The link I sent Jon was the one I brought up a few weeks ago which is outdated now. The proposal answers all of your questions. As for `.random` being a function, some would argue that it behaves in the same way as `.first` and `.last` which are properties.

- Alejandro

On Nov 5, 2017, 6:07 PM -0600, Xiaodi Wu <xiaodi.wu@gmail.com>, wrote:
A few quick thoughts:

I know that there's been some discussion that `(1...10).random` is the best spelling, but I'd like to push back on that suggestion. When I want a random number, I tend to think of the type I want first ("I want a random integer") and then a range ("I want a random integer between a and b"), not the other way around. My intuition is that `Int.random(in:)` will be more discoverable, both on that basis and because it is more similar to other languages' syntax (`Math.random` in JavaScript and `randint` in NumPy, for example). It also has the advantage that the type is explicit, which I think is particularly useful in this case because the value itself is, well, random.

I would also argue that, `random` is most appropriately a method and not a property; there's no hard and fast rule for this, but the fact that the result is stochastic suggests (to me) that it's not a "property" of the range (or, for that matter, of the type).

I would reiterate here my qualms about `Source` being the term used for a generator. These types are not a _source_ of entropy but rather a _consumer_ of entropy.

`UnsafeRandomSource` needs to be renamed; "unsafe" has a specific meaning in Swift--that is, memory safety, and this is not it. Moreover, it's questionable whether this protocol is useful in any sense. What useful generic algorithms can one write with such a protocol?

`XoroshiroRandom` cannot be seeded by any `Numeric` value; depending on the specific algorithm it needs a seed of a specific bit width. If you default the shared instance to being seeded with an `Int` then you will have to have distinct implementations for 32-bit and 64-bit platforms. This is unadvisable. On that note, your `UnsafeRandomSource` needs to have an associated type and not a generic `<T : Numeric>` for the seed.

The default random number generator should be cryptographically secure; however, it's not clear to me that it should be device random.

I agree with others that alternative random number generators other than the default RNG (and, if not default, possibly also the device RNG) should be accommodated by the protocol hierarchy but not necessarily supplied in the stdlib.

The term `Randomizable` means something specific which is not how it's used in your proposed protocol.

There's still the open question, not answered, about how requesting an instance of the hardware RNG behaves when there's insufficient or no entropy. Does it return nil, throw, trap, or wait? The proposed API does not clarify this point, although based on the method signature it cannot return nil or throw. Trapping might be acceptable but I'd be interested to hear your take as to why it is preferable.

On Sun, Nov 5, 2017 at 4:43 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:
For the proof of concept, I had accidentally deleted that one. I have a more up to date one which was discussed a few weeks later. Swift Random Unification Design · GitHub

- Alejandro

On Nov 5, 2017, 4:37 PM -0600, Jonathan Hull <jhull@gbis.com>, wrote:
Is there a link to the writeup? The one in the quote 404s.

Thanks,
Jon

On Nov 5, 2017, at 2:10 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:

Hello once again Swift evolution community. I have taken the time to write up the proposal for this thread, and have provided an implementation for it as well. I hope to once again get good feedback on the overall proposal.

- Alejandro

On Sep 8, 2017, 11:52 AM -0500, Alejandro Alonso via swift-evolution <swift-evolution@swift.org>, wrote:
Hello swift evolution, I would like to propose a unified approach to `random()` in Swift. I have a simple implementation here https://gist.github.com/Azoy/5d294148c8b97d20b96ee64f434bb4f5. This implementation is a simple wrapper over existing random functions so existing code bases will not be affected. Also, this approach introduces a new random feature for Linux users that gives them access to upper bounds, as well as a lower bound for both Glibc and Darwin users. This change would be implemented within Foundation.

I believe this simple change could have a very positive impact on new developers learning Swift and experienced developers being able to write single random declarations.

I’d like to hear about your ideas on this proposal, or any implementation changes if need be.

- Alejandro

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

It's possible to use a CSPRNG-grade algorithm and seed it once to get a reproducible sequence, but when you use it as a CSPRNG, you typically feed entropy back into it at nondeterministic points to ensure that even if you started with a bad seed, you'll eventually get to an alright state. Unless you keep track of when entropy was mixed in and what the values were, you'll never get a reproducible CSPRNG.
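A minimal sketch of that point: a generator seeded once is reproducible, and it stops being reproducible as soon as entropy is folded in at a point nobody recorded. `ToyGenerator` and its `stir` method are hypothetical, and the mixing function is SplitMix64, which is not cryptographically secure; it only stands in for "a deterministic algorithm with a seed".

```swift
// Toy generator (SplitMix64). NOT cryptographically secure; illustration only.
struct ToyGenerator {
    private var state: UInt64

    init(seed: UInt64) { state = seed }

    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }

    // Hypothetical "stir": fold externally gathered entropy into the state.
    mutating func stir(_ entropy: UInt64) {
        state ^= entropy
    }
}

// Draw `count` values from the generator.
func draw(_ count: Int, from generator: inout ToyGenerator) -> [UInt64] {
    var values: [UInt64] = []
    for _ in 0..<count { values.append(generator.next()) }
    return values
}

var a = ToyGenerator(seed: 42)
var b = ToyGenerator(seed: 42)
let sameStart = draw(3, from: &a) == draw(3, from: &b)      // same seed, same sequence

b.stir(0xDEADBEEF)                                           // unrecorded entropy
let sameAfterStir = draw(3, from: &a) == draw(3, from: &b)  // sequences now differ
```

Reproducing `b` after the stir would require knowing both when the stir happened and what value was mixed in.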

We would give developers a false sense of security if we provided them with CSPRNG-grade algorithms that we called CSPRNGs and that they could seed themselves. Just because it says "crypto-secure" in the name doesn't mean that it'll be crypto-secure if it's seeded with time(). Therefore, "reproducible" vs "non-reproducible" looks like a good distinction to me.

I disagree here, in two respects:

First, whether or not a particular PRNG is cryptographically secure is an intrinsic property of the algorithm; whether it's "reproducible" or not is determined by the published API. In other words, the distinction between CSPRNG vs. non-CSPRNG is important to document because it's semantics that cannot be deduced by the user otherwise, and it is an important one for writing secure code because it tells you whether an attacker can predict future outputs based only on observing past outputs. "Reproducible" in the sense of seedable or not is trivially noted by inspection of the published API, and it is rather immaterial to writing secure code.

Cryptographically secure is not a property that I'm comfortable applying to an algorithm. You cannot say that you've made a cryptographically secure thing just because you've used all the right algorithms: you also have to use them right, and one of the most critical components of a cryptographically secure PRNG is its seed. It is a *feature* of a lot of modern CSPRNGs that you can't seed them:

- You cannot seed or add entropy to std::random_device
- You cannot seed or add entropy to CryptGenRandom
- You can only add entropy to /dev/(u)random
- You can only add entropy to BSD's arc4random

Just because we can expose a seed interface doesn't mean we should, and in this case I believe that it would go against the prime objective of providing secure random numbers.

If your attacker can observe your seeding once, chances are that they can observe your reseeding too; then, they can use their own implementation of the PRNG (whether CSPRNG or non-CSPRNG) and reproduce your pseudorandom sequence whether or not Swift exposes any particular API.

On Linux, the random devices are initially seeded with machine-specific but rather invariant data that makes /dev/urandom spit out predictable numbers. It is considered "seeded" after a root process writes POOL_SIZE bytes to it. On most implementations, this initial seed is stored on disk: when the computer shuts down, it reads POOL_SIZE bytes from /dev/urandom and saves them in a file, and the contents of that file are loaded back into /dev/urandom when the computer starts. A scenario where someone can read that file is certainly not less likely than a scenario where /dev/urandom was deleted. That doesn't mean that they have kernel code execution or that they can pry into your process, but they have a good shot at guessing your seed and subsequent RNG results if no stirring happens.

Secondly, I see no reason to justify the notion that, simply because a PRNG is cryptographically secure, we ought to hide the seeding initializer (because one has to exist internally anyway) from the public. Obviously, one use case for a deterministic PRNG is to get reproducible sequences of random-appearing values; this can be useful whether the underlying algorithm is cryptographically secure or not. There are innumerably many ways to use data generated from a CSPRNG in non-cryptographically secure ways and omitting or including a public seeding initializer does not change that; in other words, using a deterministic seed for a CSPRNG would be a bad idea in certain applications, but it's a deliberate act, and someone who would mistakenly do that is clearly incapable of *using* the output from the PRNG in a secure way either; put a third way, you would be hard pressed to find a situation where it's true that "if only Swift had not made the seeding initializer public, this author would have written secure code, but instead the only security hole that existed in the code was caused by the availability of a public seeding initializer mistakenly used." The point of having both explicitly instantiable PRNGs and a layer of simpler APIs like "Int.random()" is so that the less experienced user can get the "right thing" by default, and the experienced user can customize the behavior; any user that instantiates his or her own ChaCha20Random instance is already calling for the power user interface; it is reasonable to expose the underlying primitive operations (such as seeding) so long as there are legitimate uses for it.

Nothing prevents us from using the same algorithm for a CSPRNG that is safely pre-seeded and a PRNG that people seed themselves, mind you. However, especially when it comes to security, there is a strong responsibility to drive developers into a pit of success: the most obvious thing to do has to be the right one, and suggesting to cryptographically-unaware developers that they have everything they need to manage their own seed is not a step in that direction.

I'm not opposed to a ChaCha20Random type; I'm opposed to explicitly calling it cryptographically-secure, because it is not unless you know what to do with it. It is emphatically not far-fetched to imagine a developer who thinks that they can outdo the standard library by using their own ChaCha20Random instance after it's been seeded with time() if we let them know that it's "cryptographically secure". If you're a power user and you don't like the default, known-good CSPRNG, then you're hopefully good enough to know that ChaCha20 is considered a cryptographically-secure algorithm without help labels from the language, and you know how to operate it.

I'm fully aware of the myths surrounding /dev/urandom and /dev/random. /dev/urandom might never run out, but it is also possible for it not to be initialized at all, as in the case of some VM setups. In some older versions of iOS, /dev/[u]random is reportedly sandboxed out. On systems where it is available, it can also be deleted, since it is a file. The point is, all of these scenarios cause an error during seeding of a CSPRNG. The question is how to proceed when entropy cannot be accessed. We must do something, because in that case we cannot return a cryptographically secure answer. Rare trapping on invocation of Int.random() or permanently waiting for a never-to-be-initialized /dev/urandom would be terrible to debug, but returning an optional or throwing all the time would be verbose. How to design this API?

If the only concern is that the system might not be initialized enough, I'd say that whatever returns an instance of a global, framework-seeded CSPRNG should return an Optional, and the random methods that use the global CSPRNG can trap and scream that the system is not initialized enough. If this is a likely error for you, you can check if the CSPRNG exists or not before jumping.
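A rough sketch of that split, under stated assumptions: `DeviceSeededGenerator` and `randomWordOrTrap` are hypothetical names, the device path is a parameter only so the failure case can be exercised, and the output function is SplitMix64 as a placeholder rather than a real CSPRNG. The failable initializer plays the role of the Optional-returning accessor; the convenience function is the one that traps.

```swift
import Foundation

// Hypothetical generator seeded from a random device. The initializer fails
// (returns nil) instead of trapping when the device cannot be opened or read.
struct DeviceSeededGenerator {
    private var state: UInt64

    init?(devicePath: String = "/dev/urandom") {
        guard let handle = FileHandle(forReadingAtPath: devicePath) else { return nil }
        defer { handle.closeFile() }
        let bytes = handle.readData(ofLength: 8)
        guard bytes.count == 8 else { return nil }
        var seed: UInt64 = 0
        for byte in bytes { seed = (seed << 8) | UInt64(byte) }
        state = seed
    }

    // Placeholder output function (SplitMix64); a real design would use a CSPRNG.
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Convenience that traps loudly when no entropy is available.
func randomWordOrTrap(devicePath: String = "/dev/urandom") -> UInt64 {
    guard var generator = DeviceSeededGenerator(devicePath: devicePath) else {
        fatalError("system random source unavailable at \(devicePath)")
    }
    return generator.next()
}

// A caller for whom this failure is likely can check before jumping.
if var generator = DeviceSeededGenerator() {
    print(generator.next())
} else {
    print("no system entropy; fall back or report an error")
}
```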

Also note that there is only one system for which Swift is officially distributed (Ubuntu 14.04) on which the only way to get entropy from the OS is to open a random device and read from it.

* What should the default CSPRNG be? There are good arguments for using a cryptographically secure device random. (In my proposed implementation, for device random, I use Security.framework on Apple platforms (because /dev/urandom is not guaranteed to be available due to the sandbox, IIUC). On Linux platforms, I would prefer to use getrandom() and avoid using file system APIs, but getrandom() is new and unsupported on some versions of Ubuntu that Swift supports. This is an issue in and of itself.) Now, a number of these facilities strictly limit or do not guarantee availability of more than a small number of random bytes at a time; they are recommended for seeding other PRNGs but *not* as a routine source of random numbers. Therefore, although device random should be available to users, it probably shouldn’t be the default for the Swift standard library as it could have negative consequences for the system as a whole. There follows the significant task of implementing a CSPRNG correctly and securely for the default PRNG.

Theo gave a talk a few years ago <https://www.youtube.com/watch?v=aWmLWx8ut20> on randomness and how these problems are approached in LibreSSL.

Certainly, we can learn a lot from those like Theo who've dealt with the issue. I'm not in a position to watch the talk at the moment; can you summarize what the tl;dr version of it is?

I saw it three years ago, so I don't remember all the details. The gist is that:

- OpenBSD's random is available from extremely early in the boot process with reasonable entropy
- LibreSSL includes OpenBSD's arc4random, and it's a "good" PRNG (which doesn't actually use ARC4)
- That implementation of arc4random is good because it is fool-proof and it has basically no failure mode
- Stirring is good, having multiple components take random numbers from the same source probably makes results harder to guess too
- Getrandom/getentropy is in all ways better than reading from random devices

Félix

On Sep 26, 2017, at 4:14 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Tue, Sep 26, 2017 at 11:26 AM, Félix Cloutier <felixcloutier@icloud.com> wrote:

I think this is a good idea. It does make me ask what our default generator for Linux will be if we use Darwin’s arc4random(3). Do we use Glibc’s random()? If so, what do we seed it with?

- Alejandro

On Oct 4, 2017, 6:26 PM -0500, Ben Cohen via swift-evolution <swift-evolution@swift.org>, wrote:

On Sep 30, 2017, at 3:23 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:

On Sep 11, 2017, at 9:43 PM, Brent Royal-Gordon <brent@architechies.com> wrote:

On Sep 9, 2017, at 10:31 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:

- I’d love to see several of the most common random kinds supported, and I agree it would be nice (but not required IMO) for the default to be cryptographically secure.

I would be very careful about choosing a "simple" solution. There is a long, sad history of languages trying to provide a "simple" random number generator and accidentally providing a powerful footgun instead. But:

- We should avoid the temptation to nuke this mosquito with a heavy handed solution designed to solve all of the world’s problems: For example, the C++ random number stuff is crazily over-general. The stdlib should aim to solve (e.g.) the top 3 most common cases, and let a more specialized external library solve the fully general problem (e.g. seed management, every distribution imaginable, etc).

That's not to say we need to have seven engines and twenty distributions like C++ does. The standard library is not a statistics package; it exists to provide basic abstractions and fundamental functionality. I don't think it should worry itself with distributions at all. I think it needs to provide:

1. The abstraction used to plug in different random number generators (i.e. an RNG protocol of some kind).

2. APIs on existing standard library types which perform basic randomness-related functions correctly—essentially, encapsulating Knuth. (Specifically, I think selecting a random element from a collection (which also covers generating a random integer in a range), shuffling a mutable collection, and generating a random float will do the trick.)

3. A default RNG with a conservative design that will sometimes be too slow, but will never be insufficiently random.

If you want to pick elements with a Poisson distribution, go get a statistics framework; if you want repeatable random numbers for testing, use a seedable PRNG from XCTest or some other test tools package. These can leverage the standard library's RNG protocol to work with existing random number generators or random number consumers.
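The three pieces above can be sketched as follows. All names here (`SketchGenerator`, `sketchElement`, `sketchShuffle`, `SeededSplitMix64`) are hypothetical; the seeded generator stands in for the kind of repeatable PRNG a test package might supply and is not a proposal for the default. The range reduction uses rejection sampling to avoid modulo bias, and the shuffle is Fisher-Yates.

```swift
// 1. The abstraction that generators plug into (hypothetical name).
protocol SketchGenerator {
    mutating func next() -> UInt64
}

extension SketchGenerator {
    // Uniform value in 0..<upperBound, rejecting words that would bias `%`.
    mutating func next(below upperBound: UInt64) -> UInt64 {
        precondition(upperBound > 0, "upperBound must be positive")
        let threshold = (0 &- upperBound) % upperBound   // 2^64 mod upperBound
        while true {
            let word = next()
            if word >= threshold { return word % upperBound }
        }
    }

    // Uniform Double in 0..<1 built from the top 53 bits of a word.
    mutating func nextUnitDouble() -> Double {
        return Double(next() >> 11) * 0x1p-53
    }
}

// 2. Basic operations on existing types, written against the protocol.
extension Collection {
    func sketchElement<G: SketchGenerator>(using generator: inout G) -> Element? {
        guard !isEmpty else { return nil }
        let offset = Int(generator.next(below: UInt64(count)))
        return self[index(startIndex, offsetBy: offset)]
    }
}

extension MutableCollection where Self: RandomAccessCollection {
    // Fisher-Yates: swap each position with a uniformly chosen later one.
    mutating func sketchShuffle<G: SketchGenerator>(using generator: inout G) {
        var remaining = count
        var current = startIndex
        while remaining > 1 {
            let offset = Int(generator.next(below: UInt64(remaining)))
            swapAt(current, index(current, offsetBy: offset))
            formIndex(after: &current)
            remaining -= 1
        }
    }
}

// A repeatable generator such as a test package might provide (SplitMix64).
struct SeededSplitMix64: SketchGenerator {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

var generator = SeededSplitMix64(seed: 7)
var deck = Array(1...10)
deck.sketchShuffle(using: &generator)
let pick = deck.sketchElement(using: &generator)
```

A conservative default generator (item 3) would be one more conforming type; nothing in the extensions needs to know which generator they are handed.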

+1 to this general plan!

This pretty much exactly matches my preferences.

If random numbers go into the std lib, it should be possible to customize the source of randomness for speed or test reproducibility, but the default should be something sensible without the user having to know it’s configurable. On Darwin that default should be based on arc4random(3). The std lib doesn’t need to provide other non-default random sources. Non-random sources for testing should be part of test frameworks and plug in easily.

The proposal should include shuffle and random element from collection, which are much-requested and not really the controversial part so won't hold up the overall progress of the proposal.

(and no need for distributions other than uniform IMO, 🇫🇷🐟 or otherwise)

-Chris

I like this particular version. In particular, the choice of algorithms is,
afaict, correct and that is incredibly important. I had overlooked that
`arc4random` is cryptographically secure past a certain version of macOS,
but you are absolutely right. I am also on board with the fatal error
suggestion if random entropy is unavailable; I think it must be amply
documented, though.

I do think, however, that you're overloading too many things into the word
"random" when they're not the same. Take a look at Python, which is pretty
widely used for numerics. There's `rand` and `random` for getting a random
integer or floating-point value, and there's `choice` and `sample` for
choosing one or more values out of a collection without replacement. These
are sufficiently different tasks and don't all need to be called "random"
or satisfy the same requirement of the same protocol. Put another way, it's
absolutely *not* inconsistent for numeric types to have `random()` while
collection types have a differently named method.

By contrast, I think the great length of text trying to justify naming all
of these facilities `random` in order to parallel `first` and `last` shows
how the proposed design is comparatively weaker. You have to argue that (a)
`Int.random` shouldn't return an optional value because it'd be unwieldy,
and therefore `(0..<5).random` shouldn't either because it would then be
inconsistent; but (b) that `(0..<5).random` should be spelled and behave
like `(0..<5).first` and `(0..<5).last` even though the user must handle
empty collections totally differently because the return types are not the
same. Either `(0..<5).random` should behave analogously to `first` and
`last` or it should not. If it should, it only makes sense to return a
result of type `T?`. After all, if a collection doesn't have a `first`
item, then it can't have a `random` item. Put another way, having a `first`
item is a prerequisite to having a randomly selectable item. The behavior
of the Swift APIs would be very inconsistent if `first` returns `T?` but
`random` returns `T`. However, I agree that unwrapping `Int.random` every
time would be burdensome, and it would not make sense to have a type
support `random` but not have any instantiable values; therefore, returning
an optional value doesn't make sense, and it follows that `Int.random`
*shouldn't* behave like `first` or `last`.

Once you stop trying to make what Python calls `rand/randint` and
`choice/sample` have the same names, then finding a Swifty design for the
distinct facilities becomes much easier, and it suggests a pretty elegant
result (IMO):

    [1, 2, 3, 4].choice      // like `first` or `last`, this gets you a value of type Int?
    [1, 2, 3, 4].sampling(2) // like `prefix(2)` or `suffix(2)`, this gets you a subsequence with at most two elements

    Int.random   // this gets you a random Int; or it may trap
    Float.random // this gets you a random Float; or it may trap

With that, it also becomes clear why--and I agree with you--an independent
`Int.random(in: 0..<5)` is not necessary. `(0..<5).choice` is fine, and it
can now appropriately return a value of type `T?` because it no longer
needs to parallel `Int.random`.
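As a sketch of how `choice` and `sampling` might look, here is one possible shape. It is an illustration under assumptions, not the proposed API: it draws indices with `Int.random(in:)` from the current standard library (Swift 4.2 and later, which postdates this thread), and `sampling` returns an `Array` in arbitrary order rather than a true subsequence.

```swift
extension Collection {
    // Like `first` and `last`: nil when the collection is empty.
    var choice: Element? {
        guard !isEmpty else { return nil }
        let offset = Int.random(in: 0..<count)
        return self[index(startIndex, offsetBy: offset)]
    }

    // Like `prefix(_:)`: at most `maxCount` elements, drawn without replacement.
    func sampling(_ maxCount: Int) -> [Element] {
        precondition(maxCount >= 0, "maxCount must not be negative")
        var pool = Array(self)
        let take = Swift.min(maxCount, pool.count)
        // Partial Fisher-Yates: only the first `take` positions are settled.
        for position in 0..<take {
            let other = Int.random(in: position..<pool.count)
            pool.swapAt(position, other)
        }
        return Array(pool[..<take])
    }
}

let empty: [Int] = []
let nothing = empty.choice            // nil, just as `empty.first` is nil
let one = [1, 2, 3, 4].choice         // a value of type Int?
let two = [1, 2, 3, 4].sampling(2)    // two elements from distinct positions
let all = (0..<5).sampling(10)        // only five elements exist, so five come back
```

With this split, the caller handles the empty case for `choice` exactly as for `first`, while a type-level `Int.random` stays non-optional.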

* * *

More in the bikeshedding arena, I take issue with some of the names:

- I reiterate my comment that `Randomizable` is not the best name. There
are multiple dictionary definitions of "randomize" and one is "make
unpredictable, unsystematic, or random in order or arrangement." Wikipedia
gives at least five different contextual meanings for the word. What you're
doing here is specifically **random sampling** and we can do better to
clarify that, I think.

- While I agree that `RNG` can be cryptic, the alternative should be
`RandomNumberGenerator` (as it's called in other languages);
`RandomGenerator` is not quite accurate. Again, we're _consuming_
randomness to _generate_ numbers (or values of other type, based on the
result of a generated number). We're not _generating_ randomness.

On Sun, Nov 5, 2017 at 6:33 PM, Alejandro Alonso <aalonso128@outlook.com> wrote:

[Proposal] Random Unification by Azoy · Pull Request #760 · apple/swift-evolution · GitHub is the current API and
proposed solution.

- Alejandro

On Nov 5, 2017, 6:18 PM -0600, Xiaodi Wu <xiaodi.wu@gmail.com>, wrote:

My comments are directed to the "more up-to-date" document that you just
linked to in your reply to Jon. Is that one outdated? If so, can you send a
link to the updated proposal and implementation for which you're soliciting
feedback?

On Sun, Nov 5, 2017 at 6:12 PM, Alejandro Alonso <aalonso128@outlook.com> wrote:

The proposal and implementation have the current updated API. The link I
sent Jon was the one I brought up a few weeks ago which is outdated now.
The proposal answers all of your questions. As for `.random` being a
function, some would argue that it behaves in the same way as `.first` and
`.last` which are properties.

- Alejandro

Thanks for continuing to push this forward, Alejandro! I’m excited about the potential of having access to these APIs as part of the standard library. Here are a few comments on some different parts of the proposal:

1) For your RandomGenerator protocol, I’m not totally clear on the semantics of the next(_:) and next(_:upperBound:) methods. Do they both have zero as their lower bound, for example? I’m not sure it makes sense to have signed integers generated directly by an RNG—perhaps T: FixedWidthInteger & UnsignedInteger would be a more useful constraint. (Does it even need to be generic? What if RNGs just generate UInt32s?)

2) Can you say more about the purpose of the Randomizable protocol? How would we use that protocol in useful ways that we wouldn’t get from being able to select random values from ranges (half-open and closed) of FixedWidthInteger / BinaryFloatingPoint? My experience has been that a full-width random value is rarely what a user needs.

3) I agree with Xiaodi that Random should probably be a struct with a single shared instance, but I don’t think it should be internal. Hiding that shared RNG would make it hard for non-stdlib additions to have the same usage, as they would need to have completely separate implementations for the “default” and custom RNG versions.

4) I would also still suggest that the simplest version of random (that you use to get a value from a range or an element from a collection) should be a function, not a property. Collection properties like first, last, and count all represent facts that already exist about a collection, and don’t change unless the collection itself changes. Choosing a random element, on the other hand, is clearly going to be freshly performed on each call. In addition, with the notable exception of count, we try to ensure O(1) performance for properties, while random will be O(n) except in random-access collections. Finally, if it is a method, we can unify the two versions by providing a single method with the shared RNG as the default parameter.

5) To match the sorted() method, shuffled() should be on Sequence instead of Collection. I don’t think either shuffled() or shuffle() needs to be a protocol requirement, since there isn’t really any kind of customization necessary for different kinds of collections. Like the sorting algorithms, both could be regular extension methods.

6) I don’t know whether or not a consensus has formed around the correct spelling of the APIs for generating random values. From the proposal it looks like the preferred ways of getting a random value in a range would be to use the random property (or method) on a range or closed range:

    (0..<10).random // 7
    (0.0 ... 5.0).random // 4.112312

If that’s the goal, and we don’t want those values to be optional, we’ll need an implementation of random for floating-point ranges and an overload for fixed-width integer ranges. That said, I don’t think that style is as discoverable as having static methods or initializers available on the different types:

    Int.random(in: 0..<10)
    Double.random(in: 0.0 ... 5.0)
    // or maybe
    Int(randomIn: 0..<10)
    Double(randomIn: 0.0 ... 5.0)

(My only quibble with the initializer approach is that Bool would be awkward.)

In addition, this alternative approach could make creating random values more consistent with types that don’t work well in ranges:

    Data.random(bytes: 128)
    Color.random(r: 0...0, g: 0...1, b: 0...1, a: 1...1)
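To illustrate points 1, 4, and 6 together, here is a minimal sketch in which the generator only produces unsigned 64-bit words and the typed, ranged API lives on `Int` and `Double` as static methods. Every name (`WordGenerator`, `sketchRandom(in:using:)`, `SplitMix64`) is hypothetical, and the floating-point version is a simplification that scales a value from 0..<1, so it effectively never lands exactly on the upper bound of the closed range.

```swift
// Hypothetical protocol: the generator only ever yields full unsigned words.
protocol WordGenerator {
    mutating func next() -> UInt64
}

extension Int {
    // Uniform Int in a half-open range, using rejection to avoid modulo bias.
    static func sketchRandom<G: WordGenerator>(
        in range: Range<Int>, using generator: inout G
    ) -> Int {
        precondition(!range.isEmpty, "cannot pick from an empty range")
        // Width as an unsigned count; wrapping subtraction stays correct
        // even when the bounds straddle zero.
        let width = UInt64(UInt(bitPattern: range.upperBound &- range.lowerBound))
        let threshold = (0 &- width) % width   // 2^64 mod width
        var word = generator.next()
        while word < threshold { word = generator.next() }
        let offset = UInt(truncatingIfNeeded: word % width)
        return range.lowerBound &+ Int(bitPattern: offset)
    }
}

extension Double {
    // Simplified: scale a value from 0..<1 onto the requested range.
    static func sketchRandom<G: WordGenerator>(
        in range: ClosedRange<Double>, using generator: inout G
    ) -> Double {
        let unit = Double(generator.next() >> 11) * 0x1p-53
        return range.lowerBound + unit * (range.upperBound - range.lowerBound)
    }
}

// Small seeded generator for the usage lines below (not a default candidate).
struct SplitMix64: WordGenerator {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

var rng = SplitMix64(seed: 2017)
let roll = Int.sketchRandom(in: 0..<10, using: &rng)
let level = Double.sketchRandom(in: 0.0...5.0, using: &rng)
```

One design wrinkle for point 4: Swift does not allow default values on `inout` parameters, so a form that omits the generator would need either a separate overload or a generator with reference semantics.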

————

Thanks again!
Nate

On Nov 5, 2017, at 6:33 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:

[Proposal] Random Unification by Azoy Ā· Pull Request #760 Ā· apple/swift-evolution Ā· GitHub is the current API and proposed solution.

- Alejandro

On Nov 5, 2017, 6:18 PM -0600, Xiaodi Wu <xiaodi.wu@gmail.com>, wrote:

My comments are directed to the "more up-to-date" document that you just linked to in your reply to Jon. Is that one outdated? If so, can you send a link to the updated proposal and implementation for which you're soliciting feedback?

On Sun, Nov 5, 2017 at 6:12 PM, Alejandro Alonso <aalonso128@outlook.com <mailto:aalonso128@outlook.com>> wrote:
The proposal and implementation have the current updated API. The link I sent Jon was the one I brought up a few weeks ago which is outdated now. The proposal answers all of your questions. As for `.random` being a function, some would argue that it behaves in the same way as `.first` and `.last` which are properties.

- Alejandro

On Nov 5, 2017, 6:07 PM -0600, Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>>, wrote:

A few quick thoughts:

I know that there's been some discussion that `(1...10).random` is the best spelling, but I'd like to push back on that suggestion. When I want a random number, I tend to think of the type I want first ("I want a random integer") and then a range ("I want a random integer between a and b"), not the other way around. My intuition is that `Int.random(in:)` will be more discoverable, both on that basis and because it is more similar to other languages' syntax (`Math.random` in JavaScript and `randint` in NumPy, for example). It also has the advantage that the type is explicit, which I think is particularly useful in this case because the value itself is, well, random.

I would also argue that, `random` is most appropriately a method and not a property; there's no hard and fast rule for this, but the fact that the result is stochastic suggests (to me) that it's not a "property" of the range (or, for that matter, of the type).

I would reiterate here my qualms about `Source` being the term used for a generator. These types are not a _source_ of entropy but rather a _consumer_ of entropy.

`UnsafeRandomSource` needs to be renamed; "unsafe" has a specific meaning in Swift--that is, memory safety, and this is not it. Moreover, it's questionable whether this protocol is useful in any sense. What useful generic algorithms can one write with such a protocol?

`XoroshiroRandom` cannot be seeded by any `Numeric` value; depending on the specific algorithm it needs a seed of a specific bit width. If you default the shared instance to being seeded with an `Int` then you will have to have distinct implementations for 32-bit and 64-bit platforms. This is unadvisable. On that note, your `UnsafeRandomSource` needs to have an associated type and not a generic `<T : Numeric>` for the seed.

The default random number generator should be cryptographically secure; however, it's not clear to me that it should be device random.

I agree with others that alternative random number generators other than the default RNG (and, if not default, possibly also the device RNG) should be accommodated by the protocol hierarchy but not necessarily supplied in the stdlib.

The term `Randomizable` means something specific which is not how it's used in your proposed protocol.

There's still the open question, not answered, about how requesting an instance of the hardware RNG behaves when there's insufficient or no entropy. Does it return nil, throw, trap, or wait? The proposed API does not clarify this point, although based on the method signature it cannot return nil or throw. Trapping might be acceptable but I'd be interested to hear your take as to why it is preferable.

On Sun, Nov 5, 2017 at 4:43 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
For the proof of concept, I had accidentally deleted that one. I have a more up to date one which was discussed a few weeks later. Swift Random Unification Design Ā· GitHub

- Alejandro

On Nov 5, 2017, 4:37 PM -0600, Jonathan Hull <jhull@gbis.com <mailto:jhull@gbis.com>>, wrote:

Is there a link to the writeup? The one in the quote 404s.

Thanks,
Jon

On Nov 5, 2017, at 2:10 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:

Hello once again Swift evolution community. I have taken the time to write up the proposal for this thread, and have provided an implementation for it as well. I hope to once again get good feedback on the overall proposal.

- Alejandro

On Sep 8, 2017, 11:52 AM -0500, Alejandro Alonso via swift-evolution <swift-evolution@swift.org>, wrote:

Hello swift evolution, I would like to propose a unified approach to `random()` in Swift. I have a simple implementation here: https://gist.github.com/Azoy/5d294148c8b97d20b96ee64f434bb4f5. This implementation is a simple wrapper over existing random functions, so existing code bases will not be affected. This approach also introduces a new random feature for Linux users that gives them access to upper bounds, as well as a lower bound for both Glibc and Darwin users. This change would be implemented within Foundation.

I believe this simple change could have a very positive impact on new developers learning Swift and experienced developers being able to write single random declarations.

I’d like to hear about your ideas on this proposal, or any implementation changes if need be.

- Alejandro

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution


I still would propose the random generator be designed as a random binary stream, with the user functionality being exposed on particular types (Int, Float, Array, etc) and not on that generator interface. For this I just picked a "read" method off of InputStream.

One nice thing about a simple, swappable random source is the ability to switch out a deterministic and/or repeatable source of randomness for the system while under test.

Also since the random data may be coming from system entropy and not from an algorithm, I'd recommend calling it RandomSource.

    protocol RandomSource {
      // Reads up to a given number of bytes into a given buffer.
      func read(_ buffer: UnsafeMutablePointer<UInt8>, maxLength: Int)
    }
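
As a rough illustration of the testing benefit, the sketch below pairs that protocol with a deterministic stand-in and puts the user-facing call on the value type. `RepeatingSource` and `UInt32.random(from:)` are hypothetical names made up for this example, and the protocol is restated only so the block is self-contained.

```swift
// Restated from the message above so the sketch compiles on its own.
protocol RandomSource {
    // Reads up to a given number of bytes into a given buffer.
    func read(_ buffer: UnsafeMutablePointer<UInt8>, maxLength: Int)
}

// Hypothetical test double: emits a fixed, repeating byte pattern.
final class RepeatingSource: RandomSource {
    private let pattern: [UInt8]
    private var offset = 0

    init(pattern: [UInt8]) {
        precondition(!pattern.isEmpty, "pattern must not be empty")
        self.pattern = pattern
    }

    func read(_ buffer: UnsafeMutablePointer<UInt8>, maxLength: Int) {
        for i in 0..<maxLength {
            buffer[i] = pattern[offset]
            offset = (offset + 1) % pattern.count
        }
    }
}

// The user-facing functionality lives on the type, not on the source.
extension UInt32 {
    static func random(from source: RandomSource) -> UInt32 {
        var value: UInt32 = 0
        withUnsafeMutableBytes(of: &value) { raw in
            let bytes = raw.bindMemory(to: UInt8.self)
            source.read(bytes.baseAddress!, maxLength: bytes.count)
        }
        return value
    }
}

// Identical sources yield identical values, which is what a test needs.
let first = UInt32.random(from: RepeatingSource(pattern: [1, 2, 3, 4]))
let second = UInt32.random(from: RepeatingSource(pattern: [1, 2, 3, 4]))
```

In production the same call site would receive a source backed by system entropy instead.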

-DW


On Nov 5, 2017, at 5:33 PM, Alejandro Alonso via swift-evolution <swift-evolution@swift.org> wrote:

[Proposal] Random Unification by Azoy · Pull Request #760 · apple/swift-evolution · GitHub is the current API and proposed solution.

- Alejandro

On Nov 5, 2017, 6:18 PM -0600, Xiaodi Wu <xiaodi.wu@gmail.com>, wrote:

My comments are directed to the "more up-to-date" document that you just linked to in your reply to Jon. Is that one outdated? If so, can you send a link to the updated proposal and implementation for which you're soliciting feedback?

On Sun, Nov 5, 2017 at 6:12 PM, Alejandro Alonso <aalonso128@outlook.com> wrote:
The proposal and implementation have the current updated API. The link I sent Jon was the one I brought up a few weeks ago which is outdated now. The proposal answers all of your questions. As for `.random` being a function, some would argue that it behaves in the same way as `.first` and `.last` which are properties.

- Alejandro


I agree with almost everything here. I think `.random` on `RandomAccessCollection` should mimic the current design of `.first` and `.last` by returning an optional. In terms of naming, we have to look at how Python structures the call site: `random.choice([1, 2, 3, 4])`. To me this reads as "random choice within this array", which works because of how it’s called. With the proposed solution, we are asking for a random element directly from the array, so I stand by naming this `random`.

On the subject of bikeshedding the names, I wholeheartedly agree to use `RandomNumberGenerator`. As for `Randomizable`, I agree there might be a better name for this, but the question is what?

- Alejandro


On Nov 5, 2017, 7:56 PM -0600, Xiaodi Wu <xiaodi.wu@gmail.com>, wrote:
I like this particular version. In particular, the choice of algorithms is, afaict, correct and that is incredibly important. I had overlooked that `arc4random` is cryptographically secure past a certain version of macOS, but you are absolutely right. I am also on board with the fatal error suggestion if random entropy is unavailable; I think it must be amply documented, though.

I do think, however, that you're overloading too many things into the word "random" when they're not the same. Take a look at Python, which is pretty widely used for numerics. There's `rand` and `random` for getting a random integer or floating-point value, and there's `choice` and `sample` for choosing one or more values out of a collection without replacement. These are sufficiently different tasks and don't all need to be called "random" or satisfy the same requirement of the same protocol. Put another way, it's absolutely *not* inconsistent for numeric types to have `random()` while collection types have a differently named method.

By contrast, I think the great length of text trying to justify naming all of these facilities `random` in order to parallel `first` and `last` shows how the proposed design is comparatively weaker. You have to argue that (a) `Int.random` shouldn't return an optional value because it'd be unwieldy, and therefore `(0..<5).random` shouldn't either because it would then be inconsistent; but (b) that `(0..<5).random` should be spelled and behave like `(0..<5).first` and `(0..<5).last` even though the user must handle empty collections totally differently because the return types are not the same. Either `(0..<5).random` should behave analogously to `first` and `last` or it should not. If it should, it only makes sense to return a result of type `T?`. After all, if a collection doesn't have a `first` item, then it can't have a `random` item. Put another way, having a `first` item is a prerequisite to having a randomly selectable item. The behavior of the Swift APIs would be very inconsistent if `first` returns `T?` but `random` returns `T`. However, I agree that unwrapping `Int.random` every time would be burdensome, and it would not make sense to have a type support `random` but not have any instantiable values; therefore, returning an optional value doesn't make sense, and it follows that `Int.random` *shouldn't* behave like `first` or `last`.

Once you stop trying to make what Python calls `rand/randint` and `choice/sample` have the same names, then finding a Swifty design for the distinct facilities becomes much easier, and it suggests a pretty elegant result (IMO):

    [1, 2, 3, 4].choice // like `first` or `last`, this gets you a value of type Int?
    [1, 2, 3, 4].sampling(2) // like `prefix(2)` or `suffix(2)`, this gets you a subsequence with at most two elements

    Int.random // this gets you a random Int; or it may trap
    Float.random // this gets you a random Float; or it may trap

With that, it also becomes clear why--and I agree with you--an independent `Int.random(in: 0..<5)` is not necessary. `(0..<5).choice` is fine, and it can now appropriately return a value of type `T?` because it no longer needs to parallel `Int.random`.
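
A minimal sketch of how `choice` and `sampling(_:)` might be written follows. It is only an illustration: the two names come from the message above and are not standard library API, and the bodies lean on `randomElement()` and `shuffled()`, which the standard library gained later (Swift 4.2). Shuffling the whole collection to take a sample is simple but not efficient.

```swift
extension Collection {
    // Like `first` and `last`: nil when the collection is empty.
    var choice: Element? {
        return randomElement()
    }

    // Like `prefix(_:)`: at most `count` elements, drawn without replacement.
    func sampling(_ count: Int) -> [Element] {
        return Array(shuffled().prefix(count))
    }
}

let empty: [Int] = []
let picked = [1, 2, 3, 4].choice      // Int?
let sample = [1, 2, 3, 4].sampling(2) // two distinct elements
```

An empty collection simply yields `nil` from `choice` and an empty array from `sampling(_:)`, so neither call needs to trap.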

* * *

More in the bikeshedding arena, I take issue with some of the names:

- I reiterate my comment that `Randomizable` is not the best name. There are multiple dictionary definitions of "randomize" and one is "make unpredictable, unsystematic, or random in order or arrangement." Wikipedia gives at least five different contextual meanings for the word. What you're doing here is specifically **random sampling** and we can do better to clarify that, I think.

- While I agree that `RNG` can be cryptic, the alternative should be `RandomNumberGenerator` (as it's called in other languages); `RandomGenerator` is not quite accurate. Again, we're _consuming_ randomness to _generate_ numbers (or values of other type, based on the result of a generated number). We're not _generating_ randomness.


Sorry I’ve been gone for a while, I had to do a lot of traveling.

1. Initially I made this thinking that developers had the power to determine their own lower bound. The current implementation uses the integer’s min value as a lower bound. If it makes sense to only allow unsigned integers from an RNG, then I’m perfectly fine with that. I do disagree when you say that it should only generate UInt32s. The current approach allows, let’s say, mt19937 and mt19937-64 to be used within one generator: if you wanted a UInt32, mt19937 would be used, and if you asked for a UInt64, mt19937-64 would be used.

2. The Randomizable protocol isn’t always used with integers. Think Date.random or Color.random. These types of values are difficult to express with ranges. Randomizable solves this issue.

3. I’ve made the adjustment necessary for this.

4. While I can see your point here, it would break consistency with Randomizable’s random property. You could argue that we could make this property a function itself, but I think most will agree that Int.random is a cleaner API than Int.random().

5. I’ve made the adjustment necessary for this.

6. I actually forgot to implement the random API for the ranges where Bound: BinaryFloatingPoint. While implementing this, I realized these would never fail and would always return a non-optional. So I decided to make the other Countable ranges non-optional as well: (0 ..< 10).random would return a non-optional, (0.0 ..< 10.0).random would return a non-optional, and Array(0 ..< 10).random would return an optional. I can agree that something like (0 ..< 10).random is hard to discover, so I added Int.random(in: 0 ..< 10) (along with the BinaryFloatingPoint equivalent). However, these are not requirements of Randomizable. I think these methods would be better off as extension methods:

    // `{Closed}` is shorthand: one overload each for the half-open and closed range types.
    extension Randomizable where Self: FixedWidthInteger, Self.Stride: SignedInteger {
      public static func random(
        in range: Countable{Closed}Range<Self>,
        using generator: RandomNumberGenerator
      ) -> Self {
        return range.random(using: generator)
      }
    }

    extension Randomizable where Self: BinaryFloatingPoint {
      public static func random(
        in range: {Closed}Range<Self>,
        using generator: RandomNumberGenerator
      ) -> Self {
        return range.random(using: generator)
      }
    }

I think external types that wish to do something similar, like Data.random(bytes: 128), could extend Randomizable with their own custom needs. The stdlib would at this point provide all the features needed to make this happen very simply for something like Data.random(bytes: 128).

- Alejandro


On Nov 5, 2017, 10:44 PM -0600, Nate Cook <natecook@apple.com>, wrote:
Thanks for continuing to push this forward, Alejandro! I’m excited about the potential of having access to these APIs as part of the standard library. Here are a few comments on some different parts of the proposal:

1) For your RandomGenerator protocol, I’m not totally clear on the semantics of the next(_:) and next(_:upperBound:) methods. Do they both have zero as their lower bound, for example? I’m not sure it makes sense to have signed integers generated directly by an RNG—perhaps T: FixedWidthInteger & UnsignedInteger would be a more useful constraint. (Does it even need to be generic? What if RNGs just generate UInt32s?)

2) Can you say more about the purpose of the Randomizable protocol? How would we use that protocol in useful ways that we wouldn’t get from being able to select random values from ranges (half-open and closed) of FixedWidthInteger / BinaryFloatingPoint? My experience has been that a full-width random value is rarely what a user needs.

3) I agree with Xiaodi that Random should probably be a struct with a single shared instance, but I don’t think it should be internal. Hiding that shared RNG would make it hard for non-stdlib additions to have the same usage, as they would need to have completely separate implementations for the “default” and custom RNG versions.

4) I would also still suggest that the simplest version of random (that you use to get a value from a range or an element from a collection) should be a function, not a property. Collection properties like first, last, and count all represent facts that already exist about a collection, and don’t change unless the collection itself changes. Choosing a random element, on the other hand, is clearly going to be freshly performed on each call. In addition, with the notable exception of count, we try to ensure O(1) performance for properties, while random will be O(n) except in random-access collections. Finally, if it is a method, we can unify the two versions by providing a single method with the shared RNG as the default parameter.

5) To match the sorted() method, shuffled() should be on Sequence instead of Collection. I don’t think either shuffled() or shuffle() needs to be a protocol requirement, since there isn’t really any kind of customization necessary for different kinds of collections. Like the sorting algorithms, both could be regular extension methods.

6) I don’t know whether or not a consensus has formed around the correct spelling of the APIs for generating random values. From the proposal it looks like the preferred ways of getting a random value in a range would be to use the random property (or method) on a range or closed range:

    (0..<10).random // 7
    (0.0 ... 5.0).random // 4.112312

If that’s the goal, and we don’t want those values to be optional, we’ll need an implementation of random for floating-point ranges and an overload for fixed-width integer ranges. That said, I don’t think that style is as discoverable as having static methods or initializers available on the different types:

    Int.random(in: 0..<10)
    Double.random(in: 0.0 ... 5.0)
    // or maybe
    Int(randomIn: 0..<10)
    Double(randomIn: 0.0 ... 5.0)

(My only quibble with the initializer approach is that Bool would be awkward.)

In addition, this alternative approach could make creating random values more consistent with types that don’t work well in ranges:

    Data.random(bytes: 128)
    Color.random(r: 0...0, g: 0...1, b: 0...1, a: 1...1)
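
Returning to point 4, the idea of a single method with the shared RNG as its default argument might look roughly like the sketch below. Everything named here (`Generator`, `SystemGenerator`, `CountingGenerator`, and `random(using:)` on `Collection`) is hypothetical and chosen for illustration; it is not the proposal's API. The protocol is class-bound only so that a shared instance can serve as a default argument.

```swift
// Hypothetical generator protocol for this sketch.
protocol Generator: AnyObject {
    func next() -> UInt64
}

// Stands in for the shared default RNG.
final class SystemGenerator: Generator {
    static let shared = SystemGenerator()
    func next() -> UInt64 {
        return UInt64.random(in: .min ... .max)
    }
}

// Deterministic stand-in, e.g. for tests: yields 0, 1, 2, ...
final class CountingGenerator: Generator {
    private var value: UInt64 = 0
    func next() -> UInt64 {
        defer { value += 1 }
        return value
    }
}

extension Collection {
    // One method covers both the default and the custom-generator case.
    func random(using generator: Generator = SystemGenerator.shared) -> Element? {
        guard !isEmpty else { return nil }
        // Modulo keeps the sketch short but is slightly biased.
        let offset = Int(generator.next() % UInt64(count))
        // O(n) unless the collection is random-access.
        return self[index(startIndex, offsetBy: offset)]
    }
}

let letters = ["a", "b", "c"]
let anyLetter = letters.random()                               // shared RNG
let firstLetter = letters.random(using: CountingGenerator())   // custom RNG
```

Callers who do not care about the generator write `letters.random()`, while tests or simulations pass their own.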

————

Thanks again!
Nate


It's possible to use a CSPRNG-grade algorithm and seed it once to get a
reproducible sequence, but when you use it as a CSPRNG, you typically feed
entropy back into it at nondeterministic points to ensure that even if you
started with a bad seed, you'll eventually get to an alright state. Unless
you keep track of when entropy was mixed in and what the values were,
you'll never get a reproducible CSPRNG.

We would give developers a false sense of security if we provided them
with CSPRNG-grade algorithms that we called CSPRNGs and that they could
seed themselves. Just because it says "crypto-secure" in the name doesn't
mean that it'll be crypto-secure if it's seeded with time(). Therefore,
"reproducible" vs "non-reproducible" looks like a good distinction to me.

I disagree here, in two respects:

First, whether or not a particular PRNG is cryptographically secure is an
intrinsic property of the algorithm; whether it's "reproducible" or not is
determined by the published API. In other words, the distinction between
CSPRNG vs. non-CSPRNG is important to document because it's semantics that
cannot be deduced by the user otherwise, and it is an important one for
writing secure code because it tells you whether an attacker can predict
future outputs based only on observing past outputs. "Reproducible" in the
sense of seedable or not is trivially noted by inspection of the published
API, and it is rather immaterial to writing secure code.

Cryptographically secure is not a property that I'm comfortable applying
to an algorithm. You cannot say that you've made a cryptographically secure
thing just because you've used all the right algorithms: you also have to
use them right, and one of the most critical components of a
cryptographically secure PRNG is its seed.

A cryptographically secure algorithm isn’t sufficient, but it is necessary.
That’s why it’s important to mark them as such. If I'm a careful developer,
then it is absolutely important to me to know that I’m using a PRNG with a
cryptographically secure algorithm, and that the particular implementation
of that algorithm is correct and secure.

It is a *feature* of a lot of modern CSPRNGs that you can't seed them:

   - You cannot seed or add entropy to std::random_device

Although std::random_device may in practice be backed by a software CSPRNG,
IIUC, the intention is that it can provide access to a hardware
non-deterministic source when available.

   - You cannot seed or add entropy to CryptGenRandom
   - You can only add entropy to /dev/(u)random
   - You can only add entropy to BSD's arc4random

Ah, I see. I think we mean different things when we say PRNG. A PRNG is an
entirely deterministic algorithm; the output is non-random and the
algorithm itself requires no entropy. If a PRNG is seeded with a random
sequence of bits, its output can "appear" to be random. A CSPRNG is a PRNG
that fulfills certain criteria such that its output can be appropriate for
use in cryptographic applications in place of a truly random sequence *if*
the input to the CSPRNG is itself random.

The examples you give above *incorporate* a CSPRNG, environment entropy,
and a set of rules about when to mix in additional entropy in order to
produce output indistinguishable from a random sequence, but they are *not*
themselves really *pseudorandom* generators because they are not
deterministic. Not only do such sources of random numbers not require an
interface to allow seeding, they do not even have to be publicly
instantiable: Swift need only expose a single thread-safe instance (or an
instance per thread) of a single type that provides access to
CryptGenRandom/urandom/arc4random, since after all the output of multiple
instances of that type should be statistically indistinguishable from the
output of only one.

What I was trying to respond to, by contrast, is the design of a hierarchy
of protocols CSPRNG : PRNG (or, in Alejandro's proposal, UnsafeRandomSource
: RandomSource) and the appropriate APIs to expose on each. This is
entirely inapplicable to your examples. It stands to reason that a
non-instantiable source of random numbers does not require a protocol of
its own (a hypothetical RNG : CSPRNG), since there is no reason to
implement (if done correctly) more than a single publicly non-instantiable
singleton type that could conform to it. For that matter, the concrete type
itself probably doesn't need *any* public API at all. Instead, extensions
to standard library types such as Int that implement conformance to the
protocol that Alejandro names "Randomizable" could call internal APIs to
provide all the necessary functionality, and third-party types that need to
conform to "Randomizable" could then in turn use `Int.random()` or
`Double.random()` to implement their own conformance. In fact, the concrete
random number generator type doesn't need to be public at all. All public
interaction could be through APIs such as `Int.random()`.
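
As a rough sketch of that arrangement (not Alejandro's actual implementation): the protocol name `Randomizable` comes from the proposal, but the `random()` requirement shown here, the `Point` type, and the use of today's `Int.random(in:)` as a stand-in for the internal entropy-backed API are all assumptions made for illustration.

```swift
// Hypothetical protocol; only the name "Randomizable" comes from the proposal.
protocol Randomizable {
    static func random() -> Self
}

// Standard library side: Int conforms by calling into the (here simulated)
// internal random source. Int.random(in:) stands in for that internal API.
extension Int: Randomizable {
    static func random() -> Int {
        return Int.random(in: Int.min...Int.max)
    }
}

// Third-party side: a hypothetical type builds its conformance purely on top
// of Int.random(), without ever touching a generator type directly.
struct Point: Randomizable {
    var x: Int
    var y: Int

    static func random() -> Point {
        return Point(x: Int.random(), y: Int.random())
    }
}

let point = Point.random()
print(point)
```

Note that no generator type appears anywhere in the public surface; callers only ever see `random()` on the types they care about.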

Just because we can expose a seed interface doesn't mean we should, and in
this case I believe that it would go against the prime objective of
providing secure random numbers.

If we're talking about a Swift interface to a non-deterministic source of
random numbers like urandom or arc4random, then, as I write above, not only
do I agree that it doesn't need to be seedable, it also does not need to be
instantiable at all, does not need to conform to a protocol that
specifically requires the semantics of a non-deterministic source, does not
need to expose any public interface whatsoever, and doesn't itself even
need to be public. (Does it even need to be a type, as opposed to simply a
free function?)

In fact, having reasoned through all of this, we can split the design task
into two. The most essential part, which definitely should be part of the
stdlib, would be an internal interface to a cryptographically secure
platform-specific entropy source, a public protocol named something like
Randomizable (to be bikeshedded), and the appropriate implementations on
Boolean, binary integer, and floating point types to conform them to
Randomizable so that users can write `Bool.random()` or `Int.random()`. The
second part, which can be a separate proposal or even a standalone core
library or third-party library, would be the protocols and concrete types
that implement pseudorandom number generators, allowing for reproducible
pseudorandom sequences. In other words, instead of PRNGs and CSPRNGs being
the primitives on which `Int.random()` is implemented, `Int.random()`
should be the standard library primitive which allows PRNGs and CSPRNGs to
be seeded.
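
To make the second half of that split concrete, here is a minimal sketch of a seedable generator whose default initializer takes its seed from the standard library's random API. The algorithm is SplitMix64, chosen only because it is short; it is *not* cryptographically secure, and the type name and initializers are assumptions for illustration, not part of any proposal.

```swift
// A small, deterministic PRNG (SplitMix64). NOT cryptographically secure;
// used here only to illustrate seeding and reproducibility.
struct SplitMix64: RandomNumberGenerator {
    private var state: UInt64

    // Explicit seed: the caller deliberately asks for a reproducible sequence.
    init(seed: UInt64) {
        state = seed
    }

    // Default: the seed comes from the standard library primitive.
    init() {
        self.init(seed: UInt64.random(in: UInt64.min...UInt64.max))
    }

    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Two generators with the same seed produce the same sequence.
var first = SplitMix64(seed: 42)
var second = SplitMix64(seed: 42)
let rollsA = (0..<5).map { _ in Int.random(in: 1...6, using: &first) }
let rollsB = (0..<5).map { _ in Int.random(in: 1...6, using: &second) }
print(rollsA == rollsB)
```

The direction of the dependency is the point: the PRNG is seeded from the system-backed primitive, not the other way around.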

If your attacker can observe your seeding once, chances are that they can
observe your reseeding too; then, they can use their own implementation of
the PRNG (whether CSPRNG or non-CSPRNG) and reproduce your pseudorandom
sequence whether or not Swift exposes any particular API.

On Linux, the random devices are initially seeded with machine-specific
but rather invariant data that makes /dev/urandom spit out predictable
numbers. It is considered "seeded" after a root process writes POOL_SIZE
bytes to it. On most implementations, this initial seed is stored on disk:
when the computer shuts down, it reads POOL_SIZE bytes from /dev/urandom
and saves them in a file, and the contents of that file are loaded back into
/dev/urandom when the computer starts. A scenario where someone can read
that file is certainly not less likely than a scenario where /dev/urandom
was deleted. That doesn't mean that they have kernel code execution or that
they can pry into your process, but they have a good shot at guessing your
seed and subsequent RNG results if no stirring happens.

Sorry, I don't understand what you're getting at here. Again, I'm talking
about deterministic algorithms, not non-deterministic sources of random
numbers.

Secondly, I see no reason to justify the notion that, simply because a PRNG
is cryptographically secure, we ought to hide the seeding initializer
(because one has to exist internally anyway) from the public. Obviously,
one use case for a deterministic PRNG is to get reproducible sequences of
random-appearing values; this can be useful whether the underlying
algorithm is cryptographically secure or not. There are innumerably many
ways to use data generated from a CSPRNG in non-cryptographically secure
ways and omitting or including a public seeding initializer does not change
that; in other words, using a deterministic seed for a CSPRNG would be a
bad idea in certain applications, but it's a deliberate act, and someone
who would mistakenly do that is clearly incapable of *using* the output
from the PRNG in a secure way either; put a third way, you would be hard
pressed to find a situation where it's true that "if only Swift had not
made the seeding initializer public, this author would have written secure
code, but instead the only security hole that existed in the code was
caused by the availability of a public seeding initializer mistakenly
used." The point of having both explicitly instantiable PRNGs and a layer
of simpler APIs like "Int.random()" is so that the less experienced user
can get the "right thing" by default, and the experienced user can
customize the behavior; any user that instantiates his or her own
ChaCha20Random instance is already calling for the power user interface; it
is reasonable to expose the underlying primitive operations (such as
seeding) so long as there are legitimate uses for it.

Nothing prevents us from using the same algorithm for a CSPRNG that is
safely pre-seeded and a PRNG that people seed themselves, mind you.
However, especially when it comes to security, there is a strong
responsibility to drive developers into a pit of success: the most obvious
thing to do has to be the right one, and suggesting to
cryptographically-unaware developers that they have everything they need to
manage their own seed is not a step in that direction.

I'm not opposed to a ChaCha20Random type; I'm opposed to explicitly
calling it cryptographically-secure, because it is not unless you know what
to do with it. It is emphatically not far-fetched to imagine a developer
who thinks that they can outdo the standard library by using their own
ChaCha20Random instance after it's been seeded with time() if we let them
know that it's "cryptographically secure". If you're a power user and you
don't like the default, known-good CSPRNG, then you're hopefully good
enough to know that ChaCha20 is considered a cryptographically-secure
algorithm without help labels from the language, and you know how to
operate it.

I'm fully aware of the myths surrounding /dev/urandom and /dev/random.
/dev/urandom might never run out, but it is also possible for it not to be
initialized at all, as in the case of some VM setups. In some older
versions of iOS, /dev/[u]random is reportedly sandboxed out. On systems
where it is available, it can also be deleted, since it is a file. The
point is, all of these scenarios cause an error during seeding of a CSPRNG.
The question is, how to proceed in the face of inability to access entropy.
We must do something, because in that case we cannot return a
cryptographically secure answer. Rare trapping on invocation of
Int.random() or permanently waiting for a never-to-be-initialized
/dev/urandom would be terrible to debug, but returning an optional or
throwing all the time would be verbose. How to design this API?

If the only concern is that the system might not be initialized enough,
I'd say that whatever returns an instance of a global, framework-seeded
CSPRNG should return an Optional, and the random methods that use the
global CSPRNG can trap and scream that the system is not initialized
enough. If this is a likely error for you, you can check if the CSPRNG
exists or not before jumping.

Also note that there is only one system for which Swift is officially
distributed (Ubuntu 14.04) on which the only way to get entropy from the OS
is to open a random device and read from it.

Again, I'm not only talking about urandom. As far as I'm aware, every API
to retrieve cryptographically secure sequences of random bits on every
platform for which Swift is distributed can potentially return an error
instead of random bits. The question is, what design for our API is the
most sensible way to deal with this contingency? On rethinking, I do
believe that consistently returning an Optional is the best way to go about
it, allowing the user to either (a) supply a deterministic fallback; (b)
raise an error of their own choosing; or (c) trap--all with a minimum of
fuss. This seems very Swifty to me.
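
A minimal sketch of what those three options look like at the call site. The function `secureRandomInt(entropyAvailable:)` and the `RandomError` type are hypothetical; the `entropyAvailable` parameter exists only to simulate a failing entropy source, and `SystemRandomNumberGenerator` stands in for whatever platform API would actually be used.

```swift
// Hypothetical API: returns nil when the platform entropy source fails.
// The parameter only simulates that failure for the sake of the example.
func secureRandomInt(entropyAvailable: Bool = true) -> Int? {
    guard entropyAvailable else { return nil }
    var generator = SystemRandomNumberGenerator()
    return Int.random(in: Int.min...Int.max, using: &generator)
}

enum RandomError: Error {
    case entropyUnavailable
}

// (a) Supply a deterministic fallback.
let withFallback = secureRandomInt(entropyAvailable: false) ?? 4

// (b) Raise an error of the caller's own choosing.
func randomOrThrow(entropyAvailable: Bool) throws -> Int {
    guard let value = secureRandomInt(entropyAvailable: entropyAvailable) else {
        throw RandomError.entropyUnavailable
    }
    return value
}

// (c) Trap if no entropy is available.
let orTrap = secureRandomInt()!

print(withFallback, orTrap)
```

Each option costs the caller one short expression, which is what makes the Optional design tolerable in everyday use.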

* What should the default CSPRNG be? There are good arguments for using a
cryptographically secure device random. (In my proposed implementation, for
device random, I use Security.framework on Apple platforms (because
/dev/urandom is not guaranteed to be available due to the sandbox, IIUC).
On Linux platforms, I would prefer to use getrandom() and avoid using file
system APIs, but getrandom() is new and unsupported on some versions of
Ubuntu that Swift supports. This is an issue in and of itself.) Now, a
number of these facilities strictly limit or do not guarantee availability
of more than a small number of random bytes at a time; they are recommended
for seeding other PRNGs but *not* as a routine source of random numbers.
Therefore, although device random should be available to users, it probably
shouldn’t be the default for the Swift standard library as it could have
negative consequences for the system as a whole. There follows the
significant task of implementing a CSPRNG correctly and securely for the
default PRNG.

Theo gave a talk a few years ago
<https://www.youtube.com/watch?v=aWmLWx8ut20> on randomness and how
these problems are approached in LibreSSL.

Certainly, we can learn a lot from those like Theo who've dealt with the
issue. I'm not in a position to watch the talk at the moment; can you
summarize what the tl;dr version of it is?

I saw it three years ago, so I don't remember all the details. The gist is
that:

   - OpenBSD's random is available from extremely early in the boot
   process with reasonable entropy
   - LibreSSL includes OpenBSD's arc4random, and it's a "good" PRNG
   (which doesn't actually use ARC4)
   - That implementation of arc4random is good because it is fool-proof
   and it has basically no failure mode
   - Stirring is good, having multiple components take random numbers
   from the same source probably makes results harder to guess too
   - Getrandom/getentropy is in all ways better than reading from random
   devices

Vigorously agree on all points. Thanks for the summary.

On Wed, Sep 27, 2017 at 00:18 Félix Cloutier <felixcloutier@icloud.com> wrote:

On Sep 26, 2017, at 16:14, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Tue, Sep 26, 2017 at 11:26 AM, Félix Cloutier <felixcloutier@icloud.com> wrote: