A very recent example I encountered is random number generation. While there is a `RandomNumberGenerator` protocol in Swift, it is unfortunately defined like this:
```swift
public protocol RandomNumberGenerator {
    mutating func next() -> UInt64
}
```
Being tied to `UInt64` makes it useful as a low-level back-end for higher-level RNGs, but rather useless for direct use in most real-world situations, where you'd want to randomly sample from `Int`, `Float`, `Bool`, or the like.
Luckily there are individual methods on `Float` and the like, sprinkled all over the stdlib, which are defined along these lines:
```swift
static func random(in range: Range<Float>) -> Float
```
While this is nice for situations where your code is very tightly specified, bound to concrete types, and you're only interested in uniform distributions, it ends up being rather useless when one or more of the following criteria apply …
- … you need to randomly sample from a type provided as a generic argument
- … you actually care about correctness and want to write unit tests, without having to write ad-hoc RNG wrappers for each of those methods (the `static func` in particular is problematic for tests)
- … you need your values to be sampled from a non-uniform distribution (e.g. Gaussian)
- … you need your execution to be deterministic (by seeding the RNG), e.g. for testing
- …
And unless you're just doing casual coding, prototyping, or anything else where correctness, generality, or reusability isn't actually important, you can be rather certain that at least one of the above will apply to the code you're writing.
As such I find the `func random(in:)` family in the stdlib to be more of an anti-pattern than a solution. It lures you into writing code that ends up hard to maintain and test, and impossible to decouple later on.
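For illustration, this is roughly the boilerplate you currently end up writing just to make such code testable: a hand-rolled seedable generator threaded through the `using:` overloads at every call site. (`SeededGenerator` and its SplitMix64-based implementation are my own sketch, not stdlib API.)

```swift
/// Minimal seedable generator for deterministic tests (sketch, SplitMix64).
struct SeededGenerator: RandomNumberGenerator {
    private var state: UInt64

    init(seed: UInt64) {
        self.state = seed
    }

    mutating func next() -> UInt64 {
        state &+= 0x9E37_79B9_7F4A_7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58_476D_1CE4_E5B9
        z = (z ^ (z >> 27)) &* 0x94D0_49BB_1331_11EB
        return z ^ (z >> 31)
    }
}

// In a test: reproducible, but only because every call site was written
// against the `using:` overload and threads the generator through by hand.
var rng = SeededGenerator(seed: 42)
let x = Float.random(in: 0..<1, using: &rng)
let flag = Bool.random(using: &rng)
```

And this still only buys determinism with uniform sampling; it does nothing for generic element types or non-uniform distributions.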
If, however, we had a way to implement a protocol multiple times, once for each specific type (i.e. generically), we could expand the existing "back-end" into something like this:
```swift
public protocol RandomNumberGenerator {
    mutating func next() -> UInt64
}

extension RandomNumberGenerator {
    mutating func sample<T, D: Distribution<T>>(from distribution: D) -> T {
        distribution.sample(from: &self)
    }

    mutating func sample<T, D: Distribution<T>>(from distribution: D, within range: Range<T>) -> T {
        // ...
    }
}
```
Next we would add a generic(!) `Distribution` protocol like this:
```swift
public protocol Distribution<T> {
    func sample<R: RandomNumberGenerator>(from rng: inout R) -> T
}
```
… which would open up the possibility of user-land Swift packages providing implementations of all kinds of distributions (Bernoulli, Beta, Binomial, Categorical, Cauchy, Chi, Chi-Squared, Dirichlet, Discrete-Uniform, Erlang, Exponential, Fisher-Snedecor, Gamma, Geometric, Hypergeometric, Inverse-Gamma, Log-Normal, Multinomial, Normal, Pareto, Poisson, Student's t, Triangular, Uniform, Weibull, just to name a few).
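A third-party package could then provide, say, a Bernoulli distribution roughly like this (a sketch under the pitched syntax; `BernoulliDistribution` is a made-up name, and the `Distribution<Bool>` conformance naturally doesn't compile in today's Swift):

```swift
public struct BernoulliDistribution {
    /// Probability of sampling `true`.
    public let probability: Double

    public init(probability: Double) {
        precondition((0.0...1.0).contains(probability))
        self.probability = probability
    }
}

extension BernoulliDistribution: Distribution<Bool> {
    public func sample<R: RandomNumberGenerator>(from rng: inout R) -> Bool {
        // Map the raw 64-bit output onto [0, 1) and compare against `probability`.
        let unit = Double(rng.next() >> 11) * 0x1.0p-53
        return unit < probability
    }
}
```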
The stdlib would then provide a default distribution that samples from a numerically uniform distribution, with a range appropriate to the given type `T`:
```swift
public struct DefaultDistribution {
    // ...
}

extension DefaultDistribution: Distribution<Bool> {
    func sample<R: RandomNumberGenerator>(from rng: inout R) -> Bool {
        // ...
    }
}

extension DefaultDistribution: Distribution<Int> {
    func sample<R: RandomNumberGenerator>(from rng: inout R) -> Int {
        // ...
    }
}

extension DefaultDistribution: Distribution<Float> {
    func sample<R: RandomNumberGenerator>(from rng: inout R) -> Float {
        // ...
    }
}
```
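For concreteness, the elided `Float` body might look roughly like this (my sketch; an actual stdlib implementation would presumably be more careful):

```swift
extension DefaultDistribution: Distribution<Float> {
    func sample<R: RandomNumberGenerator>(from rng: inout R) -> Float {
        // Uniform in [0, 1): keep the top 24 bits and scale by 2^-24.
        Float(rng.next() >> 40) * 0x1.0p-24
    }
}
```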
… which would greatly improve ergonomics when used in a convenience extension like this:
```swift
extension RandomNumberGenerator {
    mutating func random<T>() -> T
        where DefaultDistribution: Distribution<T>
    {
        return self.sample(from: DefaultDistribution())
    }

    mutating func random<T>(range: Range<T>) -> T
        where DefaultDistribution: Distribution<T>
    {
        return self.sample(from: DefaultDistribution(), within: range)
    }
}
```
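Call sites could then look roughly like this (again only under the pitched syntax); note that `T` can simply come in as a generic argument:

```swift
// Sampling from a type provided as a generic argument.
func noise<T, R: RandomNumberGenerator>(count: Int, using rng: inout R) -> [T]
    where DefaultDistribution: Distribution<T>
{
    (0..<count).map { _ in rng.random() }
}

var rng = SystemRandomNumberGenerator()
let floats: [Float] = noise(count: 3, using: &rng)
let flags: [Bool] = noise(count: 3, using: &rng)
```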
This would allow us to …
- … randomly sample from a type provided as a generic argument
- … effortlessly write unit tests, without having to write ad-hoc RNG wrappers as before
- … sample from non-uniform distributions (e.g. Gaussian)
- … have one's execution be deterministic, assuming seedable RNGs are made available
In other words, it would solve all the pain points of the existing, limited API listed above.
I don't see a way to build a similarly flexible (and efficient!) implementation without multiple conformances to a single protocol (as in "generic protocol").
The key here is being able to combine N random sources with M distributions, yielding up to N × M combinations from just N + M implementations, with zero run-time or dynamic-dispatch overhead, thanks to generic protocol conformance.
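To make that concrete, reusing the `SeededGenerator` and `BernoulliDistribution` sketches from above: any generator composes with any distribution at the call site, every type is statically known, and the compiler can specialize the whole chain.

```swift
var seeded = SeededGenerator(seed: 7)
var system = SystemRandomNumberGenerator()
let bernoulli = BernoulliDistribution(probability: 0.25)

// Every generator works with every distribution; adding one implementation
// on either side multiplies the number of available combinations.
let a = seeded.sample(from: bernoulli)                     // Bool
let b = system.sample(from: bernoulli)                     // Bool
let c: Float = seeded.sample(from: DefaultDistribution())
let d: Float = system.sample(from: DefaultDistribution())
```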
cc @DevAndArtist