Sampling collections


(Milos Rankovic) #1

In the playground:

    "works?".capitalizedString // error: value of type 'String' has no member 'capitalizedString'

… but:

    import Foundation
    "works!".capitalizedString // "Works!"

Would it not be nice if all the following likewise worked:

    import Foundation
    
    (1..<4).sample
    [1,2,3].sample
    "abc".characters.sample
    ["a": 1, "b": 2, "c": 3].sample

Like so many users of Swift, I have extensions <http://stackoverflow.com/a/30285125/1409907> of IntegerType, ClosedInterval and CollectionType that avail me of the above methods and their family, but I’d much rather if such extensions came with Darwin or at least Foundation.

milos


(Jens Alfke) #2

It sounds like you’re suggesting that a “sample” property (that returns a randomly chosen element) should be added to the standard library? You could suggest that on the swift-evolution list, though IMHO it seems like a pretty obscure feature that not many users would need.

I don’t understand the comparison with capitalizedString; this works after importing Foundation because of the bridging between String and NSString, which is an artifact of the Mac/iOS Swift 2.x’s dependency on the Cocoa frameworks.

—Jens


(Brent Royal-Gordon) #3

    import Foundation
    
    (1..<4).sample
    [1,2,3].sample
    "abc".characters.sample
    ["a": 1, "b": 2, "c": 3].sample

Like so many users of Swift, I have extensions of IntegerType, ClosedInterval and CollectionType that avail me of the above methods and their family, but I’d much rather if such extensions came with Darwin or at least Foundation.

I don't think a `sample` property or method is the right approach here. It would be using some sort of global source of random numbers, which means that:

* It's not testable or repeatable
* It needs to be synchronized with other threads
* It can't be configured to use a different random number generator

Personally, I would eventually like to see something like this in the standard library:

  protocol RandomizerProtocol {
    mutating func randomBytes(_ n: Int) -> [UInt8]
    // or possibly something involving a generic-length tuple, for speed
  }
  extension RandomizerProtocol {
    // for coin flips
    mutating func randomChoice() -> Bool { ... }
    // for choosing a random element
    mutating func randomChoice<CollectionType: RandomAccessCollection>(from collection: CollectionType) -> CollectionType.Element { ... }
    // for choosing a random value from an uncountable range (e.g. Range<Double>)
    mutating func randomChoice<Element: Strideable>(from range: Range<Element>) -> Element { ... }
  }
  struct Randomizer: RandomizerProtocol {
    init(state: [UInt8]) { ... }
    init() { self.init(state: somethingToMakeAGoodRandomState()) }

    mutating func randomBytes(_ n: Int) -> [UInt8] {
      // akin to arc4random()
    }
  }

This would allow you to confine a random number generator to a particular thread, swap one implementation for another, or inject one with a fixed starting state as a dependency to make tests predictable. A design like this one works around the problems I described nicely.
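For readers on newer toolchains, here is a minimal sketch of the same idea. It assumes Swift 4.2 or later, where the standard library has a `RandomNumberGenerator` protocol and `randomElement(using:)`. `SplitMix64` is an illustrative seeded generator written for this example; it is not part of the standard library and is not suitable for anything security-related.

```swift
// Illustrative seeded generator (SplitMix64 mixing steps).
// Not part of the standard library; fine for tests, not for security.
struct SplitMix64: RandomNumberGenerator {
    private var state: UInt64

    init(seed: UInt64) { state = seed }

    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

let deck = Array(1...52)

// Two generators with the same seed yield the same picks,
// which is what makes a test repeatable.
var a = SplitMix64(seed: 42)
var b = SplitMix64(seed: 42)
let first = (0..<5).map { _ in deck.randomElement(using: &a)! }
let second = (0..<5).map { _ in deck.randomElement(using: &b)! }
print(first == second)

// Swapping in the system generator needs no change at the call site.
var system = SystemRandomNumberGenerator()
print(deck.randomElement(using: &system)!)
```

Because the generator is an explicit value passed by `inout`, each thread or test can own its own instance instead of sharing hidden global state.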

However, I don't think this is a high enough priority to address right now. This is borderline out-of-scope as "major new library functionality", and there's so much to do that is truly core to the language that this simply seems like a distraction.

···

--
Brent Royal-Gordon
Architechies


(Milos Rankovic) #4

Thank you, Jens, for your response.

I do however disagree with both points you are making. First, you write that sampling collection elements at random is:

a pretty obscure feature

But how can this be? When you teach students how to implement a card playing game in Swift, how do you shuffle the deck? And when you test your code, do you not feed your methods with randomly generated and sampled simulated data, or do so at random intervals? And when you’re simply checking out an idea in the playground, do you not want randomly sampled or reshuffled inputs? Should any of these activities qualify as obscure?

As for:

I don’t understand the comparison with capitalizedString; this works after importing Foundation because...

Indeed, nothing after that “because” helps in understanding what I meant by the comparison. It is the fact of the import that I was trying to highlight: that `Foundation` extends fundamental Standard Library types and protocols (like `String` in this case). The ObjC–Swift bridge is relevant here only in the sense that I would also like sampling methods added to `NSArray`, `NSSet` or `NSDictionary`...

At present, when we need a source of random bits on Apple’s platforms, we dip into the `Darwin` or `GameplayKit` frameworks. This is fine, and even if it weren’t, it is unlikely to change (even when new RNG algorithms get introduced).

What I would personally like to see, however (and what I was wondering how the community feels about), is that one of these frameworks extends Standard Library data types and protocols with this functionality, which most of us get a taste of in our very first encounters with computer programming and which we continue to rely on throughout our careers.

milos

···

On 10 Apr 2016, at 17:33, Jens Alfke <jens@mooseyard.com> wrote:



(Erica Sadun) #5

I personally would vote against this. I do not think it's the role of a core language to worry about things like distributions, bias, and sampling.

At the same time, I agree it's a very common task for playgrounds. I've developed a lot of material for everything from random colors and shapes to placeholder APIs to shuffles.

Best regards,

-- E

···

On Apr 10, 2016, at 2:00 PM, Milos Rankovic via swift-users <swift-users@swift.org> wrote:



(Jens Alfke) #6

The ObjC–Swift bridge is relevant here only in the sense that I would also like sampling methods added to `NSArray`, `NSSet` or `NSDictionary`…

Any library or program can add methods to any class using extensions. You can easily implement your own `sample` property. There may be a Swift library somewhere that provides one; all you’d have to do is import it.

If you want to implement it yourself, you can call the C functions `random` or `arc4random` directly from Swift. (You may need to add an #include to your bridging header in Xcode.)
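As a rough illustration of the do-it-yourself route, here is a sketch only. `arc4Sample` is a made-up name, and the code assumes an Apple platform where `import Darwin` exposes `arc4random_uniform`. In a playground or a pure-Swift file, importing `Darwin` or `Foundation` is usually enough to reach the C function.

```swift
#if canImport(Darwin)
import Darwin  // exposes arc4random_uniform on Apple platforms

extension Array {
    /// Hypothetical helper: a random element picked with arc4random_uniform,
    /// or nil for an empty array.
    /// Assumes `count` fits in UInt32; the conversion traps otherwise.
    var arc4Sample: Element? {
        guard !isEmpty else { return nil }
        let index = Int(arc4random_uniform(UInt32(count)))
        return self[index]
    }
}

print(["a", "b", "c"].arc4Sample ?? "empty")
#endif
```

`arc4random_uniform` is used rather than `arc4random() % count` because it avoids modulo bias.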

What I would personally like to see, however (and what I was wondering the community feels about), is that one of these frameworks extends Standard Library data types and protocols with this functionality

So far Apple hasn’t added Swift-specific functionality to system frameworks; the frameworks are in Objective-C or C, and the Swift compiler and runtime bridge to that. What you’re suggesting would go the other direction, with a framework offering custom API wrappers. Maybe that will happen in the near future.

—Jens

···

On Apr 10, 2016, at 1:00 PM, Milos Rankovic <milos@milos-and-slavica.net> wrote:


(Dave Yost) #7

Every package that wraps C functions should be accompanied by a higher-level package that wraps the raw interface in best Swift fashion, IMO.

A C package without higher-level Swift wrappers is an invitation to chaos as a zillion people publish competing higher-level wrappers.

···

On 2016-04-10, at 3:12 PM, Jens Alfke via swift-users <swift-users@swift.org> wrote:

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users


(Jens Alfke) #8

Sounds good, although that wasn’t really what I was talking about.

Is this a problem? Are there a lot of C-wrapping Swift libraries that don’t provide idiomatic & safe Swift APIs? I haven’t run across any.

—Jens

···

On Apr 10, 2016, at 4:36 PM, Dave Yost <Dave@Yost.com> wrote:



(Jacob Bandes-Storch) #9

I encourage anyone thinking about PRNG APIs to check out what C++ STL has
to offer: http://en.cppreference.com/w/cpp/numeric/random

And this analysis/extension of it:
http://www.pcg-random.org/posts/ease-of-use-without-loss-of-power.html

Jacob

···

On Sun, Apr 10, 2016 at 6:40 PM, Brent Royal-Gordon via swift-users <swift-users@swift.org> wrote:


(Milos Rankovic) #10

Why do you mention “the role of a core language” here? That was explicitly not the ambition of my question. I’m talking about extending the Standard Library types and protocols in the Foundation framework (as this is already done on a large scale). Or, if this is what you mean by “core language”, how does capitalising strings according to the rules of grammar of every language on the planet qualify as any more fitting the domain of the core language?

milos

···

On 10 Apr 2016, at 21:23, Erica Sadun <erica@ericasadun.com> wrote:

I do not think it's the role of a core language to worry about things like distributions, bias, and sampling.


(Milos Rankovic) #11

You can easily implement your own `sample` property.

The very first email of this thread has a link to my example implementation. Here it is again if you missed it: http://stackoverflow.com/a/30285125/1409907 … however, my whole point is that I’d prefer if this important feature came with `Foundation`.

So far Apple hasn’t added Swift-specific functionality to system frameworks…

As I already stressed, I certainly do not imagine this to be “Swift-specific”, nor do I see any reason it would need to be. Also, it looks like you’ve missed my previous email in this thread where I give examples of current Foundation extensions of Standard Library types, and where I also make the bare bones of my wish-list rather more explicit.

Please note, though, that my original mail was simply trying to find out if there are people who also wished they had this functionality available upon importing Foundation; if they too would enjoy using it in playgrounds, when teaching, or while testing their code… Imagine opening a blank playground and typing something silly like the following, while fully expecting there to be a `sample` method on arrays, just as there is `componentsSeparatedByString` on strings:

    import Foundation

    extension String {
      var define: String? {
        return DCSCopyTextDefinition(nil, self, CFRangeMake(0, utf16.count)).map{
          $0.takeRetainedValue() as String
        }
      }
      var trail: String {
        var trail = [self]
        while let word = trail.last?.define?.componentsSeparatedByString(" ").sample {
          trail.append(word)
        }
        return trail.joinWithSeparator(" ") + "!"
      }
    }

    for i in 1...100 {
      print("Random".trail)
    }

milos

···

On 10 Apr 2016, at 23:12, Jens Alfke <jens@mooseyard.com> wrote:


(Erica Sadun) #12

While I don't think general random sources are a good fit for core functionality, apparently, NSRandomSpecifier exists: https://developer.apple.com/library/mac/documentation/Cocoa/Reference/Foundation/Classes/NSRandomSpecifier_Class/index.html#//apple_ref/occ/cl/NSRandomSpecifier

Other material I consulted:
Standard Library: https://en.wikipedia.org/wiki/Standard_library
Foundation: https://developer.apple.com/library/mac/documentation/Cocoa/Reference/Foundation/ObjC_classic/
GameplayKit Randomization: https://developer.apple.com/library/ios/documentation/General/Conceptual/GameplayKit_Guide/RandomSources.html#//apple_ref/doc/uid/TP40015172-CH9-SW1

-- E

···

On Apr 10, 2016, at 2:39 PM, Milos Rankovic <milos@milos-and-slavica.net> wrote:



(Jens Alfke) #13

As I already stressed, I certainly do not imagine this to be “Swift-specific”, nor do I see any reason it would need to be.

It sounds like you’re asking for a `sample` property to be added to NSArray, NSDictionary, NSSet, etc. You could certainly request that Apple do that, by filing a request at http://bugreport.apple.com, but don’t expect a reply; Apple’s framework teams are notoriously opaque.

Putting on my framework-designer hat, I’d argue that “random” is a broad concept with several possible implementations. Which RNG does `sample` use? Pick a cryptographic one and it might be too slow for some use cases; pick a fast one and it'd be insufficiently random, making it dangerous to use for anything related to security. The right answer might be to have an RNG protocol, with several implementations backed by different generators, that exposes a method like `randomElement(Collection)`.

Also, it looks like you’ve missed my previous email in this thread where I give examples of current Foundation extensions of Standard Library types, and where I also make the bare bones of my wish-list rather more explicit.

The way those are implemented is a weird hack, so they’re not actually good examples of what you’re asking for.

It’s not that Apple's Foundation framework contains any extensions to Swift; Foundation is lower-level and I don’t believe it has any knowledge at all of Swift. So this is not the same effect as when you import a Swift library to get extensions. Rather, importing Foundation is a hardwired signal to the Swift compiler to activate the implicit bridging between Swift’s String type and Foundation’s NSString (and likewise for Array/NSArray, etc.)

In Swift 3 this will supposedly change: these APIs will be added directly to the Swift types, removing the need for bridging.

—Jens

···

On Apr 10, 2016, at 6:01 PM, Milos Rankovic <milos@milos-and-slavica.net> wrote:


(Milos Rankovic) #14

I don't think general random sources are a good fit for core functionality

I’m sorry, Erica, I still do not understand how your comments about “core functionality” reflect on my original question – have you seen it?

Certainly, there is plenty of precedent where Foundation extends Standard Library types and protocols:

// Foundation

extension String {
    public func enumerateLinguisticTagsInRange…
}

// CoreGraphics

extension Double {
    public init(_ value: CGFloat)
}

// Darwin

func yn(n: Int, _ x: Double) -> Double //...which are the Bessel functions of first and second kind!

What I’m talking about would not look out of place with linguistic tags and Bessel functions:

// Foundation

extension UnsignedIntegerType {
    static var random: Self
}

extension ClosedInterval where Bound : UnsignedIntegerType {
    var random: Bound
}

extension ClosedInterval where Bound : SignedIntegerType {
    var random: Bound
}

extension CollectionType where Index.Distance == Int {
    var sample: Generator.Element?
}

… which we could use by:

    import Foundation
    
    (1..<4).sample
    [1,2,3].sample
    "abc".characters.sample
    ["a": 1, "b": 2, "c": 3].sample
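For what it's worth, on later toolchains most of this wish-list can be approximated in a few lines. A minimal sketch, assuming Swift 4.2 or later, where the standard library provides `randomElement()` and `random(in:)`. The `sample` and `random` property names are the ones proposed above, not standard API, and `String` is sampled directly because the `characters` view is no longer available.

```swift
extension Collection {
    /// A uniformly chosen element, or nil if the collection is empty.
    /// Uses the system generator via randomElement().
    var sample: Element? {
        return randomElement()
    }
}

extension ClosedRange where Bound: FixedWidthInteger {
    /// A uniformly chosen value within the closed range.
    var random: Bound {
        return Bound.random(in: self)
    }
}

print((1..<4).sample as Any)
print([1, 2, 3].sample as Any)
print("abc".sample as Any)
print(["a": 1, "b": 2, "c": 3].sample as Any)
print((1...6).random)
```

Note that this sketch still draws from a global source of randomness, so the objections raised earlier in the thread about testability apply to it unchanged.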

milos

···

On 10 Apr 2016, at 22:16, Erica Sadun <erica@ericasadun.com> wrote:



(Milos Rankovic) #15

And yet we have the `arc4random` family of functions, which most people use in the kind of scenarios I refer to. The security argument is important, but I feel we sometimes reach for it too quickly. Just how will NOT implementing sampling on collections prevent someone from basing their security strategy on the arc4 algorithm? Consider how indicative of their work that would be; how many more glaring security holes are they likely to leave! And are we saying that the obscure path to this algorithm somewhere inside `Darwin` is a virtue? Protecting the uninitiated from a dangerous technology?

I’m sorry we are spending so much time discussing why this may be difficult for *someone* (because it likely won’t be us) to implement. The fact is that random bits will have to come from the frameworks beyond Standard Library, but if there is the will, I cannot imagine it would be too difficult to bring them to bear on the core datatypes and protocols. My question was always whether there is such a will; whether people would like the feature to be there, competently implemented and vetted by the community…

milos

···

On 11 Apr 2016, at 02:17, Jens Alfke <jens@mooseyard.com> wrote:

I’d argue that “random” is a broad concept with several possible implementations. Which RNG does `sample` use? Pick a cryptographic one and it might be too slow for some use cases; pick a fast one and it'd be insufficiently random, making it dangerous to use for anything related to security.


(Milos Rankovic) #16

Thanks, Jacob, for the links. Apple did take steps in this direction by spoiling us with a choice of random sources in the GameplayKit. I’m sure that after that initial effort, the GameplayKit team will continue to bring more power to the randomisation part of the framework.

Only, that is not what I had in mind. Once you care about the distinction between congruential and Mersenne sources, you are likely not to mind having to deal with a more involved framework. My point is that this is VERY often far too involved! The “middle ground” Apple is striking with the ARC4 algorithm and a publicly accessible system source indicates that there is a more “popular” need for such functionality, where random merely has to look random… Much joy is to be found below that low bar. For example, I believe that a small family of basic sampling properties and methods would quickly become a favourite among learners of Swift and those who teach them.

milos

···

On 11 Apr 2016, at 02:42, Jacob Bandes-Storch <jtbandes@gmail.com> wrote:



(Jens Alfke) #17

Well, write up a proposal and submit it <https://github.com/apple/swift-evolution/blob/master/process.md> to swift-evolution and let people discuss it.

I’m sorry we are spending so much time discussing why this may be difficult for *someone* (because it likely won’t be us) to implement.

Welcome to the world of bike-shedding. It’s not difficult to implement this, it’s difficult to design, because people have different needs and expectations. For a feature to go into a core library, there needs to be a strong enough need and there also needs to be agreement about how it should behave.

—Jens


(Milos Rankovic) #18

Love that :)

milos

···

On 11 Apr 2016, at 02:57, Jens Alfke <jens@mooseyard.com> wrote:

Welcome to the world of bike-shedding.