Incorrect Fisher-Yates shuffle in example code

The example implementation of the Fisher-Yates shuffle found here
<https://github.com/apple/example-package-fisheryates/blob/master/Sources/FisherYates/Fisher-Yates_Shuffle.swift>
on
Apple’s GitHub (and linked from swift.org/package-manager) has some
problems. Stripping it down to just the Swift 4 version, the code is:

public extension MutableCollection where Index == Int, IndexDistance == Int
{
    mutating func shuffle() {
        guard count > 1 else { return }

        for i in 0..<count - 1 {
            let j = random(count - i) + i
            guard i != j else { continue }
            swapAt(i, j)
        }
    }
}

The main issues are:

1) It assumes that indices are 0-based.
2) It assumes that indices can be randomly accessed by addition.

The former means it does not work for ArraySlice, and the latter means it
won’t work for all custom types. Additionally, the “count” property (which
is only guaranteed to be O(n) here) is accessed inside the loop, thus
making the algorithm potentially O(n²).

To fix everything, we’ll want RandomAccessCollection conformance. Once we
add that, we no longer need “Index == Int”. The result looks like:

public extension MutableCollection where Self: RandomAccessCollection,
IndexDistance == Int {
    mutating func shuffle() {
        for n in 0 ..< count-1 {
            let i = index(startIndex, offsetBy: n)
            let j = index(i, offsetBy: random(count-n))
            swapAt(i, j)
        }
    }
}

Both of the original guard statements would be superfluous here (notably,
“swapAt” is documented to have no effect when i and j are the same) so I
removed them.

Technically we could get away without random access and still have an O(n)
algorithm, at the cost of copying the indices to an array:

public extension MutableCollection {
    mutating func shuffle() {
        guard !isEmpty else { return }

        var idx = Array(indices)

        for i in 0 ..< idx.count - 1 {
            let j = i + random(idx.count - i)
            swapAt(idx[i], idx[j])
            idx.swapAt(i, j)
        }
    }
}

In any event, just in case someone was using a cut-and-paste version of the
example code in their own project, now you know it needs adjustment.

Nevin

Would you like to post a PR to fix these issues?

- Daniel

···

On Dec 16, 2017, at 4:34 PM, Nevin Brackett-Rozinsky via swift-users <swift-users@swift.org> wrote:

The example implementation of the Fisher-Yates shuffle found here <https://github.com/apple/example-package-fisheryates/blob/master/Sources/FisherYates/Fisher-Yates_Shuffle.swift> on Apple’s GitHub (and linked from swift.org/package-manager <https://swift.org/package-manager/>) has some problems. Stripping it down to just the Swift 4 version, the code is:

public extension MutableCollection where Index == Int, IndexDistance == Int {
    mutating func shuffle() {
        guard count > 1 else { return }

        for i in 0..<count - 1 {
            let j = random(count - i) + i
            guard i != j else { continue }
            swapAt(i, j)
        }
    }
}

The main issues are:

1) It assumes that indices are 0-based.
2) It assumes that indices can be randomly accessed by addition.

The former means it does not work for ArraySlice, and the latter means it won’t work for all custom types. Additionally, the “count” property (which is only guaranteed to be O(n) here) is accessed inside the loop, thus making the algorithm potentially O(n²).

To fix everything, we’ll want RandomAccessCollection conformance. Once we add that, we no longer need “Index == Int”. The result looks like:

public extension MutableCollection where Self: RandomAccessCollection, IndexDistance == Int {
    mutating func shuffle() {
        for n in 0 ..< count-1 {
            let i = index(startIndex, offsetBy: n)
            let j = index(i, offsetBy: random(count-n))
            swapAt(i, j)
        }
    }
}

Both of the original guard statements would be superfluous here (notably, “swapAt” is documented to have no effect when i and j are the same) so I removed them.

Technically we could get away without random access and still have an O(n) algorithm, at the cost of copying the indices to an array:

public extension MutableCollection {
    mutating func shuffle() {
        guard !isEmpty else { return }
        
        var idx = Array(indices)
        
        for i in 0 ..< idx.count - 1 {
            let j = i + random(idx.count - i)
            swapAt(idx[i], idx[j])
            idx.swapAt(i, j)
        }
    }
}

In any event, just in case someone was using a cut-and-paste version of the example code in their own project, now you know it needs adjustment.

Nevin
_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

Saagar Jha

The example implementation of the Fisher-Yates shuffle found here <https://github.com/apple/example-package-fisheryates/blob/master/Sources/FisherYates/Fisher-Yates_Shuffle.swift> on Apple’s GitHub (and linked from swift.org/package-manager <https://swift.org/package-manager/>) has some problems. Stripping it down to just the Swift 4 version, the code is:

public extension MutableCollection where Index == Int, IndexDistance == Int {
    mutating func shuffle() {
        guard count > 1 else { return }

        for i in 0..<count - 1 {
            let j = random(count - i) + i
            guard i != j else { continue }
            swapAt(i, j)
        }
    }
}

The main issues are:

1) It assumes that indices are 0-based.
2) It assumes that indices can be randomly accessed by addition.

The former means it does not work for ArraySlice, and the latter means it won’t work for all custom types. Additionally, the “count” property (which is only guaranteed to be O(n) here) is accessed inside the loop, thus making the algorithm potentially O(n²).

To fix everything, we’ll want RandomAccessCollection conformance. Once we add that, we no longer need “Index == Int”. The result looks like:

public extension MutableCollection where Self: RandomAccessCollection, IndexDistance == Int {
    mutating func shuffle() {
        for n in 0 ..< count-1 {
            let i = index(startIndex, offsetBy: n)
            let j = index(i, offsetBy: random(count-n))
            swapAt(i, j)
        }
    }
}

Both of the original guard statements would be superfluous here (notably, “swapAt” is documented to have no effect when i and j are the same) so I removed them.

Actually, I believe the first guard is still necessary–if the collection is empty, you’d end up with trying to construct the range 0..<-1, which traps. Alternatively, stride(from:to) is more lenient.

···

On Dec 16, 2017, at 16:34, Nevin Brackett-Rozinsky via swift-users <swift-users@swift.org> wrote:

Technically we could get away without random access and still have an O(n) algorithm, at the cost of copying the indices to an array:

public extension MutableCollection {
    mutating func shuffle() {
        guard !isEmpty else { return }
        
        var idx = Array(indices)
        
        for i in 0 ..< idx.count - 1 {
            let j = i + random(idx.count - i)
            swapAt(idx[i], idx[j])
            idx.swapAt(i, j)
        }
    }
}

In any event, just in case someone was using a cut-and-paste version of the example code in their own project, now you know it needs adjustment.

Nevin
_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

While we’re still looking at the Fisher-Yates shuffle package, I stumbled upon this <https://github.com/apple/example-package-fisheryates/blob/master/Sources/FisherYates/random.swift#L18> definition of random():

public let random: (Int) -> Int = {
     while true {
        let x = Glibc.random() % $0
        let y = Glibc.random() % $0
        guard x == y else { return x }
    }
}

Perhaps I’m mistaken here, but if this is random(3) <https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man3/random.3.html> don’t we have to 1. seed it before use and 2. deal with modulo bias? I’m also not quite sure why the guard is there, either.

Saagar Jha

···

On Dec 16, 2017, at 16:37, Daniel Dunbar via swift-users <swift-users@swift.org> wrote:

Would you like to post a PR to fix these issues?

- Daniel

On Dec 16, 2017, at 4:34 PM, Nevin Brackett-Rozinsky via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

The example implementation of the Fisher-Yates shuffle found here <https://github.com/apple/example-package-fisheryates/blob/master/Sources/FisherYates/Fisher-Yates_Shuffle.swift> on Apple’s GitHub (and linked from swift.org/package-manager <https://swift.org/package-manager/>) has some problems. Stripping it down to just the Swift 4 version, the code is:

public extension MutableCollection where Index == Int, IndexDistance == Int {
    mutating func shuffle() {
        guard count > 1 else { return }

        for i in 0..<count - 1 {
            let j = random(count - i) + i
            guard i != j else { continue }
            swapAt(i, j)
        }
    }
}

The main issues are:

1) It assumes that indices are 0-based.
2) It assumes that indices can be randomly accessed by addition.

The former means it does not work for ArraySlice, and the latter means it won’t work for all custom types. Additionally, the “count” property (which is only guaranteed to be O(n) here) is accessed inside the loop, thus making the algorithm potentially O(n²).

To fix everything, we’ll want RandomAccessCollection conformance. Once we add that, we no longer need “Index == Int”. The result looks like:

public extension MutableCollection where Self: RandomAccessCollection, IndexDistance == Int {
    mutating func shuffle() {
        for n in 0 ..< count-1 {
            let i = index(startIndex, offsetBy: n)
            let j = index(i, offsetBy: random(count-n))
            swapAt(i, j)
        }
    }
}

Both of the original guard statements would be superfluous here (notably, “swapAt” is documented to have no effect when i and j are the same) so I removed them.

Technically we could get away without random access and still have an O(n) algorithm, at the cost of copying the indices to an array:

public extension MutableCollection {
    mutating func shuffle() {
        guard !isEmpty else { return }
        
        var idx = Array(indices)
        
        for i in 0 ..< idx.count - 1 {
            let j = i + random(idx.count - i)
            swapAt(idx[i], idx[j])
            idx.swapAt(i, j)
        }
    }
}

In any event, just in case someone was using a cut-and-paste version of the example code in their own project, now you know it needs adjustment.

Nevin
_______________________________________________
swift-users mailing list
swift-users@swift.org <mailto:swift-users@swift.org>
https://lists.swift.org/mailman/listinfo/swift-users

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

Would you like to post a PR to fix these issues?

- Daniel

All right, I’ve submitted a PR that I think should work, though I’m not too
confident in the pre-Swift-4 parts (credit to swiftdoc.org if the code for
old versions is valid).

···

On Sat, Dec 16, 2017 at 7:37 PM, Daniel Dunbar <daniel_dunbar@apple.com> wrote:

On Sat, Dec 16, 2017 at 8:03 PM, Saagar Jha <saagar@saagarjha.com> wrote:

Both of the original guard statements would be superfluous here (notably,
“swapAt” is documented to have no effect when i and j are the same) so I
removed them.

Actually, I believe the first guard is still necessary–if the collection
is empty, you’d end up with trying to construct the range 0..<-1, which
traps. Alternatively, stride(from:to) is more lenient.

Right you are, not sure how I missed that. Muphry’s law
<https://en.wikipedia.org/wiki/Muphry's_law> strikes again!

Nevin

IndexDistance == Int is an over-constraint, FWIW. Adding it is generally a mistake. Not a serious one, but it does limit utility somewhat.

HTH,
Dave

···

On Dec 16, 2017, at 4:34 PM, Nevin Brackett-Rozinsky via swift-users <swift-users@swift.org> wrote:

public extension MutableCollection where Self: RandomAccessCollection, IndexDistance == Int {

You are correct, however there is an accepted-and-implemented proposal (
SE–0191
<https://github.com/apple/swift-evolution/blob/master/proposals/0191-eliminate-indexdistance.md>)
to eliminate IndexDistance and replace it with Int, so the constraint will
always be satisfied starting in Swift 4.1.

I wrote a version of shuffle() without that constraint, but it is less
elegant: the line “for n in 0 ..< count - 1” no longer compiles, and
writing “0 as IndexDistance” doesn’t fix it because the resulting Range is
not a Sequence, since IndexDistance.Stride does not conform to
SignedInteger (at least in Swift 4.0 according to Xcode 9.2).

A while loop works of course, though a direct conversion requires writing
“var n = 0 as IndexDistance” before the loop. Luckily, with a bit of
cleverness we can eliminate all mention of IndexDistance, and this time we
really and truly don’t need the “guard !isEmpty” line:

extension MutableCollection where Self: RandomAccessCollection {
    mutating func shuffle() {
        var n = count
        while n > 1 {
            let i = index(startIndex, offsetBy: count - n)
            let j = index(i, offsetBy: random(n))
            n -= 1
            swapAt(i, j)
        }
    }
}

Essentially, the constraint is there to make the code nicer on pre-4.1
versions of Swift, though I’m happy to remove it if you think that’s
better. If nothing else, removing the constraint means people reading the
example code in the future won’t be misled into thinking they need to use
it themselves, so perhaps it should go just on that account.

Okay, you’ve convinced me, I’ll update the PR. :slight_smile:

Nevin

···

On Sun, Dec 17, 2017 at 7:37 PM, Dave Abrahams <dabrahams@apple.com> wrote:

On Dec 16, 2017, at 4:34 PM, Nevin Brackett-Rozinsky via swift-users < > swift-users@swift.org> wrote:

public extension MutableCollection where Self: RandomAccessCollection,
IndexDistance == Int {

IndexDistance == Int is an over-constraint, FWIW. Adding it is generally
a mistake. Not a serious one, but it does limit utility somewhat.

HTH,
Dave

…well that was more complicated than I anticipated.

The “random” function was (Int) -> Int, so when I removed the
“IndexDistance == Int” constraint on “shuffle” I either had to add several
“numericCast” calls or make “random” generic.

So I made “random” generic. And also fixed the modulo bias issue in the
Glibc version that Saagar mentioned. Or at least I hope I did. I haven’t
tested that part (nor any of the non-Swift-4 / non-macOS parts). Also, I am
not sure why “random” had been a closure value stored in a “let” constant,
but I changed it to a function.

While I was at it, I removed the random-access constraint I had placed on
“shuffle”. It will now shuffle any MutableCollection, with complexity O(n)
for a RandomAccessCollection and O(n²) otherwise. (The constraint was
different in different Swift versions, so the entire extension had to be
duplicated. Without the constraint the implementation is much sleeker.)

Nevin

···

On Sun, Dec 17, 2017 at 9:26 PM, Nevin Brackett-Rozinsky < nevin.brackettrozinsky@gmail.com> wrote:

On Sun, Dec 17, 2017 at 7:37 PM, Dave Abrahams <dabrahams@apple.com> > wrote:

On Dec 16, 2017, at 4:34 PM, Nevin Brackett-Rozinsky via swift-users < >> swift-users@swift.org> wrote:

public extension MutableCollection where Self: RandomAccessCollection,
IndexDistance == Int {

IndexDistance == Int is an over-constraint, FWIW. Adding it is generally
a mistake. Not a serious one, but it does limit utility somewhat.

HTH,
Dave

You are correct, however there is an accepted-and-implemented proposal (
SE–0191
<https://github.com/apple/swift-evolution/blob/master/proposals/0191-eliminate-indexdistance.md>)
to eliminate IndexDistance and replace it with Int, so the constraint will
always be satisfied starting in Swift 4.1.

I wrote a version of shuffle() without that constraint, but it is less
elegant: the line “for n in 0 ..< count - 1” no longer compiles, and
writing “0 as IndexDistance” doesn’t fix it because the resulting Range is
not a Sequence, since IndexDistance.Stride does not conform to
SignedInteger (at least in Swift 4.0 according to Xcode 9.2).

A while loop works of course, though a direct conversion requires writing
“var n = 0 as IndexDistance” before the loop. Luckily, with a bit of
cleverness we can eliminate all mention of IndexDistance, and this time we
really and truly don’t need the “guard !isEmpty” line:

extension MutableCollection where Self: RandomAccessCollection {
    mutating func shuffle() {
        var n = count
        while n > 1 {
            let i = index(startIndex, offsetBy: count - n)
            let j = index(i, offsetBy: random(n))
            n -= 1
            swapAt(i, j)
        }
    }
}

Essentially, the constraint is there to make the code nicer on pre-4.1
versions of Swift, though I’m happy to remove it if you think that’s
better. If nothing else, removing the constraint means people reading the
example code in the future won’t be misled into thinking they need to use
it themselves, so perhaps it should go just on that account.

Okay, you’ve convinced me, I’ll update the PR. :slight_smile:

Nevin

Saagar Jha

…well that was more complicated than I anticipated.

The “random” function was (Int) -> Int, so when I removed the “IndexDistance == Int” constraint on “shuffle” I either had to add several “numericCast” calls or make “random” generic.

So I made “random” generic. And also fixed the modulo bias issue in the Glibc version that Saagar mentioned. Or at least I hope I did. I haven’t tested that part (nor any of the non-Swift-4 / non-macOS parts). Also, I am not sure why “random” had been a closure value stored in a “let” constant, but I changed it to a function.

Great! I'll close my pull request, then.

···

On Dec 17, 2017, at 22:24, Nevin Brackett-Rozinsky via swift-users <swift-users@swift.org> wrote:

While I was at it, I removed the random-access constraint I had placed on “shuffle”. It will now shuffle any MutableCollection, with complexity O(n) for a RandomAccessCollection and O(n²) otherwise. (The constraint was different in different Swift versions, so the entire extension had to be duplicated. Without the constraint the implementation is much sleeker.)

Nevin

On Sun, Dec 17, 2017 at 9:26 PM, Nevin Brackett-Rozinsky <nevin.brackettrozinsky@gmail.com <mailto:nevin.brackettrozinsky@gmail.com>> wrote:
On Sun, Dec 17, 2017 at 7:37 PM, Dave Abrahams <dabrahams@apple.com <mailto:dabrahams@apple.com>> wrote:

On Dec 16, 2017, at 4:34 PM, Nevin Brackett-Rozinsky via swift-users <swift-users@swift.org <mailto:swift-users@swift.org>> wrote:

public extension MutableCollection where Self: RandomAccessCollection, IndexDistance == Int {

IndexDistance == Int is an over-constraint, FWIW. Adding it is generally a mistake. Not a serious one, but it does limit utility somewhat.

HTH,
Dave

You are correct, however there is an accepted-and-implemented proposal (SE–0191 <https://github.com/apple/swift-evolution/blob/master/proposals/0191-eliminate-indexdistance.md>) to eliminate IndexDistance and replace it with Int, so the constraint will always be satisfied starting in Swift 4.1.

I wrote a version of shuffle() without that constraint, but it is less elegant: the line “for n in 0 ..< count - 1” no longer compiles, and writing “0 as IndexDistance” doesn’t fix it because the resulting Range is not a Sequence, since IndexDistance.Stride does not conform to SignedInteger (at least in Swift 4.0 according to Xcode 9.2).

A while loop works of course, though a direct conversion requires writing “var n = 0 as IndexDistance” before the loop. Luckily, with a bit of cleverness we can eliminate all mention of IndexDistance, and this time we really and truly don’t need the “guard !isEmpty” line:

extension MutableCollection where Self: RandomAccessCollection {
    mutating func shuffle() {
        var n = count
        while n > 1 {
            let i = index(startIndex, offsetBy: count - n)
            let j = index(i, offsetBy: random(n))
            n -= 1
            swapAt(i, j)
        }
    }
}

Essentially, the constraint is there to make the code nicer on pre-4.1 versions of Swift, though I’m happy to remove it if you think that’s better. If nothing else, removing the constraint means people reading the example code in the future won’t be misled into thinking they need to use it themselves, so perhaps it should go just on that account.

Okay, you’ve convinced me, I’ll update the PR. :slight_smile:

Nevin

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

Thank you for working on this!

···

On Sun, Dec 17, 2017 at 11:07 PM, Saagar Jha via swift-users < swift-users@swift.org> wrote:

Saagar Jha

On Dec 17, 2017, at 22:24, Nevin Brackett-Rozinsky via swift-users < > swift-users@swift.org> wrote:

…well that was more complicated than I anticipated.

The “random” function was (Int) -> Int, so when I removed the
“IndexDistance == Int” constraint on “shuffle” I either had to add several
“numericCast” calls or make “random” generic.

So I made “random” generic. And also fixed the modulo bias issue in the
Glibc version that Saagar mentioned. Or at least I hope I did. I haven’t
tested that part (nor any of the non-Swift-4 / non-macOS parts). Also, I am
not sure why “random” had been a closure value stored in a “let” constant,
but I changed it to a function.

Great! I'll close my pull request, then.

While I was at it, I removed the random-access constraint I had placed on
“shuffle”. It will now shuffle any MutableCollection, with complexity O(n)
for a RandomAccessCollection and O(n²) otherwise. (The constraint was
different in different Swift versions, so the entire extension had to be
duplicated. Without the constraint the implementation is much sleeker.)

Nevin

On Sun, Dec 17, 2017 at 9:26 PM, Nevin Brackett-Rozinsky < > nevin.brackettrozinsky@gmail.com> wrote:

On Sun, Dec 17, 2017 at 7:37 PM, Dave Abrahams <dabrahams@apple.com> >> wrote:

On Dec 16, 2017, at 4:34 PM, Nevin Brackett-Rozinsky via swift-users < >>> swift-users@swift.org> wrote:

public extension MutableCollection where Self: RandomAccessCollection,
IndexDistance == Int {

IndexDistance == Int is an over-constraint, FWIW. Adding it is
generally a mistake. Not a serious one, but it does limit utility somewhat.

HTH,
Dave

You are correct, however there is an accepted-and-implemented proposal (
SE–0191
<https://github.com/apple/swift-evolution/blob/master/proposals/0191-eliminate-indexdistance.md>)
to eliminate IndexDistance and replace it with Int, so the constraint will
always be satisfied starting in Swift 4.1.

I wrote a version of shuffle() without that constraint, but it is less
elegant: the line “for n in 0 ..< count - 1” no longer compiles, and
writing “0 as IndexDistance” doesn’t fix it because the resulting Range is
not a Sequence, since IndexDistance.Stride does not conform to
SignedInteger (at least in Swift 4.0 according to Xcode 9.2).

A while loop works of course, though a direct conversion requires writing
“var n = 0 as IndexDistance” before the loop. Luckily, with a bit of
cleverness we can eliminate all mention of IndexDistance, and this time we
really and truly don’t need the “guard !isEmpty” line:

extension MutableCollection where Self: RandomAccessCollection {
    mutating func shuffle() {
        var n = count
        while n > 1 {
            let i = index(startIndex, offsetBy: count - n)
            let j = index(i, offsetBy: random(n))
            n -= 1
            swapAt(i, j)
        }
    }
}

Essentially, the constraint is there to make the code nicer on pre-4.1
versions of Swift, though I’m happy to remove it if you think that’s
better. If nothing else, removing the constraint means people reading the
example code in the future won’t be misled into thinking they need to use
it themselves, so perhaps it should go just on that account.

Okay, you’ve convinced me, I’ll update the PR. :slight_smile:

Nevin

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users