How about detection and iteration of permutations

Continuing on the C++ Standard Library tour....

I have a new Gist with copies of the various permutation functions.

I hope I have all the complexity time (and/or space) requirements right.

IsPermutation.swift

The main method, Sequence.isPermutation(of: by:), is an adaptation of the C++ standard function std::is_permutation, including the quadratic runtime. There is a compact isPermutation(of:) method for Equatable elements.

There is also a version for Hashable elements, but that version works much better, approaching linear time, based on some blogs I read. Since a lot of types can be hashed, this version of the method probably will be used the most. (Then again, that will end up surprising some users when their usually linear calls to is_permutation suddenly be quadratic because their latest element type is not Hashable.)

AdjacentPermutation.swift

The main methods, MutableCollection.permuteForward(by:) and permuteBackward(by:), are adaptations of std::next_permutation and std::prev_permutation. The receiving collection must allow bi-directional traversal. There are compact variants for Comparable elements.

It is likely that a user wants to visit every potential permutation. If so, instead of stepping through with the methods above, just make a custom Sequence that does it all at once. Said sequence can be created with the MutableCollection.permutations(by:) method. The collection still needs to be bi-directional, and the method has a variant for Comparable elements. The hardest part of making the DistinctPermutationSequence was figuring out how many elements there will be. (It isn't count! if there are duplicates, and the permute... methods skip duplicates.)

Copy.swift

This file contains the MutableCollection.copy(from:) method, which transfers elements from another collection, but overwrites instead of inserts. Right here, it's listed as internal, because it's off-brand from the point of this library. However, it does match the C++ std::copy function; and std::copy_if, std::copy_n, std::fill, and std::fill_n with proper Swift chaining; so it could be moved to being public in a shared or separate proposal.

Integer.swift

Contains internal methods to compute the binomial and multinomial coefficients, needed to calculate permutation (with repeats) counts. It actually isn't easy to compute these values; humans can eyeball where to cancel denominator factors, but computers can't do that without either a table of prime factorizations or multiple greatest-common-divisor passes. I doubt we should add these to the public math library. (Maybe to our auxiliary numeric library?)

To-do: actually upload the files.
Edit: added the link (up top).


2019-Dec-8 Update

I updated the Gist to revision 2. It makes the permutation sequence type a lazy sequence, and moves its accessors to LazySequenceProtocol. I added an eager variant, including one for Comparable elements. I did make a permutation count method, but kept it private for now.

Would this sequence be eager, or lazy? If it's eager, then is there a reason for it to be a custom sequence, instead of an array? And if it's lazy, shouldn't it be made with foo.lazy.permutations(by: <) like with map, filter, etc, and the eager version returning an Array?

2 Likes

Right now, it's lazy. I could make an eager version. Note that my trial collection:

[1, 2, 4, 8, 1, 3, 9, 27, 1, 4, 16, 64, 1, 5, 25, 125]

has 16 elements, with 1 repeated 4 times and 4 repeated twice. Naïvely, that's 16!, or nearly 21 trillion, permutations. The permutation sequence filters out repeats, so it's only nearly a half-trillion permutations. That's why I didn't even think about an array-returning version at first. Maybe I'll add a checking function to help.

2 Likes

I updated the Gist to revision 2. There is now an eager variant.

Not only should permutation generation be lazy by default, but an eager variant shouldn't be provided at all—such an algorithm has factorial time complexity and returns a collection of factorial size, which would be a huge footgun for all but the tiniest data sets. If a user asks for an array of permutations, there are two answers:

  1. You don't want that. Do you really want that?
  2. Fine, then pass the lazy sequence to the Array(Sequence) initializer, which acknowledges the potential harm of that operation.

In-place permuting and lazy sequence generation is fine, but if this is a pitch for eventual addition to the standard library, the stdlib shouldn't be offering users a convenient factorial operation.

3 Likes

That’s the way I originally had it, but the new eager variant allows a throwing predicate.

I was uncomfortable moving my first version to the lazy interface, because I’m not sure that the BidirectionalCollection and/or MutableCollection will always carry over to the lazy wrapper. (It does if the main type passes itself as Elements.) Maybe use lazilyGeneratedPermutations as the method name? Or just accept permutations as a slightly inaccurate name (but mention the laziness in the docs).