Generating the different permutations for a collection is a generally useful feature found in other standard libraries (e.g. C++, Python, Guava for Java). It's pretty tricky to implement your own generic permutation algorithm, and easy to write one inefficiently or incorrectly, and as such belong in the Swift standard library. There are a few different options to discuss, though.
In-place permute
C++'s std::next_permution
provides good inspiration for an in-place mutating permute
operation:
/// Reorder elements in-place into the next permutation, returning false when
/// the collection has been permuted into lexicographical order. To cycle
/// through all permutations, start with a lexicographically-sorted collection
/// and permute until it returns false.
///
/// - Complexity: O(*n*), where *n* is the length of the collection.
public extension MutableCollection where Self: BidirectionalCollection {
public mutating func permute(
by areInIncreasingOrder: (Element,Element) throws ->Bool
) rethrows -> Bool {
}
/// Usual predicate-less version for convenience
public extension MutableCollection
where Self: BidirectionalCollection, Element: Comparable {
public mutating func permute() -> Bool
}
This API is easy to use once you understand it:
// print out all the permutations of 1,2,3,4
var a = Array(1...4)
repeat {
print(a)
} while a.permute()
It's also nice because it needs no extra storage. The collection that you're mutating is itself the storage of the current permutation state.
However, it does have a hidden gotcha: you need to remember to sort the collection before you begin:
var a = [2,4,1,3]
// forgot: a.sort()
repeat {
// unexpectedly only prints out 8 permutations
print(a)
} while a.permute()
There's not much to be done about this with the low-level API. permute
can't check its input is sorted, because of the way it's used.
Which leads to the next part: providing an out-of-place API
Out-of-place permutations
The usual pattern of permute
for in-place and permuted
for out-of-place doesn't really work here, for two reasons:
-
permute
returns a boolean to indicate it's come around to the sorted permutation.permuted
would also need to return something. It could return a([Element],Bool)
tuple or an[Element]?
value. But both these APIs would be prohibitively inconvenient to use in practice. - Unlike other out-of-place APIs like
filter
orsorted
, you rarely want to perform just one permutation. You usually want to cycle through many of them. And there can be a lot! You probably don't want to be newingn!
arrays, no matter how much you dislike usingvar
.
Instead it probably makes more sense to generate a semi-lazy Permutations
view:
struct Permutations<Elements: Collection>: Collection {
public typealias Element = LazyMapCollection<[Elements.Index],Elements.Element>
public struct Index: Comparable { }
public func index(after i: Index) -> Index
public var startIndex: Index
public var endIndex: Index
public subscript(i: Index) -> Element
}
public extension Collection {
public func permutations(
by areInIncreasingOrder: @escaping (Element,Element)->Bool
) -> Permutations<Self>
}
public extension Collection where Element: Comparable {
public func permutations() -> Permutations<Self>
}
This is a much easier API to use:
for anagram in "astronomer".permutations() {
if anagram.elementsEqual("moonstarer") {
print("Literally the only SFW anagram I could find")
}
}
Unlike permute
, this method could validate a precondition that its input was already sorted on construction. Internally, it permutes an array of indices and then uses that to lazily map the base collection. This approach isn't too bad performance-wise. Benchmarks on a prototype (that could probably be optimized) suggest it's about 4x edit: no, make that 20x slower than the in-place variant, with a sequence of 8 ints, assuming you access every element of every permutation (because accessing elements goes via the lazy map).
edit: turns out there was a bug in my in-place algorithm: in-place wins by a long way, the lazy mapping back is pretty expensive
Giving up lexicographic ordering
The above assumes that you want lexicographic ordering of each permutation. If you're willing to allocate storage (which we're doing anyway in the case of the out-of-place version), this is a bit strict and could be loosened. Then, you could permute values that didn't have their own ordering (you use the positional ordering instead), and a different more efficient permutation algorithm (like Heap's algorithm) could be used.
For in-place mutation (that is, copying the elements into a new type that you then permute):
struct Permuter<Element> {
init<S: Sequence>(_ base: S) where S.Element == Element
// array containing the current permutation
var currentPermutation: [Element]
// move `currentPermutation` on to the next permutation,
// returning false when this was the last one
mutating func permute() -> Bool
}
edit: oops, bug in my benchmark! nowhere need the gains from the better algorithm for trivial types – it's still a win if your comparison op is expensive tho:
The allocation is still way more expensive than the gains for trivial types like Int
. Heap's algorithm is about 3x as fast for String
, even including the extra allocation, to run through all permutations of 8 elements, because it only compares integers to do the permutation, not the strings.
You don't see much benefit from out-of-place version because you still have to do the mapping from the permuted indexes to the collection elements – it's still the lazy mapping that takes most of the time.
These come at the cost of the permutations not being in an intuitive order, unlike the lexicographic ordering. Worse, they would sometimes seem to be in lexicographic order, but not always. It's also not clear if these performance benefits are worthwhile – I am guessing they'd usually be outweighed by the processing actually being done on each of the permutations.
Permutations of subsequences
Python's itertools.permutations takes an optional length r
to restrict the size of the permutations i.e.
permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
This is a useful feature, but could probably be added as a second phase.