Discussion: permutation algorithm

Generating the different permutations for a collection is a generally useful feature found in other standard libraries (e.g. C++, Python, Guava for Java). It's pretty tricky to implement your own generic permutation algorithm, and easy to write one inefficiently or incorrectly, and as such belong in the Swift standard library. There are a few different options to discuss, though.

In-place permute

C++'s std::next_permution provides good inspiration for an in-place mutating permute operation:

/// Reorder elements in-place into the next permutation, returning false when
/// the collection has been permuted into lexicographical order. To cycle
/// through all permutations, start with a lexicographically-sorted collection
/// and permute until it returns false.
///
/// - Complexity: O(*n*), where *n* is the length of the collection.
public extension MutableCollection where Self: BidirectionalCollection {
  public mutating func permute(
    by areInIncreasingOrder: (Element,Element) throws ->Bool
  ) rethrows -> Bool {
}

/// Usual predicate-less version for convenience
public extension MutableCollection
where Self: BidirectionalCollection, Element: Comparable {
    public mutating func permute() -> Bool
}

This API is easy to use once you understand it:

// print out all the permutations of 1,2,3,4
var a = Array(1...4)
repeat {
    print(a)
} while a.permute()

It's also nice because it needs no extra storage. The collection that you're mutating is itself the storage of the current permutation state.

However, it does have a hidden gotcha: you need to remember to sort the collection before you begin:

var a = [2,4,1,3]
// forgot: a.sort()
repeat {
  // unexpectedly only prints out 8 permutations
  print(a)
} while a.permute()

There's not much to be done about this with the low-level API. permute can't check its input is sorted, because of the way it's used.

Which leads to the next part: providing an out-of-place API

Out-of-place permutations

The usual pattern of permute for in-place and permuted for out-of-place doesn't really work here, for two reasons:

  • permute returns a boolean to indicate it's come around to the sorted permutation. permuted would also need to return something. It could return a ([Element],Bool) tuple or an [Element]? value. But both these APIs would be prohibitively inconvenient to use in practice.
  • Unlike other out-of-place APIs like filter or sorted, you rarely want to perform just one permutation. You usually want to cycle through many of them. And there can be a lot! You probably don't want to be newing n! arrays, no matter how much you dislike using var.

Instead it probably makes more sense to generate a semi-lazy Permutations view:

struct Permutations<Elements: Collection>: Collection {
  public typealias Element = LazyMapCollection<[Elements.Index],Elements.Element>
  public struct Index: Comparable { }
  public func index(after i: Index) -> Index
  public var startIndex: Index
  public var endIndex: Index
  public subscript(i: Index) -> Element
}

public extension Collection {
  public func permutations(
    by areInIncreasingOrder: @escaping (Element,Element)->Bool
  ) -> Permutations<Self>
}

public extension Collection where Element: Comparable {
  public func permutations() -> Permutations<Self>
}

This is a much easier API to use:

for anagram in "astronomer".permutations() {
    if anagram.elementsEqual("moonstarer") {
        print("Literally the only SFW anagram I could find")
    }
}

Unlike permute, this method could validate a precondition that its input was already sorted on construction. Internally, it permutes an array of indices and then uses that to lazily map the base collection. This approach isn't too bad performance-wise. Benchmarks on a prototype (that could probably be optimized) suggest it's about 4x edit: no, make that 20x slower than the in-place variant, with a sequence of 8 ints, assuming you access every element of every permutation (because accessing elements goes via the lazy map).

edit: turns out there was a bug in my in-place algorithm: in-place wins by a long way, the lazy mapping back is pretty expensive

Giving up lexicographic ordering

The above assumes that you want lexicographic ordering of each permutation. If you're willing to allocate storage (which we're doing anyway in the case of the out-of-place version), this is a bit strict and could be loosened. Then, you could permute values that didn't have their own ordering (you use the positional ordering instead), and a different more efficient permutation algorithm (like Heap's algorithm) could be used.

For in-place mutation (that is, copying the elements into a new type that you then permute):

struct Permuter<Element> {
  init<S: Sequence>(_ base: S) where S.Element == Element
  // array containing the current permutation
  var currentPermutation: [Element]
  // move `currentPermutation` on to the next permutation,
  // returning false when this was the last one
  mutating func permute() -> Bool
}

edit: oops, bug in my benchmark! nowhere need the gains from the better algorithm for trivial types – it's still a win if your comparison op is expensive tho:

The allocation is still way more expensive than the gains for trivial types like Int. Heap's algorithm is about 3x as fast for String, even including the extra allocation, to run through all permutations of 8 elements, because it only compares integers to do the permutation, not the strings.

You don't see much benefit from out-of-place version because you still have to do the mapping from the permuted indexes to the collection elements – it's still the lazy mapping that takes most of the time.

These come at the cost of the permutations not being in an intuitive order, unlike the lexicographic ordering. Worse, they would sometimes seem to be in lexicographic order, but not always. It's also not clear if these performance benefits are worthwhile – I am guessing they'd usually be outweighed by the processing actually being done on each of the permutations.

Permutations of subsequences

Python's itertools.permutations takes an optional length r to restrict the size of the permutations i.e.

permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC

This is a useful feature, but could probably be added as a second phase.

6 Likes

What are use cases that require lexicographical ordering? Are those likely to outnumber or be outnumbered by use cases where lexicographical ordering does not matter, in which cases the user could be tripped up by an unnecessary precondition?

An alternative to the lazy map out-of-place would be an in-place permute() as in your first option, along with a Sequence/Iterator that just wraps up the in-place permute:

struct Permuter<T> : Sequence, IteratorProtocol {
  typealias Iterator = Permuter
  typealias Element = [T]

  var array: [T]
  let areInIncreasingOrder: (T, T) -> Bool

  init(array: [T], by areInIncreasingOrder: @escaping (T, T) -> Bool) {
    self.array = array
    self.areInIncreasingOrder = areInIncreasingOrder
    self.array.sort(by: areInIncreasingOrder)
  }

  mutating func next() -> Element? {
    return array.permute(by: areInIncreasingOrder) ? array : nil
  }
}

extension Collection where Element: Comparable {
  func permutations() -> Permuter<Element> {
    return Permuter(array: Array(self), by: <)
  }
}

This would provide the easy permutations() API at the cost of an extra array of space (and collections that you go through the permutations of can't be that big), rather than the lazy mapping cost.

1 Like

If the Permuter construction is the last use of the array you give it, then it should be able to take ownership of it, so this would at least theoretically be able to avoid copying the array. If you wanted to be able to get the array back out, you could throw a __consuming accessor on there too:

extension Permuter {
  __consuming func takeArray() -> [T] { return array }
}

or have a with block form that uses the array as permutation storage for a scoped duration:

withPermuter(for: &array) { permuter in ... }

+1

I have written by own extensions to generate permutations and combinations, and always meant to revisit the notion of creating a generator to do the same, as the space required to generate all permutations or combinations grows exponentially.

I would suggest one thing. Also support the ability to generate combinations, as well as permutations.