[Pitch] Keypath-based sort/min/max Sequence operations

SE-0249 introduced the ability to use keypaths for a certain method signature interchangably with functions with the same signature as function arguments. So, for instance, the line users.map( { $0.email }) can be interchanged with the line users.map(\.email).

What I would like to propose is that, through no change to the language itself, but only some additions to the standard library, to allow keypaths to be used in certain Sequence methods that accept a closure with two parameters of type Sequence.Element, such as (Element, Element) -> Bool as used by Sequence.sort, Sequence.min and Sequence.max.

Such a method would allow for briefer code with less repetition for most uses of sort/sorted, min and max, which are usually written as sorted { $0.count < $1.count } or max { $0.weight < $1.weight }, accessing the same property on both the left hand and right hand side of the comparison operator.

Using keypaths instead, the line sorted { $0.count < $1.count } could instead be written as sorted(by: \.count), with the sorted(by keyPath: KeyPath<Element>) -> [Element] method calling the current sorted(by areInIncreasingOrder: (Element, Element) -> Bool) -> [Element] method. An optional extra parameter could supply a custom areInIncreasingOrder function for comparing elements.

A simplistic implementation of sort, min and max using keypaths could look like this:

extension Sequence {
    /// Returns the elements of the sequence, sorted by a (`Comparable`) property indicated by the keypath, optionally using the given predicate as the comparison between elements.
    func sorted<T: Comparable>(by keyPath: KeyPath<Self.Element, T>, using sortingFunction: (T, T) -> Bool = (<)) -> [Self.Element] {
        self.sorted(by: { sortingFunction($0[keyPath: keyPath], $1[keyPath: keyPath]) })
    }
    
    /// Returns the maximum element in the sequence, using the given predicate as the comparison between elements.
    func max<T: Comparable>(by keyPath: KeyPath<Self.Element, T>, using comparingFunction: (T, T) -> Bool = (<)) -> Self.Element? {
        self.max(by: { comparingFunction($0[keyPath: keyPath], $1[keyPath: keyPath]) })
    }
    
    /// Returns the minimum element in the sequence, using the given predicate as the comparison between elements.
    func min<T: Comparable>(by keyPath: KeyPath<Self.Element, T>, using comparingFunction: (T, T) -> Bool = (<)) -> Self.Element? {
        self.min(by: { comparingFunction($0[keyPath: keyPath], $1[keyPath: keyPath]) })
    }
}

This is primarily a quality-of-life pitch, but I think it fits in with the pattern established by SE-0249 and would allow for cleanlier map/flatMap/reduce pipelines by reducing repetition and a certain amount of visual clutter, just as map(\Root.keyPath) has done.

EDIT:
As @ole has pointed out, similar pitches have been made before, notably by @cal in Sort `Collection` using `KeyPath` (which was pitched before using keypaths in place of closures for higher-order functions made it into the language).
In a follow-up comment posted three years later (after the above change came into effect), @olbo points out that the interchangability of keypaths and closures means that the function can be generalised to accept either form. This means that the implementation of sorted mentioned above can be changed from:

    func sorted<T: Comparable>(by keyPath: KeyPath<Self.Element, T>, using sortingFunction: (T, T) -> Bool = (<)) -> [Self.Element] {
        self.sorted(by: { sortingFunction($0[keyPath: keyPath], $1[keyPath: keyPath]) })
    }

into:

    func sorted<T: Comparable>(by keyPath: (Self.Element) -> T, using sortingFunction: (T, T) -> Bool = (<)) -> [Self.Element] {
        self.sorted(by: { sortingFunction(keyPath($0), keyPath($1)) })
    }

@lovee has also pitched a form like this for min/max in Sort(by:) min(by:) max(by:) with keyPaths where he mentions two potential forms for calling the max function:
let max = sequence.max(by: \.property) or
let max = sequence.max(by: { $0.getSomeValue(from: anotherData) }).

Finally, @anthonylatsis proposed Map Sorting in 2019, with the added twist of an isExpensiveTransform parameter to tell the sorting algorithm to cache compared values in case they may be expensive to fetch, as in the let max = sequence.max(by: { $0.getSomeValue(from: anotherData) }) example above where getSomeValue may be called repeatedly during the sort. @cal also contributed to that thread with some statistics from the Swift Compatibility Suite regarding typical use cases of sort and sorted, and what optional comparison function was supplied.

That proposal never made it into review, with a follow-up comment:

The discussion has since stalled, and I would appreciate input from other Swift users here.

20 Likes

I would love to see something like this in the standard library. This topic has been discussed several times before. It would be helpful if you could try to search the forums and link to (and even better, summarize) previous discussions.

It's worth noting that Apple added this feature to Foundation in 2021, so it is available on Apple platforms if your deployment target is at least iOS 15/macOS 12. (It’s super unclear to me if there's a plan to bring Darwin Foundation improvements to Corelibs Foundation in a timely manner.)

Example:

import Foundation

struct Person {
  var name: String
  var age: Int
}

let unsorted = [
  Person(name: "Alice", age: 20),
  Person(name: "Bob", age: 21),
  Person(name: "Alice", age: 30),
  Person(name: "Bob", age: 31),
]

let sorted = unsorted.sorted(using: [
  KeyPathComparator(\.name),
  KeyPathComparator(\.age, order: .reverse)
])
// Returns: Alice/30, Alice/20, Bob/31, Bob/21
9 Likes

Thank you, Ole, I hadn't noticed that this had been pitched before even though I try to keep and eye on this subforum.

I stumbled upon the SortComparator and KeyPathComparator last year in the iOS 15 SDK, but it was undocumented then, so I promptly forgot about it.

@idrougge Thanks for adding the summary!

Just an API design suggestion:

Since the existing sort functions use "by" for the predicate/operator (with the sequence's elements being the operands), I think using "by" for the operands in the pitched new functions could lead to confusion, especially in the context of choosing from code completions. What do you think of using the preposition "on", with the sense of "the predicate operates on the operands":

// existing
func sorted(
    by areInIncreasingOrder: (Element, Element) throws -> Bool
) rethrows -> [Element]

// pitched
func sorted<T: Comparable>(
    on operand: (Element) -> T, 
    by areInIncreasingOrder: (T, T) throws -> Bool = (<)
) rethrows -> [Element]
5 Likes

This idea is tracked in swift-algorithms by the following issue:

https://github.com/apple/swift-algorithms/issues/89

4 Likes

Absolutely. Some earlier pitches used exactly that sorted(on:by:) form and it is a much better fit.

+1 on the Pitch.

I've been using this code in my projects for a while. I prefer sorted(by:using:) as the prototype, as in the pitch.

Also sum(), which I think has also been pitched before.

extension Sequence where Element: AdditiveArithmetic {
  public func sum() -> Element {
    reduce(.zero, +)
  }
}

extension Sequence {
  /// let result = list.sum(\.count)
  public func sum<T: AdditiveArithmetic>(_ predicate: (Element) -> T) -> T {
    reduce(.zero) { $0 + predicate($1) }
  }
}
1 Like

= (<) seems like it would work, but it doesn't well enough. Default closures always require try, not respecting rethrows. So you need an overload.

/// Sorted by a common `Comparable` value.
func sorted<Comparable: Swift.Comparable>(
  by comparable: (Element) throws -> Comparable
) rethrows -> [Element] {
  try sorted(by: comparable, <)
}

KeyPathComparator, as the name implies, only works with key paths. We need something that works with closures.

I kind of hate this. It's unwieldy for greater than 2-arity and it doesn't match up visually like KeyPathComparator initializer calls do. Can you do better?

func sorted<Comparable0: Comparable, Comparable1: Comparable>(
  orders: (SortOrder, SortOrder),
  _ comparables: (Element) throws -> (Comparable0, Comparable1)
) rethrows -> [Element] {
  try sorted {
    let comparables = try (comparables($0), comparables($1))
    switch orders {
    case (.forward, .forward):
      return comparables.0 < comparables.1
    case (.forward, .reverse):
      return (comparables.0.0, comparables.1.1) < (comparables.1.0, comparables.0.1)
    case (.reverse, .forward):
      return (comparables.1.0, comparables.0.1) < (comparables.0.0, comparables.1.1)
    case (.reverse, .reverse):
      return (comparables.1.0, comparables.1.1) < (comparables.0.0, comparables.0.1)
    }
  }
}

This is super useful, I often make this extension myself. But it’s not really about keypaths as such, it’s about providing a transform that makes the element comparable. One way to provide that transform is through a KeyPath that points to a property of Element, but that’s a special case, and isn’t it true that it will work automatically if we make a sort that takes (Element) -> Comparable?

1 Like

@idrougge would you be so kind and fix the min function inside your pitch? It should use the < operator instead of > in the default comparingFunction since it uses min instead of max in the body. Asking cause I copy your code every few months and then each time spend a couple of hours debugging and I'm probably not the only one - Google points here for "sequence min by keypath" query.

Unfortunately, I cannot edit the post after so much time. Perhaps an admin can fix it?

It looks like I can edit it. If you post what you want updated here I can try to do it (I don't want to guess even though I think I know, in case I get it wrong and make things worse :)

Done

1 Like

Thank you so much @Ben_Cohen and @idrougge !