Hey everyone, I'd like to continue a discussion about sorting ergonomics that I pitched in July. (This is a refined iteration of that pitch, based on great feedback from the original thread).
Today, to sort a Collection
of non- Comparable
elements, we write a closure that returns whether or not two elements areInIncreasingOrder
:
struct Person {
let name: String
let age: Int
}
let people = [
Person(name: "Bob", age: 34),
Person(name: "Alice", age: 34),
Person(name: "Eve", age: 30)]
people.sorted(by: { $0.age < $1.age })
// returns [Eve (30), Bob (34), Alice, (34)]
This example has a few features that you could consider "pain points":
- The call site doesn't form a grammatical english phrase. [1]
- It's relatively easy to mix up
<
and>
, and it isn't immediately obvious from the call site which operator results in which ordering. - Repeating the property for both
$0
and$1
is a touch verbose.
To specify further ordering (for example, sorting by age and then by name), we must write a more complex closure:
people.sorted(by: { lhs, rhs in
if lhs.age != rhs.age {
return lhs.age < rhs.age
} else {
return lhs.name < lhs.name
}
})
// returns [Eve (30), Alice (34), Bob (34)]
This example also has a few pain points:
- The closure is more "imperative" than "declarative". It has to be read closely to grok exactly what it's doing.
- There are several other ways to write such logic, so there is not one singular canonical version.
- It continues to be rather verbose, with six property accessors, and scales poorly as you add more fallbacks.
Proposed Solution
To improve the ergonomics and clarity of these call-sites, I'd like to propose two new sorting APIs:
Sorting with map
closures
If we had a sort
variant that took a closure similar to map
, we could instead write:
people.sorted(by: { $0.age })
// instead of
people.sorted(by: { $0.age < $1.age })
This API is more declarative, less verbose, and has less room for error. The call-site also reads much more like an actual english phrase.
Sorting with SortDescriptor
A simple map
-like sort doesn't help us in the more complex case where we'd like to use additional properties as fallbacks.
NSSortDescriptor
is a great Foundation API for declaratively describing how to sort a collection. If our Person
type was instead an @objc class
instead of a struct
, we could write:
import Foundation
(people as NSArray).sortedArray(using: [
NSSortDescriptor(keyPath: \Person.name, ascending: true),
NSSortDescriptor(keyPath: \Person.age, ascending: true)
]) as! [Person]
This is declaratively an improvement, but isn't an acceptable solution. It shackles us to the Objective-C runtime, can't be used with struct
s, and is unavailable on non-Apple platforms.
If we had a first-class SortDescriptor
type, we could write:
people.sorted(by: [
SortDescriptor({ $0.age }),
SortDescriptor({ $0.name })])
//instead of
people.sorted(by: { lhs, rhs in
if lhs.age != rhs.age {
return lhs.age < rhs.age
} else {
return lhs.name < lhs.name
}
})
This declarative SortDescriptor
approach is a huge improvement over the imperative approach that is currently available. It's less verbose, less complicated, and would scale better as you added additional fallbacks.
Detailed Design
These improvement would be accomplished by introducing two new sort
variants, and a new SortDescriptor
type:
public extension Collection {
public func sorted<T: Comparable>(by value: (Element) throws -> T) rethrows -> [Element]
public func sorted(by sortDescriptors: [SortDescriptor<Element>]) -> [Element]
}
// an equivalent set of mutating `sort(by:)` methods would also be made available.
public struct SortDescriptor<Element> {
public init<Value: Comparable>(_ value: @escaping (Element) -> Value)
public func orders(_ lhs: Element, before rhs: Element) -> Bool
/// Returns a `SortDescriptor` based on the same property, but with a reversed sort order.
public var reversed: SortDescriptor<Element>
}
A playground with concrete implementation details is available here.
Alternatives considered
Sorting using KeyPath
My original pitch in this area focused on KeyPaths rather than map
closures, and yielded call-sites like:
people.sorted(by: \.age)
people.sorted(by: [
SortDescriptor(\.age),
SortDescriptor(\.name)])
The consensus was that using KeyPaths in this way is unprecedented in the Standard Library. It would be unusual, for example, if we allowed people.sorted(by: \.age)
but not people.map(\.age)
.
My take is that KeyPath<Element, Value>
could become interchangeable with (Element) -> Value
, since they have the same semantic meaning. That could be implemented as either by blessing KeyPath
with compiler magic, or through some sophisticated new API. That would provide implicit KeyPath support for most of the Standard Library, but is far out of scope for this proposal.
Schwartzian transform
In the original pitch thread, several people suggested using the Schwartzian transform as part of the implementation, or perhaps offering it as an option.
I experimented with an implementation that uses the Schwartzian transform, and found these results using this implementation:
Simple Accessor ($0.name)
=========================
Standard Library Sort: 0.0113s
Closure-Based Sort: 0.0154s
Schwartzian Sort: 0.0307s
Complex Calculation ($0.arbitraryComplexCalculation)
=========================
Standard Library Sort: 1.074s
Closure-Based Sort: 1.0514s
Schwartzian Sort: 0.0578s
While the Schwartzian sort performance was much better for a complex computed property, it was three-times worse for a simple accessor.
The Swift API Design Guidelines suggests that we should "Document the complexity of any computed property that is not O(1)" and notes that "people often assume that property access involves no significant computation".
I think that, since O(1) computed properties are the expected default, we should focus on that case as the default. We could potentially include a opt-in parameter on the sort(by:)
map
closure implementation, but I didn't include that in this pitch for a few reasons
- That approach wouldn't translate very well to the
SortDescriptor
implementation. - It isn't clear how to present that option in a reasonable way. The term
schwartzian transform
is jargon and should be avoided. - It isn't the primary focus of this pitch.