You're on the right track! Your last hash(into:)
implementation is a great way to hash components whose order doesn't matter for equality.
extension UnorderedPair: Equatable where T: Equatable {
static func == (lhs: UnorderedPair, rhs: UnorderedPair) -> Bool {
return
(lhs.a == rhs.a && lhs.b == rhs.b) ||
(lhs.a == rhs.b && lhs.b == rhs.a)
}
}
extension UnorderedPair: Hashable where T: Hashable {
func hash(into hasher: inout Hasher) {
hasher.combine(a.hashValue ^ b.hashValue)
}
}
Creating and immediately discarding a Set
works great, but as you've seen, it has allocation overhead.
The reason hashValue
comes out first in your hashing benchmarks is that you aren't measuring the same amount of work: the hashValue
case doesn't include the postprocessing Set
and Dictionary
has to do on the returned value to guarantee good dispersion. Set
and Dictionary
never uses the hashValue
as is to find the correct bucket for elements -- they always need to run the value through a hasher to reduce the chance of performance regressions due to hash value clustering:
https://github.com/apple/swift/blob/swift-4.2-branch/stdlib/public/core/Dictionary.swift#L2560-L2566
As a general rule, conforming to Hashable
by implementing hashValue
is never faster than implementing hash(into:)
and feeding the same value to it. (It is, however, often slower.)
A benchmark that would capture real-life hashing performance is to measure the time it takes to insert values of your type into a Set
. In the case of UnorderedPair
, the two implementations come out as exactly the same in this benchmark.
There is no reason whatsoever to ever implement hashValue
in code targeting Swift 4.2 or above.