An important feature of the compiler/stdlib is that it supports code written for previous versions of Swift. Swift 3 and 4 code can live together in the same binary today. Even if we deprecate hashValue in a future release, we will still have to support compiling code written for previous versions of Swift, without forcing users through a disruptive all-or-nothing migration step.
So we need to fully support compiling code that only implements hashValue. For a normal protocol, this should be easy enough to do by supplying default implementations, carefully gated to specific Swift versions:
extension Hashable {
    @available(swift, introduced: 3.x, obsoleted: y)
    public func hash(into hasher: inout Hasher) {
        hasher.append(self.hashValue)
    }

    @available(swift, introduced: 3.x, deprecated: y)
    public var hashValue: Int {
        var hasher = Hasher()
        hash(into: &hasher)
        return hasher.finalize()
    }
}
So far, so good. But then consider what happens when we need to synthesize Hashable in Swift 4 mode.
// Compiled with -swift-version 4
struct Foo: Hashable {
    let a: Int
    let b: String
}
Ideally, the synthesized implementation would look like this:
@derived func hash(into hasher: inout Hasher) {
    hasher.append(a)
    hasher.append(b)
}

@derived var hashValue: Int {
    var hasher = Hasher()
    hash(into: &hasher)
    return hasher.finalize()
}
There are two problems here:
- The synthesized body of hashValue is the same as the default implementation of it in Swift 5 mode; we're duplicating code between the stdlib and the compiler.
- Far, far worse is that in Swift 4 mode, hash(into:) already has an implementation in the stdlib, so the compiler wouldn't consider synthesizing it at all. The default hash(into:) calls hashValue, and the synthesized hashValue calls hash(into:) right back, so Hashable synthesis falls into a recursive hole and doesn't work at all in Swift 4 mode.
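To make the recursive hole concrete, here is a minimal sketch using a stand-in protocol rather than the real Hashable. MiniHasher, MiniHashable and CallLog are hypothetical names invented for this illustration, and the call counter exists only so that the demo can break the cycle instead of overflowing the stack.

```swift
// Hypothetical miniature of the Hashable setup; MiniHasher, MiniHashable
// and CallLog are illustration-only names, not stdlib API.
struct MiniHasher {
    private var state = 0
    mutating func append(_ value: Int) { state = state &* 31 &+ value }
    func finalize() -> Int { return state }
}

// Records calls so the demo can cut the cycle instead of crashing.
final class CallLog {
    var defaultHashIntoCalls = 0
}

protocol MiniHashable {
    var hashValue: Int { get }
    func hash(into hasher: inout MiniHasher)
    var log: CallLog { get } // demo-only requirement
}

extension MiniHashable {
    // Stand-in for the stdlib default that forwards to hashValue.
    func hash(into hasher: inout MiniHasher) {
        log.defaultHashIntoCalls += 1
        if log.defaultHashIntoCalls > 3 { return } // demo-only escape hatch
        hasher.append(hashValue)
    }
}

struct Foo: MiniHashable {
    let a: Int
    let b: Int
    let log = CallLog()

    // Stand-in for the synthesized hashValue. hash(into:) is not
    // synthesized here because the default above already satisfies it.
    var hashValue: Int {
        var hasher = MiniHasher()
        hash(into: &hasher) // resolves to the default, which calls hashValue again
        return hasher.finalize()
    }
}

let foo = Foo(a: 1, b: 2)
let result = foo.hashValue
```

Without the escape hatch the two members would call each other forever. With it, the cycle unwinds after a few rounds, and the fields a and b never reach the hasher at all.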
This is really unfortunate. I see four ways to get out of this conundrum:
- Add magic to the compiler to recognize that we're compiling older code, and to do the actual hashing work in the synthesized hashValue implementation instead:
@derived var hashValue: Int {
    var hasher = Hasher()
    hasher.append(a)
    hasher.append(b)
    return hasher.finalize()
}
I find this an unpalatable solution: Foo may be hashed as part of a larger structure, and in that case spawning a new hasher is not nearly as efficient as simply feeding Foo's components to the existing one. It also means that we need to have two distinct paths for the actual synthesis code (where we need to iterate over components), which is deeply unsatisfying.
- Add deep type checker magic to the compiler to ignore the default implementation of hash(into:) in this particular case, so that we forcibly generate a synthesized implementation. I actually implemented a variant of this in #14935. It did happen to work most of the time; however, it was such a shameful, fragile abuse of the compiler's type checker that I quickly retracted it.
- Remove both default implementations from the stdlib, and move them to the compiler instead. hashValue is always synthesized as the default implementation above. When we need to synthesize hash(into:), we can simply switch between the two alternative implementations depending on whether hashValue resolves to a synthesized or explicit implementation. This is what I ended up implementing in #15122, and it seems to be the cleanest approach. The compiler can easily produce deprecation warnings for certain combinations of implementations and language modes.
- Add a third, hidden requirement to Hashable, called _hash(into:), which is always synthesized by the compiler, and either forwards to hashValue/hash(into:) (if one of them has an explicit implementation) or implements the synthesis itself. This puts several extra twists on top of solution 3 above; in my opinion, it messes up the ABI for no good reason.
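The efficiency concern raised against the first option can be sketched as follows. This assumes the Hasher type found in current Swift toolchains, where the operation written as append elsewhere in this post is spelled combine(_:). Pair, oneShotHash and nestedHash are hypothetical names used only for illustration.

```swift
// Uses the stdlib Hasher from current toolchains; combine(_:) plays the
// role of the append(_:) call used elsewhere in this post.
struct Foo: Hashable {
    let a: Int
    let b: String

    // Streaming form: feeds components to whatever hasher the caller owns.
    func hash(into hasher: inout Hasher) {
        hasher.combine(a)
        hasher.combine(b)
    }
}

// Hypothetical wrapper showing Foo hashed as part of a larger structure.
struct Pair: Hashable {
    let first: Foo
    let second: Foo

    func hash(into hasher: inout Hasher) {
        // One hasher for the whole value: no intermediate finalize() calls.
        first.hash(into: &hasher)
        second.hash(into: &hasher)
    }
}

// What the first option amounts to: every Foo creates and finalizes its
// own Hasher, and only the resulting Int is fed onward.
func oneShotHash(_ foo: Foo) -> Int {
    var hasher = Hasher()
    hasher.combine(foo.a)
    hasher.combine(foo.b)
    return hasher.finalize()
}

func nestedHash(_ pair: Pair) -> Int {
    var hasher = Hasher()
    hasher.combine(oneShotHash(pair.first))  // extra Hasher for first
    hasher.combine(oneShotHash(pair.second)) // extra Hasher for second
    return hasher.finalize()
}

let x = Pair(first: Foo(a: 1, b: "one"), second: Foo(a: 2, b: "two"))
let y = Pair(first: Foo(a: 1, b: "one"), second: Foo(a: 2, b: "two"))
```

Both forms give equal hashes for equal values. The difference is that nestedHash sets up and finalizes three hashers for one Pair, while the streaming form reuses the caller's single hasher throughout.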
I believe solution 3 is the clear winner here; it provides the most satisfactory solution to both of the original problems. By necessity, it involves some compiler work, because SE-0185 added special language support for Hashable. But this is just a fact of life: stdlib and compiler work often goes hand in hand. I don't think changing that would be a viable goal here.
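The decision that solution 3 asks of the synthesizer can be modelled as a small table. This is only a sketch of the rule as described above, not the code in #15122; Conformance, HashValueBody, HashIntoBody and synthesisPlan are hypothetical names.

```swift
// Which body ends up backing hashValue.
enum HashValueBody {
    case userProvided
    case forwardToHashInto // create a hasher, call hash(into:), finalize
}

// Which body ends up backing hash(into:).
enum HashIntoBody {
    case userProvided
    case forwardToHashValue     // hasher.append(self.hashValue)
    case appendStoredComponents // hasher.append(a); hasher.append(b); ...
}

// What the user wrote by hand in a given Hashable conformance.
struct Conformance {
    var hasExplicitHashValue: Bool
    var hasExplicitHashInto: Bool
}

func synthesisPlan(for c: Conformance) -> (hashValue: HashValueBody, hashInto: HashIntoBody) {
    // hashValue is synthesized as the forwarding body unless the user wrote one.
    let hashValue: HashValueBody = c.hasExplicitHashValue ? .userProvided : .forwardToHashInto

    // hash(into:) switches on how hashValue resolved.
    let hashInto: HashIntoBody
    if c.hasExplicitHashInto {
        hashInto = .userProvided
    } else if c.hasExplicitHashValue {
        hashInto = .forwardToHashValue
    } else {
        hashInto = .appendStoredComponents
    }
    return (hashValue: hashValue, hashInto: hashInto)
}

let onlyHashValue = synthesisPlan(
    for: Conformance(hasExplicitHashValue: true, hasExplicitHashInto: false))
let neither = synthesisPlan(
    for: Conformance(hasExplicitHashValue: false, hasExplicitHashInto: false))
```

The property worth noticing is that no combination produces two forwarding bodies at once, which is exactly the cycle that the stdlib-only defaults ran into.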
If you do have a better solution, help us implement it! The stdlib part is on the master branch today (although underscored), and #15122 would be a nice starting point for the synthesizer update.
All of this is implementation detail that may change without notice and is only tangentially related to the topic at hand. I suspect that in this topic, we should concentrate a little more on what we intend to achieve and a little less on how we're getting there.