[Draft] Automatically deriving Equatable and Hashable for certain value types

From: "L. Mihalkovic" <laurent.mihalkovic@gmail.com>
Date: May 25, 2016 at 10:08:35 PM GMT+2
To: Tony Allevato <allevato@google.com>
Subject: Re: [swift-evolution] [Draft] Automatically deriving Equatable and Hashable for certain value types

There are precedents that show that something like this needs to provide the ability for users to influence the heuristic. Something like this comes to mind (not limited to let/var fields but allowing to mark value returning method/properties)

Struct xxxxxx {
    @strawman_id Var f1
    @strawman_id func getabc() -> Int
    Var nonIdField
}

@strawman_all_id Struct xxxxxx {
    Var f1
    Var f2
    Var f3
}

whereby the compiler can be told what is id versus not. That should eliminate many manual override situations. There is at least 10 years of precedent showing the usefulness and/or pitfall or such schemes.

Problem is I think this is something that can probably be equally done using the upcoming behaviors, or via the further down the roard macros. Either way, I'm not sure it belongs anywhere in the form described.

···

Begin forwarded message:

On May 25, 2016, at 8:28 PM, Tony Allevato via swift-evolution <swift-evolution@swift.org> wrote:

I was inspired to put together a draft proposal based on an older discussion in the Universal Equality, Hashability, and Comparability thread <http://thread.gmane.org/gmane.comp.lang.swift.evolution/8919/&gt; that recently got necromanced (thanks Mark Sands!).

I'm guessing that this would be a significant enough change that it's not possible for the Swift 3 timeline, but it's something that would benefit enough people that I want to make sure the discussion stays alive. If there are enough good feelings about it, I'll move it from my gist into an actual proposal PR.

Automatically deriving Equatable andHashable for value types
Proposal: SE-0000
Author(s): Tony Allevato
Status: Awaiting review
Review manager: TBD
Introduction

Value types are prevalent throughout the Swift language, and we encourage developers to think in those terms when writing their own types. Frequently, developers find themselves writing large amounts of boilerplate code to support equatability and hashability of value types. This proposal offers a way for the compiler to automatically derive conformance toEquatable and Hashable to reduce this boilerplate, in a subset of scenarios where generating the correct implementation is likely to be possible.

Swift-evolution thread: Universal Equatability, Hashability, and Comparability

Motivation

Building robust value types in Swift can involve writing significant boilerplate code to support concepts of hashability and equatability. Equality is pervasive across many value types, and for each one users must implement the == operator such that it performs a fairly rote memberwise equality test. As an example, an equality test for a struct looks fairly uninteresting:

func ==(lhs: Foo, rhs: Foo) -> Bool {
  return lhs.property1 == rhs.property1 &&
         lhs.property2 == rhs.property2 &&
         lhs.property3 == rhs.property3 &&
         ...
}
What's worse is that this operator must be updated if any properties are added, removed, or changed, and since it must be manually written, it's possible to get it wrong, either by omission or typographical error.

Likewise, hashability is necessary when one wishes to store a value type in a Set or use one as a multi-valuedDictionary key. Writing high-quality, well-distributed hash functions is not trivial so developers may not put a great deal of thought into them – especially as the number of properties increases – not realizing that their performance could potentially suffer as a result. And as with equality, writing it manually means there is the potential to get it wrong.

In particular, the code that must be written to implement equality for enums is quite verbose. One such real-world example (source):

func ==(lhs: HandRank, rhs: HandRank) -> Bool {
  switch (lhs, rhs) {
  case (.straightFlush(let lRank, let lSuit), .straightFlush(let rRank , let rSuit)):
    return lRank == rRank && lSuit == rSuit
  case (.fourOfAKind(four: let lFour), .fourOfAKind(four: let rFour)):
    return lFour == rFour
  case (.fullHouse(three: let lThree), .fullHouse(three: let rThree)):
    return lThree == rThree
  case (.flush(let lRank, let lSuit), .flush(let rRank, let rSuit)):
    return lSuit == rSuit && lRank == rRank
  case (.straight(high: let lRank), .straight(high: let rRank)):
    return lRank == rRank
  case (.threeOfAKind(three: let lRank), .threeOfAKind(three: let rRank)):
    return lRank == rRank
  case (.twoPair(high: let lHigh, low: let lLow, highCard: let lCard),
        .twoPair(high: let rHigh, low: let rLow, highCard: let rCard)):
    return lHigh == rHigh && lLow == rLow && lCard == rCard
  case (.onePair(let lPairRank, card1: let lCard1, card2: let lCard2, card3: let lCard3),
        .onePair(let rPairRank, card1: let rCard1, card2: let rCard2, card3: let rCard3)):
    return lPairRank == rPairRank && lCard1 == rCard1 && lCard2 == rCard2 && lCard3 == rCard3
  case (.highCard(let lCard), .highCard(let rCard)):
    return lCard == rCard
  default:
    return false
  }
}
Crafting a high-quality hash function for this enum would be similarly inconvenient to write, involving another large switchstatement.

Swift already provides implicit protocol conformance in some cases; notably, enums with raw values conform toRawRepresentable, Equatable, and Hashable without the user explicitly declaring them:

enum Foo: Int {
  case one = 1
  case two = 2
}

let x = (Foo.one == Foo.two) // works
let y = Foo.one.hashValue // also works
let z = Foo.one.rawValue // also also works
Since there is precedent for this in Swift, we propose extending this support to more value types.

Proposed solution

We propose that a value type be Equatable/Hashable if all of its members are Equatable/Hashable, with the result for the outer type being composed from its members.

Specifically, we propose the following rules for deriving Equatable:

A struct implicitly conforms to Equatable if all of its fields are of types that conform to Equatable – either explicitly, or implicitly by the application of these rules. The compiler will generate an implementation of ==(lhs: T, rhs: T)that returns true if and only if lhs.x == rhs.x for all fields x in T.

An enum implicitly conforms to Equatable if all of its associated values across all of its cases are of types that conform to Equatable – either explicitly, or implicitly by the application of these rules. The compiler will generate an implementation of ==(lhs: T, rhs: T) that returns true if and only if lhs and rhs are the same case and have payloads that are memberwise-equal.

Likewise, we propose the following rules for deriving Hashable:

A struct implicitly conforms to Hashable if all of its fields are of types that conform to Hashable – either explicitly, or implicitly by the application of these rules. The compiler will generate an implementation of hashValue that uses a pre-defined hash function† to compute the hash value of the struct from the hash values of its members.

Since order of the terms affects the hash value computation, we recommend ordering the terms in member definition order.

An enum implicitly conforms to Hashable if all of its associated values across all of its cases are of types that conform to Hashable – either explicitly, or implicitly by the application of these rules. The compiler will generate an implementation of hashValue that uses a pre-defined hash function† to compute the hash value of an enum value by using the case's ordinal (i.e., definition order) followed by the hash values of its associated values as its terms, also in definition order.

† We leave the exact definition of the hash function unspecified here; a multiplicative hash function such as Kernighan and Ritchie or Bernstein is easy to implement, but we do not rule out other possibilities.

Overriding defaults

Any user-provided implementations of == or hashValue should override the default implementations that would be provided by the compiler. This is already possible today with raw-value enums so the same behavior should be extended to other value types that are made to implicitly conform to these protocols.

Open questions

Omission of fields from generated computations

Should it be possible to easily omit certain properties from automatically generated equality tests or hash value computation? This could be valuable, for example, if a property is merely used as an internal cache and does not actually contribute to the "value" of the instance. Under the rules above, if this cached value was equatable, a user would have to override == and hashValue and provide their own implementations to ignore it. If there is significant evidence that this pattern is common and useful, we could consider adding a custom attribute, such as @transient, that would omit the property from the generated computations.

Explicit or implicit derivation

As with raw-value enums today, should the derived conformance be completely explicit, or should users have to explicitly list conformance with Equatable and Hashable in order for the compiler to generate the derived implementation?

Impact on existing code

This change will have no impact on existing code because it is purely additive. Value types that already provide custom implementations of == or hashValue but satisfy the rules above would keep the custom implementation because it would override the compiler-provided default.

Alternatives considered

The original discussion thread also included Comparable as a candidate for automatic generation. Unlike equatability and hashability, however, comparability requires an ordering among the members being compared. Automatically using the definition order here might be too surprising for users, but worse, it also means that reordering properties in the source code changes the code's behavior at runtime. (This is true for hashability as well if a multiplicative hash function is used, but hash values are not intended to be persistent and reordering the terms does not produce a significant behavioral change.)

Acknowledgments

Thanks to Joe Groff for spinning off the original discussion thread, Jose Cheyo Jimenez for providing great real-world examples of boilerplate needed to support equatability for some value types, and to Mark Sands for necromancing the swift-evolution thread that convinced me to write this up.

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution