Custom equality/hash for Sets


(Jacob Bandes-Storch) #1

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using ===
rather than ==. This doesn't seem possible today, using Set, without
creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with
NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map
<http://en.cppreference.com/w/cpp/container/unordered_map> and friends have
template parameters specifying the hash function and equality comparator,
which use std::hash and == by default.

(Apologies if this has been discussed already; I haven't seen it.)
Jacob


(Dave Abrahams) #2

It might make sense. How bad is the wrapper solution for user code?

···

on Thu Feb 18 2016, Jacob Bandes-Storch <swift-evolution@swift.org> wrote:

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using ===
rather than ==. This doesn't seem possible today, using Set, without
creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with
NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map
<http://en.cppreference.com/w/cpp/container/unordered_map> and friends have
template parameters specifying the hash function and equality comparator,
which use std::hash and == by default.

--
-Dave


(Dmitri Gribenko) #3

It might make sense, but we should keep in mind that doing this will
prevent inlining of hashValue/== in most cases.

Dmitri

···

On Thu, Feb 18, 2016 at 2:58 PM, Jacob Bandes-Storch via swift-evolution <swift-evolution@swift.org> wrote:

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/


#4

To me it seems like a future direction of the property behavior proposal.
There could be a behavior like "pointerEquatable" which is used as a transparent wrapper:

        bahavior pointerEquatable<Value: AnyObject> : Hashable {
                var value: Value

                var hashValue: Int {
                        // hash value from pointer
                }
                // omitting get/set
        }

        func == <T>(p1: pointerEquatable<T>, p2: pointerEquatable<T>) -> Bool {
                return p1.value === p2.value
        }

        // possible usage
        let set: Set<pointerEquatable<MyClass>> = [MyClass(), MyClass()]

This solution could then be used for any type not only Sets.

Best regards
- Maximilian

···

Am 18.02.2016 um 23:58 schrieb Jacob Bandes-Storch via swift-evolution <swift-evolution@swift.org>:

Would it make sense for the standard library Set to provide variants (or parallel versions of the same data structure) that take custom hashValue/== implementations at init time (functions taking in Elements), rather than relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using === rather than ==. This doesn't seem possible today, using Set, without creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a Swiftier API for NSHashTable — including ArrayLiteralConvertible, using generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map and friends have template parameters specifying the hash function and equality comparator, which use std::hash and == by default.

(Apologies if this has been discussed already; I haven't seen it.)
Jacob
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Joe Groff) #5

At an implementation level, we already pass a bundle of function pointers for the Equatable and Hashable conformances as part of the generic calling convention. In the implementation model, protocol conformances are independent entities, so technically the same type could conform to the same protocol multiple times in different ways. We could theoretically let you provide a scoped private conformance to Hashable overriding the default one:

func foo() {
  extension Int: private Hashable {
    var hashValue: Int { return myHash(self) }
  }

  // This dictionary would use the local Int: Hashable
  let dict: Foo<Int, Int> = [...]

  use(dict)
}

Now, there would definitely be weird effects to this, since it allows for types that look the same but are formally different, since the `Key: Hashable` conformance is an implicit parameter to the type. For example, this wouldn't work:

func returnMyWeirdDictionary() -> Dictionary<Int, Int> {
  extension Int: private Hashable {
    var hashValue: Int { return myHash(self) }
  }

  // This dictionary would use the local Int: Hashable
  let dict: Foo<Int, Int> = [...]

  // ERROR: dict has type Dictionary<Int, Int> (using local Int: Hashable conformance),
  // but return type is declared Dictionary<Int, Int> (using default Int: Hashable conformance)
  return dict
}

It also has interesting interactions with the runtime's support for protocol conformance lookup via `x as? P` casts—if there's more than one P conformance for the dynamic type of x, which one should the runtime pick?

-Joe

···

On Feb 18, 2016, at 2:58 PM, Jacob Bandes-Storch via swift-evolution <swift-evolution@swift.org> wrote:

Would it make sense for the standard library Set to provide variants (or parallel versions of the same data structure) that take custom hashValue/== implementations at init time (functions taking in Elements), rather than relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using === rather than ==. This doesn't seem possible today, using Set, without creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a Swiftier API for NSHashTable — including ArrayLiteralConvertible, using generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map <http://en.cppreference.com/w/cpp/container/unordered_map> and friends have template parameters specifying the hash function and equality comparator, which use std::hash and == by default.

(Apologies if this has been discussed already; I haven't seen it.)
Jacob


(plx) #6

I think the points already-made about the weak semantics of such a type are well-founded and should curtail the idea of changing `Set` itself to take in optional “customized” functions for equality/hash.

However, this scenario, I think, makes an *excellent* testbed for Swift 3’s hopefully-still-incoming "conditional conformance” functionality (e.g. the thing that would let you say `Array:Equatable where Element:Equatable`).

Within Swift as it is right now, if you really want a “customizable set” as-requested, about the best you can do at this time is like so:

// step 1: a convenience protocol
protocol SetValueWrapper : Equatable, Hashable {
  typealias Value // note: it’s useful to require `:Equatable`, but not necessary
  var value: Value { get }
  init(_ value: Value)
}

// step 2: a convenience wrapper around `Set`
//
// re-implement as much of the `Set` APIs as you need,
// but in a way that lets you ignore internal use of `W`
//
// Note that in practice you may need to write this as
// struct WrappedValueSet<V,W:SetValueWrapper where V==W.Value>,
// as I’ve run into bugs where the compiler needs that `V` to figure things out.
// I’ve written it as it should be, not how it may need to be to use today.
struct WrappedValueSet<W:SetValueWrapper> {
   private var storage: Set<W>

  // example re-implementations:
  //
  func contains(element: W.Element) -> Bool { return self.storage.contains(W(element))

  mutating func insert(element: W.Element) { self.storage.insert(W(element))

}

// step 3: per each customized equality/hash you need, write a wrapper;
// e.g., here is a complete “ObjectPointer” wrapper as per the original request:
struct ObjectPointer<T:AnyObject> : SetValueWrapper {

  typealias Value: T

  let value: T
  
  init(_ value: T) { self.value = value }

  var hashValue: Int { return ObjectIdentifier(value).hashValue

}

func ==<T>(lhs: ObjectPointer<T>, rhs: ObjectPointer<T>) -> Bool {
  return lhs.value === rhs.value
}

…and you’re done; the cost is basically one tedious session of re-implementing the Set-related APIs you want on your wrapper, and then one (short!) wrapper for each custom equality/hash combo you need.

This isn’t *great*, but it seems perfectly-reasonable to me when weighed against the drawbacks of a `Set`-like thing that took custom logic in its `init`.

With conditional-conformances in place, you can also improve your quality of life a lot; it’s a bit tricky, but you could — if conditional conformances work as I expect they will — use some trickery to “punch out” `Equatable` and `Hashable`, like so:

/// Basic protocol for “this is a wrapper”.
protocol ValueWrapper {
  typealias Value
  var value: Value { get }
  init(_ value: Value)
}

/// Specialized-wrapper-of-wrapper that is meant to source:
/// - Equatable, Hashable from the wrapped-value
/// - everything else (as needed) from the wrapped-value’s wrapped-value
///
/// …which means we can keep adding utility conformances here based on `W.Value`’s implementations,
/// while still having “punched-out” W’s native possible `==` and `hashValue` implementations in favor of
/// whatever implementations thereof are supplied by W.
struct WrapperWrapper<W:protocol<WrappedValue,Equatable,Hashable> where W.Value:WrappedValue> : WrappedValue {

  typealias Value = W.Value // note it somewhat hides the existence of the inner wrapper
  
  private let storage: W
  var value: Value { return self.storage.value }

  init(_ value: Value) { self.storage = W(value) }
  
  // note it uses the wrapper’s (customized) hashValue implementation
  var hashValue: Int {
   return self.storage.hashValue
  }

}

// note again this uses the *wrapper*’s implementation:
func ==<W>(lhs: WrapperWrapper<W>, rhs: WrapperWrapper<W>) -> Bool {
  return lhs.storage == rhs.storage // use wrapper’s (customized) equality
}

// but now we can start adding nice-to-have conformances based on `W.Value`
extension WrapperWrapper:CustomStringConvertible where W.Value:CustomStringConvertible {
  
  var description: String { return self.value.description }

}

// and so on and so forth, as-needed...

Although this still doesn't free you up from writing the “custom ==/hash wrappers”, if you rewrite the set-wrapper from step 2 in terms of a `WrapperWrapper` (hopefully with a better name!), then the values that are stored in the Set will pick up protocol conformances of interest from the underlying value, and thus trigger any conditional-conformances that are defined e.g. on `Set`, making it easy for you to add them to your own wrapper if you want, and so on.

Is this *great*? Arguably not, but I think it’s a reasonable situation, especially since the tedious parts (set-wrapper, wrapper-wrapper) are each write-once, re-use often, and the per-customization chores are really short (~5-10 lines, mostly boilerplate).

And again, you don’t *need* the set-wrapper or wrapper-wrapper, they just streamline the sites-of-use of such constructs.

Some form of actual “struct inheritance” might reduce the need to manually emulate it with protocols like the above, but protocols + conditional-conformance let you emulate enough of that feature to work out “OK” in this case, I think.

···

On Feb 18, 2016, at 4:58 PM, Jacob Bandes-Storch via swift-evolution <swift-evolution@swift.org> wrote:

Would it make sense for the standard library Set to provide variants (or parallel versions of the same data structure) that take custom hashValue/== implementations at init time (functions taking in Elements), rather than relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using === rather than ==. This doesn't seem possible today, using Set, without creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a Swiftier API for NSHashTable — including ArrayLiteralConvertible, using generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map <http://en.cppreference.com/w/cpp/container/unordered_map> and friends have template parameters specifying the hash function and equality comparator, which use std::hash and == by default.

(Apologies if this has been discussed already; I haven't seen it.)
Jacob
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Jordan Rose) #7

struct CustomHashableFoo: Hashable {
  var value: Foo
  func hash() -> Int {
    // custom hash function here
  }
}
func ==(a: CustomHashableWrapped, b: CustomHashableWrapped) {
  // custom == here
}

Really not that bad, although you do have to get 'value' in and out of the box. It's also not reusable code—you have to rewrite the box for every type.

I'd say you usually don't want to allow custom hash/== closures because (a) then you have to store them somewhere, and (b) the compiler won't usually be able to inline them away. You also end up with a Set<Foo> that doesn't behave like a normal Set<Foo>—maybe you can insert equal-but-not-identical elements—which is bad if anyone's relying on normal Set-like guarantees.

-1 from me until we can put functions in generics, if ever.

Jordan

···

On Feb 18, 2016, at 16:09 , Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

on Thu Feb 18 2016, Jacob Bandes-Storch <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using ===
rather than ==. This doesn't seem possible today, using Set, without
creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with
NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map
<http://en.cppreference.com/w/cpp/container/unordered_map> and friends have
template parameters specifying the hash function and equality comparator,
which use std::hash and == by default.

It might make sense. How bad is the wrapper solution for user code?


(Thorsten Seitz) #8

Needing a Set of identical reference objects, i.e. comparing with ===, has come up quite regularly in the projects I have worked on (as well as indexing by identity in a Dictionary). This might be mitigated in Swift due to a more prominent role of value types but I would still expect it to be useful.

-Thorsten

···

Am 19.02.2016 um 05:39 schrieb Dmitri Gribenko via swift-evolution <swift-evolution@swift.org>:

On Thu, Feb 18, 2016 at 2:58 PM, Jacob Bandes-Storch via > swift-evolution <swift-evolution@swift.org> wrote:

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

It might make sense, but we should keep in mind that doing this will
prevent inlining of hashValue/== in most cases.

Dmitri

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Andrew Bennett) #9

*Wrapper Solution?*

I've had a go at a wrapper solution before and it seems to either need a
unique type per sort, or a block stored per element (unstable).

Similar overheads were discussed when an index needs to store a reference
to the parent. There's some work to fix it that makes indices moveable, so
instead of going index.successor() you use collection.next(index).

https://github.com/apple/swift/blob/master/test/Prototypes/CollectionsMoveIndices.swift

*Potential solution:*

The collection interfaces could change like this:

- struct Set<Element: Hashable> {
- struct Set<Element> {
    ...
- public init() { ... }
+ public init<H: Hashable>(elementHasher: Element -> H) {
      ...
    }
- public init(minimumCapacity: Int) { ... }
+ public init<H: Hashable>(minimumCapacity: Int, elementHasher: Element
-> H) {
      ...
    }
    ... (same for other public initialisers)
  }

Then the Hashable specific initialisers added back like this:

+ extension Set where Element: Hashable {
+ public init() { ... }
+ public init(minimumCapacity: Int) { ... }
+ }

*Potential Implementation*

I've had a quick look at `HashedCollections.swift.gyb`, there's currently
two underlying implementations ObjC and Swift Native, it may be possible to
specialise (gyb) this Native implementation into two implementations
(HashableNative, ClosureNative).

The specialisation is probably just a one liner in `_bucket()` and
`hashValue` to use a different method to get the hash.

I don't think this would break any code, except new initialisers that may
need `where Hashable` added. I think it will provide the desired
functionality.

It may even be a fairly straightforward implementation (famous last words),
although I haven't got time to do much more at the moment.

···

On Friday, 19 February 2016, Dave Abrahams via swift-evolution < swift-evolution@swift.org> wrote:

on Thu Feb 18 2016, Jacob Bandes-Storch <swift-evolution@swift.org> wrote:

> Would it make sense for the standard library Set to provide variants (or
> parallel versions of the same data structure) that take custom
hashValue/==
> implementations at init time (functions taking in Elements), rather than
> relying on Hashable/Comparable protocols?
>
> Use case: I want a set of objects that are compared for equality using

> rather than ==. This doesn't seem possible today, using Set, without
> creating some sort of wrapper object.
>
> This particular case would be analogous to using NSHashTable with
> NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is
a
> Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
> generics instead of UnsafePointer<Void>, etc.)
>
> Similarly, C++'s unordered_map
> <http://en.cppreference.com/w/cpp/container/unordered_map> and friends
have
> template parameters specifying the hash function and equality comparator,
> which use std::hash and == by default.

It might make sense. How bad is the wrapper solution for user code?

--
-Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Jacob Bandes-Storch) #10

Thanks to everyone for a great discussion so far.

I'm interested in Joe's idea of private/local protocol conformances. What I
think might work nicely is the ability to provide a specific protocol
witness (which I think is the right terminology for the "bundle of function
pointers") for a conformance requirement of a type parameter. This would be
*part of the type*, so like Joe said, "set1 == set2" wouldn't work unless
they were using the same witness. That's totally fine and would be expected
behavior, as long as the difference between the types is obvious to the
user. I think you'd want to require it be specified explicitly, rather than
inferring it from the existence of some local extension.

Maximilian's point about behaviors makes sense in this context too, and I
think we're kind of getting at the same idea. Instead of a "var behavior",
it's more like a "protocol behavior":

    // All protocol requirements must be specified here; this declaration
basically provides an explicit protocol witness
    private *protocol behavior* pointerEquatable<AnyObject>: Hashable,
Equatable {
        var hashValue: Int { return unsafeAddressOf(self).hashValue }
        func ==(lhs: Self, rhs: Self) -> Bool { return lhs === rhs }
    }

    // This uses the witness provided by pointerEquatable instead of the
default one
    let objs = Set<@pointerEquatable MyClass> = []
    objs.insert(something)

Does this defeat possible optimizations, or cause issues like Joe mentioned
with "as?" lookups?

···

On Fri, Feb 19, 2016 at 11:11 AM, Joe Groff <jgroff@apple.com> wrote:

On Feb 18, 2016, at 2:58 PM, Jacob Bandes-Storch via swift-evolution < > swift-evolution@swift.org> wrote:

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using ===
rather than ==. This doesn't seem possible today, using Set, without
creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with
NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map
<http://en.cppreference.com/w/cpp/container/unordered_map> and friends
have template parameters specifying the hash function and equality
comparator, which use std::hash and == by default.

(Apologies if this has been discussed already; I haven't seen it.)
Jacob

At an implementation level, we already pass a bundle of function pointers
for the Equatable and Hashable conformances as part of the generic calling
convention. In the implementation model, protocol conformances are
independent entities, so technically the same type could conform to the
same protocol multiple times in different ways. We could theoretically let
you provide a scoped private conformance to Hashable overriding the default
one:

func foo() {
  extension Int: private Hashable {
    var hashValue: Int { return myHash(self) }
  }

  // This dictionary would use the local Int: Hashable
  let dict: Foo<Int, Int> = [...]

  use(dict)
}

Now, there would definitely be weird effects to this, since it allows for
types that look the same but are formally different, since the `Key:
Hashable` conformance is an implicit parameter to the type. For example,
this wouldn't work:

func returnMyWeirdDictionary() -> Dictionary<Int, Int> {
  extension Int: private Hashable {
    var hashValue: Int { return myHash(self) }
  }

  // This dictionary would use the local Int: Hashable
  let dict: Foo<Int, Int> = [...]

  // ERROR: dict has type Dictionary<Int, Int> (using local Int: Hashable
conformance),
  // but return type is declared Dictionary<Int, Int> (using default Int:
Hashable conformance)
  return dict
}

It also has interesting interactions with the runtime's support for
protocol conformance lookup via `x as? P` casts—if there's more than one P
conformance for the dynamic type of x, which one should the runtime pick?

-Joe


(TJ Usiyan) #11

What do you mean by "put functions in generics"?

TJ

···

On Thu, Feb 18, 2016 at 11:04 PM, Jordan Rose via swift-evolution < swift-evolution@swift.org> wrote:

On Feb 18, 2016, at 16:09 , Dave Abrahams via swift-evolution < > swift-evolution@swift.org> wrote:

on Thu Feb 18 2016, Jacob Bandes-Storch <swift-evolution@swift.org> wrote:

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using ===
rather than ==. This doesn't seem possible today, using Set, without
creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with
NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map
<http://en.cppreference.com/w/cpp/container/unordered_map> and friends
have
template parameters specifying the hash function and equality comparator,
which use std::hash and == by default.

It might make sense. How bad is the wrapper solution for user code?

struct CustomHashableFoo: Hashable {
  var value: Foo
  func hash() -> Int {
    // custom hash function here
  }
}
func ==(a: CustomHashableWrapped, b: CustomHashableWrapped) {
  // custom == here
}

Really not that bad, although you do have to get 'value' in and out of the
box. It's also not reusable code—you have to rewrite the box for every type.

I'd say you usually *don't* want to allow custom hash/== closures because
(a) then you have to store them somewhere, and (b) the compiler won't
usually be able to inline them away. You also end up with a Set<Foo> that
doesn't behave like a normal Set<Foo>—maybe you can insert
equal-but-not-identical elements—which is bad if anyone's relying on normal
Set-like guarantees.

-1 from me until we can put functions in generics, if ever.

Jordan

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Dmitri Gribenko) #12

How do you compare such sets? That is, what does s1 == s2 mean, if
the two sets are constructed with a hashing closure? We can't even
compare closures to find that they are the same.

Also, how would this affect algorithms on sets? When handling a set,
you essentially wouldn't know the rules according to which it
operates, unless we expose this mapping function.

Dmitri

···

On Fri, Feb 19, 2016 at 2:06 AM, Andrew Bennett via swift-evolution <swift-evolution@swift.org> wrote:

Wrapper Solution?

I've had a go at a wrapper solution before and it seems to either need a
unique type per sort, or a block stored per element (unstable).

Similar overheads were discussed when an index needs to store a reference to
the parent. There's some work to fix it that makes indices moveable, so
instead of going index.successor() you use collection.next(index).

https://github.com/apple/swift/blob/master/test/Prototypes/CollectionsMoveIndices.swift

Potential solution:

The collection interfaces could change like this:

- struct Set<Element: Hashable> {
- struct Set<Element> {
    ...
- public init() { ... }
+ public init<H: Hashable>(elementHasher: Element -> H) {
      ...
    }

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/


(Sean Heber) #13

I constantly run into this and I’m working on a Swift-only game project where I use value types as much as possible - but even there I find myself having to implement Equatable and Hashable on classes all the time just to add an == operator that just calls through to === / ObjectIdentity to get it to the work the way I need it to. Swift does a pretty good job eliminating boilerplate, but this is one area where it seems to stick out.

It might be potentially awesome to be able to do something like this and have the compiler generate an == function for me using the given function:

class MyClass: Equatable(===) {
// etc..
}

But there are likely ramifications to this that I’m not really thinking about.

l8r
Sean

···

On Feb 19, 2016, at 9:13 AM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:

Needing a Set of identical reference objects, i.e. comparing with ===, has come up quite regularly in the projects I have worked on (as well as indexing by identity in a Dictionary). This might be mitigated in Swift due to a more prominent role of value types but I would still expect it to be useful.

-Thorsten

Am 19.02.2016 um 05:39 schrieb Dmitri Gribenko via swift-evolution <swift-evolution@swift.org>:

On Thu, Feb 18, 2016 at 2:58 PM, Jacob Bandes-Storch via >> swift-evolution <swift-evolution@swift.org> wrote:

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

It might make sense, but we should keep in mind that doing this will
prevent inlining of hashValue/== in most cases.

Dmitri

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Nicola Salmoria) #14

Would it make sense to abstract the hash functions into a protocol?

The limit of the Hashable protocol is that it forces a type to commit to a specific ‘canonical’ hash implementation, while there might be more efficient implementations depending on the application.

To truly support composition patterns, it would seem appropriate for the standard library to support decoupling the hashing functionality from the type of the container element.

E.g.

protocol Hash {
    typealias Element

    static func equal(lhs: Element, _ rhs: Element) -> Bool
    static func hashValue(x: Element) -> Int
}

// now we can define a generic box
struct HashableBox<T, H: Hash where H.Element == T> : Hashable {
    let value: T
    
    var hashValue: Int {
        return H.hashValue(value)
    }
}
func ==<T, H>(lhs: HashableBox<T, H>, rhs: HashableBox<T, H>) -> Bool {
    return H.equal(lhs.value, rhs.value)
}

// now we can define a wrapper around Set, using an explicit Hash
struct CustomHashSet<Element, H: Hash where H.Element == Element> {
    // this is an implementation detail; if CustomHashSet was part of the standard library,
    // the .gyb could produce a more efficient implementation
    private var set: Set<HashableBox<Element, H>>
    
    init() {}
    init<S: SequenceType where S.Generator.Element == Element>(_ sequence: S) {
        set = Set(sequence.lazy.map { HashableBox<Element, H>(value: $0) })
    }
    // etc.
}

// Example usage
struct NaiveCollectionHash<C: CollectionType where C.Generator.Element: Hashable>: Hash {
    typealias Element = C
    
    static func equal(lhs: Element, _ rhs: Element) -> Bool {
        guard lhs.count == rhs.count else { return false }
        return zip(lhs, rhs).lazy.reduce(true) { $0 && ($1.0 == $1.1) }
    }
    static func hashValue(x: Element) -> Int {
        return x.reduce(0) { $0 ^ $1.hashValue } // a bad hash just as an example
    }
}

let arrays = [[1],[2]]

// This doesn't work because Array is not Hashable (yet)
var s1 = Set(arrays) // Error

// This works but we need to explicitly specify all the types
var s2 = CustomHashSet<Array<Int>, NaiveCollectionHash<Array<Int>>>(arrays) // ok

// Generic typealiases would help reducing the boilerplate, e.g.
// it would be useful if we could do
typealias NaiveHashSet<T> = CustomHashSet<T, NaiveCollectionHash<T>> // Error
// then Swift's type inference should be able to automatically handle this:
var s3 = NaiveHashSet(arrays) // Error

Nicola

···

Cc:swift-evolution<swift-evolution@swift.org>
Subject:[swift-evolution] Custom equality/hash for Sets
Date:19 February 2016 at 05:04:51 GMT+1

> On Feb 18, 2016, at 16:09 , Dave Abrahams via swift-evolution<swift-evolution@swift.org(mailto:swift-evolution@swift.org)>wrote:
>
> on Thu Feb 18 2016, Jacob Bandes-Storch<swift-evolution@swift.org(mailto:swift-evolution@swift.org)>wrote:
>
> > Would it make sense for the standard library Set to provide variants (or
> > parallel versions of the same data structure) that take custom hashValue/==
> > implementations at init time (functions taking in Elements), rather than
> > relying on Hashable/Comparable protocols?
> >
> > Use case: I want a set of objects that are compared for equality using ===
> > rather than ==. This doesn't seem possible today, using Set, without
> > creating some sort of wrapper object.
> >
> > This particular case would be analogous to using NSHashTable with
> > NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
> > Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
> > generics instead of UnsafePointer<Void>, etc.)
> >
> > Similarly, C++'s unordered_map
> > <http://en.cppreference.com/w/cpp/container/unordered_map>and friends have
> > template parameters specifying the hash function and equality comparator,
> > which use std::hash and == by default.
>
> It might make sense.How bad is the wrapper solution for user code?
> struct CustomHashableFoo: Hashable {
> var value: Foo
> func hash() ->Int {
> // custom hash function here
> }
> }
> func ==(a: CustomHashableWrapped, b: CustomHashableWrapped) {
> // custom == here
> }
Really not that bad, although you do have to get 'value' in and out of the box. It's also not reusable code—you have to rewrite the box for every type.

I'd say you usuallydon'twant to allow custom hash/== closures because (a) then you have to store them somewhere, and (b) the compiler won't usually be able to inline them away. You also end up with a Set<Foo>that doesn't behave like a normal Set<Foo>—maybe you can insert equal-but-not-identical elements—which is bad if anyone's relying on normal Set-like guarantees.

-1 from me until we can put functions in generics, if ever.

Jordan_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Joe Groff) #15

You could provide a default implementation constrained on `AnyObject`:

func ==<T: AnyObject>(x: T, y: T) -> Bool { return x === y }

Then you just need to declare `class MyClass: Equatable` to pick up that implementation for Equatable conformance.

-Joe

···

On Feb 19, 2016, at 7:45 AM, Sean Heber via swift-evolution <swift-evolution@swift.org> wrote:

I constantly run into this and I’m working on a Swift-only game project where I use value types as much as possible - but even there I find myself having to implement Equatable and Hashable on classes all the time just to add an == operator that just calls through to === / ObjectIdentity to get it to the work the way I need it to. Swift does a pretty good job eliminating boilerplate, but this is one area where it seems to stick out.

It might be potentially awesome to be able to do something like this and have the compiler generate an == function for me using the given function:

class MyClass: Equatable(===) {
// etc..
}

But there are likely ramifications to this that I’m not really thinking about.


(Andrew Bennett) #16

Good points Dimitri,

I only wrote this quickly, so please look for other issues too, I don't
expect it to be complete by any means.

I think a way to get an element's hashValue should probably be exposed,
similar to `collection.next(index)` that I mentioned earlier. I think this
is useful, but I don't think it's actually necessary. Equatable is
sufficient to provide Set functionality, Hashable is just for performance

I'm not aware of any algorithms that can't be built on top of the existing
set methods that would require the same hash value the element used for
binning.

I was initially thinking that you could compare the Hashable to determine
equality, but I agree that there is a potential for the closure to convert
to a narrower type.

A much stronger solution would change the interface slightly to require
Element to be `Equatable`, and provide `Element -> Int` as the closure.
It's slightly less flexibly, but I do agree that a narrowing makes
Equatability of s1 == s2 ambiguous.

*In summary*:

- struct Set<Element: Hashable> {
- struct Set<Element: Equatable> {
    ...
- public init() { ... }
+ public init(hashValueForElement: Element -> Int) {
      ...
    }
- public init(minimumCapacity: Int) { ... }
+ public init<H: Hashable>(minimumCapacity: Int, hashValueForElement:
Element -> Int) {
      ...
    }
    ... (same for other public initialisers)
+ public func hashValueForElement(element: Element) -> Int { ... }
  }

+ extension Set where Element: Hashable {
+ public init() { ... }
+ public init(minimumCapacity: Int) { ... }
+ }

···

-
Andrew Bennett

On Fri, Feb 19, 2016 at 7:14 PM, Dmitri Gribenko <gribozavr@gmail.com> wrote:

On Fri, Feb 19, 2016 at 2:06 AM, Andrew Bennett via swift-evolution > <swift-evolution@swift.org> wrote:
> Wrapper Solution?
>
> I've had a go at a wrapper solution before and it seems to either need a
> unique type per sort, or a block stored per element (unstable).
>
> Similar overheads were discussed when an index needs to store a
reference to
> the parent. There's some work to fix it that makes indices moveable, so
> instead of going index.successor() you use collection.next(index).
>
>
https://github.com/apple/swift/blob/master/test/Prototypes/CollectionsMoveIndices.swift
>
> Potential solution:
>
> The collection interfaces could change like this:
>
> - struct Set<Element: Hashable> {
> - struct Set<Element> {
> ...
> - public init() { ... }
> + public init<H: Hashable>(elementHasher: Element -> H) {
> ...
> }

How do you compare such sets? That is, what does s1 == s2 mean, if
the two sets are constructed with a hashing closure? We can't even
compare closures to find that they are the same.

Also, how would this affect algorithms on sets? When handling a set,
you essentially wouldn't know the rules according to which it
operates, unless we expose this mapping function.

Dmitri

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/


(Dave Abrahams) #17

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using ===
rather than ==. This doesn't seem possible today, using Set, without
creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with
NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map
<http://en.cppreference.com/w/cpp/container/unordered_map
<http://en.cppreference.com/w/cpp/container/unordered_map>> and
friends have
template parameters specifying the hash function and equality

comparator,

which use std::hash and == by default.

It might make sense. How bad is the wrapper solution for user code?

struct CustomHashableFoo: Hashable {
  var value: Foo
  func hash() -> Int {
    // custom hash function here
  }
}
func ==(a: CustomHashableWrapped, b: CustomHashableWrapped) {
  // custom == here
}

Really not that bad, although you do have to get 'value' in and out of
the box.

But that's the part I'm worried about.

It's also not reusable code—you have to rewrite the box for every
type.

That's at least partially solvable:

protocol CustomHash {
  associatedtype Base
  static func hash(Base) -> Int
  static func equal(Base, Base) -> Bool
}

struct CustomHashed<T: CustomHash> : Hashable {
  let value: T.Base
  func hash() -> Int { return T.hash(value) }
}

func == <T>(
  _ lhs: CustomHashed<T>,
  _ rhs: CustomHashed<T>
) -> Bool {
  return T.equal(lhs, rhs)
}

// ---

// More compact than CustomHashableFoo at least.
struct MyFooHash : CustomHash {
  static func hash(Foo) -> Int { ... }
  static func equal(Foo, Foo) -> Bool { ... }
}

let s: Set<CustomHashed<MyFooHash>> = [
  CustomHashed(myFoo1), CustomHashed(myFoo2)
]

I'd say you usually don't want to allow custom hash/== closures
because (a) then you have to store them somewhere, and (b) the
compiler won't usually be able to inline them away. You also end up
with a Set<Foo> that doesn't behave like a normal Set<Foo>—maybe you
can insert equal-but-not-identical elements—which is bad if anyone's
relying on normal Set-like guarantees.

Not to mention that we don't know how to implement equality of closures.

If we were going to parameterize Set somehow, it would have to be

  CustomSet<T, H: CustomHash where H.Base == T>

and we'd want the ability to have a default for H when T conforms to
Hashable, which is currently beyond the language.

···

on Thu Feb 18 2016, Jordan Rose <swift-evolution@swift.org> wrote:

On Feb 18, 2016, at 16:09 , Dave Abrahams via swift-evolution >> <swift-evolution@swift.org> wrote:
on Thu Feb 18 2016, Jacob Bandes-Storch > >> <swift-evolution@swift.org >> <mailto:swift-evolution@swift.org>> >> wrote:

--
-Dave


(Dave Abrahams) #18

This is basically
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2098.pdf IIUC.
Just to be clear, that is a generics feature that we're not even certain
we want, and regardless, comes after a *long* list of other generics
features of higher priority.

···

on Fri Feb 19 2016, Jacob Bandes-Storch <swift-evolution@swift.org> wrote:

Thanks to everyone for a great discussion so far.

I'm interested in Joe's idea of private/local protocol conformances.

--
-Dave


(Jordan Rose) #19

Uh, it wasn't particularly thought through, but basically some way to tie the type of the Set to the implementations of == and hash. This could be something like C++ non-type template parameters, or a way to provide a custom conformance ("use this implementation of Hashable instead of the one that comes with the type") a la ML.*

I don't think either of these are near-term features, or possibly even far-term features. It was mostly just a way to say "this belongs in the type system for both correctness and performance reasons".

Jordan

* I don't actually grok <https://en.wiktionary.org/wiki/grok> ML modules yet.

···

On Feb 18, 2016, at 20:23 , T.J. Usiyan <griotspeak@gmail.com <mailto:griotspeak@gmail.com>> wrote:

What do you mean by "put functions in generics"?

TJ

On Thu, Feb 18, 2016 at 11:04 PM, Jordan Rose via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

On Feb 18, 2016, at 16:09 , Dave Abrahams via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

on Thu Feb 18 2016, Jacob Bandes-Storch <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using ===
rather than ==. This doesn't seem possible today, using Set, without
creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with
NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map
<http://en.cppreference.com/w/cpp/container/unordered_map> and friends have
template parameters specifying the hash function and equality comparator,
which use std::hash and == by default.

It might make sense. How bad is the wrapper solution for user code?

struct CustomHashableFoo: Hashable {
  var value: Foo
  func hash() -> Int {
    // custom hash function here
  }
}
func ==(a: CustomHashableWrapped, b: CustomHashableWrapped) {
  // custom == here
}

Really not that bad, although you do have to get 'value' in and out of the box. It's also not reusable code—you have to rewrite the box for every type.

I'd say you usually don't want to allow custom hash/== closures because (a) then you have to store them somewhere, and (b) the compiler won't usually be able to inline them away. You also end up with a Set<Foo> that doesn't behave like a normal Set<Foo>—maybe you can insert equal-but-not-identical elements—which is bad if anyone's relying on normal Set-like guarantees.

-1 from me until we can put functions in generics, if ever.

Jordan

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution


(TJ Usiyan) #20

Thank you for the clarification!

···

On Thu, Feb 18, 2016 at 11:33 PM, Jordan Rose <jordan_rose@apple.com> wrote:

Uh, it wasn't particularly thought through, but basically some way to tie
the type of the Set to the implementations of == and hash. This could be
something like C++ non-type template parameters, or a way to provide a
custom conformance ("use *this* implementation of Hashable instead of the
one that comes with the type") a la ML.*

I don't think either of these are near-term features, or possibly even
far-term features. It was mostly just a way to say "this belongs in the
type system for both correctness and performance reasons".

Jordan

* I don't actually grok <https://en.wiktionary.org/wiki/grok> ML modules
yet.

On Feb 18, 2016, at 20:23 , T.J. Usiyan <griotspeak@gmail.com> wrote:

What do you mean by "put functions in generics"?

TJ

On Thu, Feb 18, 2016 at 11:04 PM, Jordan Rose via swift-evolution < > swift-evolution@swift.org> wrote:

On Feb 18, 2016, at 16:09 , Dave Abrahams via swift-evolution < >> swift-evolution@swift.org> wrote:

on Thu Feb 18 2016, Jacob Bandes-Storch <swift-evolution@swift.org> >> wrote:

Would it make sense for the standard library Set to provide variants (or
parallel versions of the same data structure) that take custom
hashValue/==
implementations at init time (functions taking in Elements), rather than
relying on Hashable/Comparable protocols?

Use case: I want a set of objects that are compared for equality using ===
rather than ==. This doesn't seem possible today, using Set, without
creating some sort of wrapper object.

This particular case would be analogous to using NSHashTable with
NSPointerFunctionsObjectPointerPersonality. (Maybe all I'm asking for is a
Swiftier API for NSHashTable — including ArrayLiteralConvertible, using
generics instead of UnsafePointer<Void>, etc.)

Similarly, C++'s unordered_map
<http://en.cppreference.com/w/cpp/container/unordered_map> and friends
have
template parameters specifying the hash function and equality comparator,
which use std::hash and == by default.

It might make sense. How bad is the wrapper solution for user code?

struct CustomHashableFoo: Hashable {
  var value: Foo
  func hash() -> Int {
    // custom hash function here
  }
}
func ==(a: CustomHashableWrapped, b: CustomHashableWrapped) {
  // custom == here
}

Really not that bad, although you do have to get 'value' in and out of
the box. It's also not reusable code—you have to rewrite the box for every
type.

I'd say you usually *don't* want to allow custom hash/== closures
because (a) then you have to store them somewhere, and (b) the compiler
won't usually be able to inline them away. You also end up with a Set<Foo>
that doesn't behave like a normal Set<Foo>—maybe you can insert
equal-but-not-identical elements—which is bad if anyone's relying on normal
Set-like guarantees.

-1 from me until we can put functions in generics, if ever.

Jordan

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution