Protocol as Dictionary Key or Set element

Can someone explain to me the problem of having a protocol that implements Hashable as a dictionary key or as the element of a set?

like

protocol MyProtocol: Hashable {
    var id: Int { get }
}

Any type that implements MyProtocol will have to implement Hashable, therefore Set or Dictionary will have access to hash(into:) or Equatable overrides.

There's no problem having a protocol that refines Hashable.

I guess my problem description was not clear. My Bad.

protocol MyProtocol: Hashable {
    var id: Int { get }
}

var dict = [MyProtocol: MyObject]() // This is not allowed

But why it's not allowed? I had work with C# for many years, and I can do this with Interfaces.

1 Like

Ah, sorry, my bad, I misread the question.

I believe it boils down to this:

struct Apple: MyProtocol {}
struct Orange: MyProtocol {}
Apple() == Orange() // can't do that in Swift

I know that it is something other languages allow (the result of comparison of instances of different types is false (obviously)), it's just not something that Swift allows.


If you really want to compare apples and oranges you kind of could
extension Equatable {
    func isEqual(to other: any Equatable) -> Bool {
        if let otherSelf = other as? Self {
            self == otherSelf
        } else {
            false
        }
    }
}

infix operator ==== : ComparisonPrecedence
infix operator !=== : ComparisonPrecedence

func ==== (lhs: any Hashable, rhs: any Hashable) -> Bool {
    lhs.isEqual(to: rhs)
}

func !=== (lhs: any Hashable, rhs: any Hashable) -> Bool {
    !(lhs ==== rhs)
}

Usage:

struct Apple: Hashable { var id: Int }
struct Orange: Hashable { var id: Int }

let apple = Apple(id: 1)
let apple2 = Apple(id: 2)
let orange = Orange(id: 1)

precondition(apple ==== apple)
precondition(apple !=== apple2)
precondition(apple !=== orange)

You can use AnyHashable for heterogeneous dictionaries/sets.

It doesn't support additional constraints, so you'd need to make your own AnyMyProtocol type if you need stricter enforcement that all of the types conform to your protocol.

The reason is that Swift does not support protocol conformances for structural types. We cannot express that a tuple of Equatable elements is Equatable, and similarly we cannot express that an existential wrapping an Equatable value can itself be an Equatable value.

So you need to create a nominal type which does the same thing (e.g. Tuple3<X, Y, Z>, or AnyMyProtocol), because Swift does let you write protocol conformances for those.


Edit: That being said, there are significant usability issues with allowing 'existential self-conformance'. While it is useful to have the capability, it is almost always preferable to use the wrapped type's own conformance instead of the existential's one. For example, if you have an array of integers wrapped in an existential, you still ideally want generic code to use Array<Int>'s conformance to Collection, not any Collection<Int>'s one.

But that can be an issue when they behave differently. For example, an existential any Hashable should not just pass through its wrapped value's hash value unchanged -- it should mix in the dynamic type of the value, to prevent hash collisions. But will it always be clear which conformance is being used, and will it always be clear to developers which conformance they need?

x.hashValue == (x as any Hashable).hashValue // false
6 Likes

In addition to what has been said, maybe an explanation from a more conceptual angle.

Protocols are not types. This sounds obvious, but there's more weight to it than what especially newcomers to Swift can see.
Part of the problem is that the syntax (still) allows protocols to be used where usually types go, e.g. when you declare a variable: var someVariable: MyProtocol. This makes the protocol look like a type, because, well, it's used where usually you'd expect a type identifier. But, as said, a protocol isn't a type, so what type does someVariable have then? The more modern syntax makes it clear: var someVariable: any MyProtocol. Yes, that any MyProtocol is a type, we call that an existential. It's basically a box to erase whatever concrete type a value you assign to this variable has.

Now, the dictionary you try to create in your example has the type [MyProtocol: MyObject]. Generics require a concrete type to be given for their type parameters (called Key and Value, in this case).
However, MyProtocol is not a type, so if all this were to work like it does for variables, the actual type of our dictionary would be [any MyProtocol: MyObject]. So far, so good, but now we run into a problem. Dictionary constrains the Key type to Hashable, meaning any type that is passed for this type parameter has to conform to Hashable. And while the protocol MyProtocol does refine Hashable, that only means that any type conforming to MyProtocol also needs to conform to Hashable.
Alas, the type any MyProtocol does not conform to Hashable. In fact, it doesn't even conform to MyProtocol.

You might think "well, but it should conform!", but that's easier said than done. How would it do that? Back when I learned about this I immediately thought "It should just defer all the protocol logic to whatever type it wraps", i.e. use the instance that's boxed in it and call methods and properties of that to satisfy its own conformance to the protocol.
However, that opens a terrible can of worms and Hashable is a great example of this. any MyProtocol can hold heterogenous types. What if the Hashable implementations in this are incompatible, i.e. result in lots of has collisions? This is admittedly convoluted here, but the point is that there is no easy, "right" way to do this.

Ideally we would be able to explicitly conform any MyProtocol to protocols (including MyProtocol itself), but I guess at this point it's perhaps easier to just define your own type erasing helper type like @Karl mentioned.

10 Likes

it should just work :-)

Great backstory, thank you.

BTW, this reasoning alone is not quite strong... we could equally say "let's not have a customisable hash implementation possible at all, as it could result in lots of collisions if done improperly" (or actually worse: it could result in a broken hash/EQ invariant if done improperly).


To circle back why it works in C# but not Swift: in C# Dictionary, etc call Equals(object), not Equals(Key). The equivalent of the latter in Swift is "==". We don't have the equivalent of the former, the closest thing would be NSObject's isEqual.

3 Likes

Although I never find myself wanting to do this, and I think I understand conceptually why it doesn't work, in Swift, this was also not enough to help me understand why it does work in C# (maybe understanding the difference would help the OP with the Swift):

C# – compiles:

interface I { }
struct S1: I { }
struct S2: I { }
Dictionary<I, int> dictionary = new() {
  { new S1(), 1 },
  { new S2(), 2 }
};

Swift – doesn't compile:

protocol P: Hashable { }
struct S1: P { }
struct S2: P { }

// Heterogeneous collection literal could only be inferred to '[AnyHashable : Int]'; add explicit type annotation if this is intentional
let dictionary = [S1(): 1, S2(): 2]

Just consider Obj-C version of that, if you know it better: everything is NSObject, everything has hash and isEqual, you could compare everything with everything even when that's of a different type (the result of that could be surprising, e.g. [a isEqual:b] could be a completely different implementation than [b isEqual:a]). I guess the same holds for C#.

I don’t think that this sounds obvious in terms of how the code is perceived by (probably) majority of Swift users. Even knowing that and enforcing any isn’t really helping in day-to-day use. For me there is a still open question of a good mental model around existentials so that you don’t keep reminding yourself that this is not a type.

I might not have been clear enough, but I wasn't simply talking about a programmer error or a bug.
Hashable in the context of its constraint for the Key type parameter of Dictionary has a certain semantic: "This one type, which is to be used as a key, provides its own hash implementation you, the dictionary, can use to implement your key-value mapping". This is basically what a protocol is for: It provides generics with information about as of yet not known, concrete, singular types. If the generic has requirements for several types, it needs several type parameters with constraints.
Once you use an existential, it opens the door of allowing relations, also constraints, if you will, to "sneak in" between types that are, from the point of view of the generic type, unrelated.
The "lots of collisions" wasn't alluring to a bad hash algorithm being used. It would be possible to have two types, both conform to Hashable in a proper way, but whose hashes are only valid (i.e. unique) over all instances of the type. This would mean using them as Key in Dictionary individually would be totally fine, but once you circumvent homogenity for a concrete dictionary instance by allowing sets of objects of both types to serve as keys, you could get collisions.

What the generic, Dictionary in this case, wants to express is "I want one type to be used for the keys and this type shall have a proper Hashable conformance". That's why any Hashable does not work: It simply doesn't fit what the dictionary type expresses. Or, if you want to phrase it the other way around: any Hashable does not express "Types that have a proper Hashable implementation working over all of them", it expresses "Types that each individually conform to Hashable".
I mean, of course that's a design choice by Swift and it could mean the first thing, but imo that would be a bad choice and probablt require a completely different understanding of protocols and existentials in general. In a single module you might not even "know" all possible types that conform to a protocol, how would you enforce what it means that all of them have to play nicely together? Often times that's an internal implementation detail of the types.
Also, keep in mind that what this means depends on the protocol. For Equatable, for example, you might decide that it's generelizable enough to define what it means if objects of two different types are equal (probably "no, they aren't"), but for other protocols that's a lot less likely.

Yeah, it is not an easy concept and I don't know how this can be made more approachable. But I think that we can still simply write the protocol name to denote an existential is unfortunate. It definitely doesn't help to distinguish the concepts, though, so I hope that in some future version of Swift, any is enforced, maybe.

4 Likes

This is very true... Just why is that (an increased chance of hash collisions when storing heterogeneous dictionary keys or set elements) not a problem in C# and Objective-C?


As a side note: couldn't we choose hash salts per type, so that:

struct A: Hashable { var x = 1 }
struct B: Hashable { var x = 1 }
A().hashValue != B().hashValue // not current Swift

That approach does not work correctly in the presence of subclasses:

class A: Hashable {
  static func == (lhs: A, rhs: A) -> Bool { true }
  func hash(into hasher: inout Hasher) {}
}

class B: A {}
class C: A {}

let b = B()
let c = C()

print(b == c)   // true
print(b ==== c) // false

I'd say that B is not ==== C is quite rightful thing to do...


Equatable doesn't play well with classes in the first place...

class A: Equatable {
    var x = 1
    static func == (lhs: A, rhs: A) -> Bool {
        lhs.x == rhs.x
    }
}
class B: A {
    var y = 2
    static func == (lhs: B, rhs: B) -> Bool {
        lhs.x == rhs.x && lhs.y == rhs.y
    }
}
class C: A {
    var z = 3
    static func == (lhs: C, rhs: C) -> Bool {
        lhs.x == rhs.x && lhs.z == rhs.z
    }
}

print(B() == C())   // true?!

IMHO the greatest issue is having == (all operators actually) being static methods instead of instance (like in C++).

Because they implement their key-value collection types differently?
I'm not arguing against different approaches or that for the specific case of a generic dictionary-like type there might be designs that allow heterogenous keys. I was focusing on the underlying issue of specifically using protocols for Keys in dictionaries (or as Set elements, which has the same underlying problem).
As I tend to do, I try to focus on mental models to understand what certain Swift syntax means. What is actually expressed with them.
In this case I see that the syntax lends itself to be understood in a way that clashes with what it actually means and I wanted to point out how that comes. This then goes beyond dictionaries, hashability, or equatability. It's a result of how protocols and existentials work.

Of course there is no doubt that an easier way to express heterogenous keys for Dictionary is, in a way, a casualty of Swift's general syntax in this case, but personally I believe that was a good tradeoff and the reason for it can be understood easily enough.
A custom type-erasure AnyMyProtocol type can better express that then, or a custom key-value collection type that handles this differently perhaps.

1 Like

You can use AnyHashable in Swift, so to be fair it is not a the problem in Swift, but just one of the perspectives.

The problem in Swift is that protocols just built differently comparing to Java/C#, for example. ObjC probably out of the picture here because it inherently has more dynamic approach to this.

I think we all get confused because for years everyone has been using protocols as types freely (with just few restrictions), and even right now with introduction of any, it is not being enforced (really miss that this has been postponed), and when you do enable it as a feature — it still has less sense than could’ve been if that spelling was from the start.

1 Like

Thank you!

interface MyHashable1 {
	abstract bool instanceEqual(MyHashable1 rhs);
	abstract int hashValue();
}
interface MyHashable2 {
	static abstract bool staticEqual(MyHashable2 lhs, MyHashable2 rhs);
	abstract int hashValue();
}

class Program {
    static void Main() {
    	var a = new Dictionary<string, MyHashable1>(); // ✅
	    var b = new Dictionary<string, MyHashable2>(); // 🛑 Error interface 'MyHashable2' cannot be used as type argument
	}
}
2 Likes

Have a look at this "existential oriented" dictionary design:

existential oriented dictionary
extension Equatable {
    func isEqual<T: Equatable>(_ other: T) -> Bool {
        guard let other = other as? Self else { return false }
        return self == other
    }
}
struct FooDictionary {
    var items: [(any Equatable, any Hashable)] = []
    
    func containsValue(_ value: any Hashable) -> Bool {
        // simple imlpementation, ignores hashValue
        for v in items {
            if v.1.isEqual(value) {
                return true
            }
        }
        return false
    }
    
    subscript(_ key: any Equatable) -> (any Hashable)? {
        get {
            // simple imlpementation, ignores hashValue
            for v in items {
                if key.isEqual(v.0) {
                    return v.1
                }
            }
            return nil
        }
        set {
            // simple imlpementation, ignores hashValue
            for (index, v) in items.enumerated() {
                if key.isEqual(v.0) {
                    if let newValue {
                        items[index] = (key, newValue)
                    } else {
                        items.remove(at: index)
                    }
                    return
                }
            }
            if let newValue {
                items.append((key, newValue))
            }
        }
    }
}

func testFoo() {
    struct Test: Hashable {
        let x: Int
    }
    struct Other: Hashable {}

    let a = Test(x: 1)
    let b = Test(x: 2)
    var dictionary = FooDictionary()
    dictionary["a"] = a
    dictionary["d"] = Other()
    print(dictionary.containsValue(b))   // false
    print(dictionary["b"] != nil)   // false
    dictionary["b"] = b
    print(dictionary.containsValue(b))   // true
    print(dictionary["b"] != nil)   // true
}
testFoo()

Its items are "hardcoded" to any Equatable and any Hashable, so it doesn't quite reach the goal of restricting its values to some MyProtocol.