Is this a good idea? protocol IdentifiableHashable: Hashable, Identifiable { }

Since id of Identifiable must be unique:

import Foundation // for UUID

protocol IdentifiableHashable: Hashable, Identifiable { }

extension IdentifiableHashable {
    // just the id's hash value is enough, ignore any other properties
    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
    }
}

struct Dossier: IdentifiableHashable {
    let id = UUID()
    let image: String
    let name: String
}

Question: are these two the same?

protocol IdentifiableHashable: Hashable, Identifiable { }
protocol IdentifiableHashable: Hashable & Identifiable { }

This is related to a question I asked a while back about the semantics of Hashable. The short version is basically that a type which used the hash(into:) implementation from IdentifiableHashable to satisfy its Hashable.hash(into:) requirement would not validly conform to Hashable (unless id is the only "essential" component of that type). Notable snippets linked in the thread:

Beyond simplifying hashing, the intent of SE-0206 is to enable Swift to provide certain guarantees about its quality. In particular: as long as hash(into:) feeds enough data to the hasher to unambiguously decide equality, Swift attempts to guarantee that collision attacks won't be possible. For this to work, it is critically important for Hashable implementations to include everything that Equatable.== looks at; and this is especially the case for the basic boundary types that come built-in with Swift, like Data.

While this is not a hard requirement for user code, for boundary types provided in the stdlib/SDK, we require that hashing isn't just consistent with equality, but that it's equivalent to it. The Swift test suite has checks to actively enforce this -- this is possible through repeatedly salting the hash function. "Optimizing" hashing by omitting some of the data compared by == is generally a mistake in Swift, because it completely breaks all guarantees about the strength of hashing, and opens the door to (accidental or deliberate) collision attacks. It's perfectly acceptable to hash a gigabyte of data if someone inserts some large value (such as a big collection) as a key in a hash table. Multi-megabyte String keys are easy to protect against; hidden hashing weaknesses aren't.


Key points I see:

So in general this is not a good idea, because == and hash must be in sync, based on the same properties.
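To make "in sync" concrete, here is a minimal sketch of a hand-written conformance where hash(into:) feeds exactly the fields that == compares. The CachedDossier type and its thumbnailCache field are hypothetical, invented for illustration; the cache stands in for a non-essential component that both operations can safely ignore.

```swift
import Foundation

// Hypothetical type for illustration: `thumbnailCache` is derived data,
// so it is deliberately left out of both == and hash(into:).
struct CachedDossier: Hashable {
    let id: UUID
    var name: String
    var thumbnailCache: [UInt8] = []

    static func == (lhs: Self, rhs: Self) -> Bool {
        lhs.id == rhs.id && lhs.name == rhs.name
    }

    func hash(into hasher: inout Hasher) {
        // Feed exactly the fields that == compares, nothing less.
        hasher.combine(id)
        hasher.combine(name)
    }
}

let plain = CachedDossier(id: UUID(), name: "A")
var cached = plain
cached.thumbnailCache = [1, 2, 3]

let sameValue = (plain == cached)                     // true: the cache is not essential
let sameHash = (plain.hashValue == cached.hashValue)  // true: same fields were hashed
```

If name were compared by == but left out of hash(into:), the conformance would still be consistent, but it would no longer be equivalent to equality in the sense quoted above.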

But for my struct Dossier, the image and name fields are not unique: the same values can repeat. So in this case id is what makes the values unique; without the id field, SwiftUI.ForEach([Dossier]) { ... } would crash.

So what if I add ==:

extension IdentifiableHashable {
    static func == (lhs: Self, rhs: Self) -> Bool {
        lhs.id == rhs.id
    }

    // just the id's hash value is enough, ignore any other properties
    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
    }
}

I'm only using these IdentifiableHashable types for SwiftUI.ForEach. For this narrow use case, is this IdentifiableHashable okay?


Is there a reason that you can't use the compiler-synthesized Hashable conformance, which would hash id, name, and image all together?

No reason, I just wanted to avoid unnecessary work if possible. In my case, Identifiable is what makes a Dossier unique, not the image and name properties, because those values can repeat. A benefit of doing this is that I can very easily make a SwiftUI.View that contains lots of Views Hashable:

I can make my Dossier data struct into a View:

struct DossierView: View, IdentifiableHashable {
    let id = UUID()
    // these two are not Hashable, but we don't need to care
    let image: Image
    let name: Text

    var body: some View {
        ...
    }
}

Since == and hash must be "in sync", it would be nice if the language could enforce this requirement: if you implement hash(into:), you must also implement ==. Otherwise, won't the type break when used as a dictionary key?


Identifiable has nothing to do with Hashable. It is there to expose a property which indicates whether 2 objects (which are not equal), are in fact 2 different versions of the same logical "thing".

So defining == in terms of id alone is the kind of thing you shouldn't do. What this is saying is that 2 Dossier objects with different images and names are interchangeable, and that 2 Arrays which contain Dossiers with the same IDs are equal, even if each array contains entirely different versions of each Dossier.

let someDossiers: [Dossier] = [...]
var otherDossiers = someDossiers
for i in 0..<otherDossiers.count {
  otherDossiers[i].name = randomString()
  otherDossiers[i].image = randomImage()
}
someDossiers == otherDossiers // <- Do you expect this to return 'true'?

In the context of SwiftUI, you're telling the system that if it's already displaying the data from someDossiers, and you update it with a bunch of changes as above, it doesn't need to refresh the display.

Same principle applies to Hashable.
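A runnable sketch of that scenario, assuming a simplified stand-in type (IDOnlyDossier is a hypothetical name, and name is made a var so it can be edited):

```swift
import Foundation

// Hypothetical stand-in for Dossier with id-only == and hash(into:).
struct IDOnlyDossier: Hashable, Identifiable {
    let id: UUID
    var name: String

    static func == (lhs: Self, rhs: Self) -> Bool { lhs.id == rhs.id }
    func hash(into hasher: inout Hasher) { hasher.combine(id) }
}

let original = [IDOnlyDossier(id: UUID(), name: "Alice"),
                IDOnlyDossier(id: UUID(), name: "Bob")]
var edited = original
for i in edited.indices {
    edited[i].name = "renamed \(i)"   // every entry now carries different data
}

let arraysCompareEqual = (original == edited)                   // true
let namesDiffer = (original.map(\.name) != edited.map(\.name))  // true
```

Both flags are true at once: the arrays hold visibly different data, yet anything that relies on == to detect changes would treat them as unchanged.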


Since I'm assigning a new UUID when each Dossier is created, this is not possible: every separately created instance of Dossier is unique, even if image and name are the same. Two values are == only if one is a copy of the other.

I was getting a crash in something like this:

// the source of original data can have identical value, like this:
let dossiers = [Dossier(image: "same", name: "also same"), Dossier(image: "same", name: "also same")]
...
ForEach(dossiers, id: \.self) {
     SomeView($0)
}

It froze for a little while, then crashed, so I made Dossier: Identifiable with let id = UUID(), and this fixed the crash. My guess is that ForEach wants each Data.Element of its content to be unique, and making the elements Identifiable is the easiest way to satisfy this?
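The duplicate-identity problem can be reproduced without SwiftUI. This is only a sketch: PlainDossier is a hypothetical id-less variant, and counting distinct values in a Set stands in for what ForEach sees when each element is used as its own identity via id: \.self.

```swift
import Foundation

// Without an id, two entries with the same user-entered data are the same value.
struct PlainDossier: Hashable {
    let image: String
    let name: String
}

// With an id, the synthesized == and hash(into:) cover id, image, and name.
struct Dossier: Hashable, Identifiable {
    let id = UUID()
    let image: String
    let name: String
}

let plainItems = [PlainDossier(image: "same", name: "also same"),
                  PlainDossier(image: "same", name: "also same")]
let items = [Dossier(image: "same", name: "also same"),
             Dossier(image: "same", name: "also same")]

let distinctSelfIDs = Set(plainItems).count   // 1: both rows share one identity
let distinctIDs = Set(items.map(\.id)).count  // 2: every row has its own id
```

Two rows but only one distinct identity is the situation that the added id field avoids.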


It is almost always guaranteed to be a bug to conflate Identifiable and Equatable. Even in SwiftUI, Identifiable is used to note the same item, potentially with different states (old/new states), while Equatable is used to note whether the two states are functionally indistinguishable (no need to redraw).

ID should be unique. Or for init(_:content:), Data.Element.ID should be unique. If items at different indices are different items (and each item is at a fixed index), then the index is your identifier.


Okay, so this IdentifiableHashable and its defaults are a bad idea.

I'll just get rid of my IdentifiableHashable, conform to Hashable and Identifiable directly, and use the synthesized == and hash.

Yes, this is the case

Does it make any difference if I use UUID? This way, I can just assign a new UUID internally when I init a Dossier. If not, I'll have to pass the index value in to Dossier.init().

It's a big fat "it depends". The question is: what makes two Dossiers refer to two different entries? If I am given two Dossier values, how do I know that you're referring to two different items, and not just two states of the same item? Can the objects be moved outside of the array containing them? Would you still need to keep track of those objects? Those are some of the cases where UUIDs and plain old indices start to differ.

While I don't really have any useful opinions on the main point you raised in your post, I can answer this question. 🙂

The answer is yes. A single inheritance clause with a protocol composition is equivalent to multiple inheritance clause entries where each one is a single protocol.

There's also an interesting, if useless, bit of trivia here. Could we hypothetically deprecate the , syntax for writing out multiple inheritance clause entries and always have a single one which might be a protocol composition? The answer is almost, but not quite, unfortunately. The one exception is that enums can declare a RawRepresentable conformance by "inheriting" from their raw type, e.g.:

enum Foo : Int, SomeProtocol {
  case a = 1
}

Since Int is neither a protocol nor a class, it cannot appear in a protocol composition, so it would not be valid to write enum Foo : Int & SomeProtocol.
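A small sketch of both points, using hypothetical names (CommaSpelling, AmpersandSpelling, Entry, and the describe helpers are invented for illustration):

```swift
// The two protocols differ only in how the inheritance clause is spelled.
protocol CommaSpelling: Hashable, Identifiable { }
protocol AmpersandSpelling: Hashable & Identifiable { }

// One type satisfies both with the same members.
struct Entry: CommaSpelling, AmpersandSpelling {
    let id: Int
}

// Generic helpers constrained by either spelling can use the same requirements.
func describe<T: CommaSpelling>(_ value: T) -> String { "id \(value.id)" }
func describeToo<T: AmpersandSpelling>(_ value: T) -> String { "id \(value.id)" }

// The raw-type exception: Int has to stay a separate, comma-separated entry.
protocol SomeProtocol { }
enum Foo: Int, SomeProtocol {
    case a = 1
}
```

Both helpers accept an Entry, and Foo still gets its RawRepresentable conformance from the Int entry in its inheritance clause.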


I think UUID, since it's "universally unique" (what does this mean? unique everywhere or within this device?), is generally good to use as an id.

Whereas an array index is only unique within its array: it's only good within the scope of that array, and only as long as the array doesn't change. If elements can be added or deleted, then the index is not a good id.

I will just use UUID for id.

If you have a data source that you can use to identify data (like the username of the dossier owner), you can also use that. It may be easier to keep track of than a UUID, idk 🤷‍♀️.

It's just a random 128-bit number*. You can usually assume uniqueness, even across devices. Say you generate 1_000_000_000_000 entries: the chance of a collision would sit at around 0.00000000001%.

OTOH, it is quite overkill for small use cases.

* The version used by Swift (Version 4) has only 122 bits of actual information, I believe.
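That figure can be sanity-checked with the usual birthday-bound approximation, p ≈ n(n - 1) / (2 · 2^bits), which holds while p is much smaller than 1. This is a rough sketch: the function name is made up, and the 122 random bits are taken from the footnote above.

```swift
import Foundation

// Birthday-bound approximation for the chance of at least one collision
// among `n` values drawn uniformly from 2^randomBits possibilities.
// Only meaningful while the result is much smaller than 1.
func approximateCollisionProbability(count n: Double, randomBits: Double) -> Double {
    n * (n - 1) / (2 * pow(2, randomBits))
}

let p = approximateCollisionProbability(count: 1e12, randomBits: 122)
let percent = p * 100
print("approximate collision chance: \(percent)%")
```

For a trillion entries this gives a probability a little under 1e-13, which is on the order of the 0.00000000001% quoted above.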

Everything else is created by the user and they can enter identical data. So those are not unique.

The ids are used for:

  1. persisting records via CloudKit and loading them on different devices
  2. local notifications, and navigating to show the matching record

UUID seems to be the best option.