SE-0261: Identifiable Protocol

zzt4 · July 15, 2019, 7:37pm

Personally, I was surprised to see a hardcoded id variable as part of the Identifiable protocol. I much prefer the flexibility offered by Paul Hudson's approach. He uses a KeyPath for Identifiable conformance, offering flexibility around the name of the identifier. For a Person, we could use a variable called ssn instead of id, and for a Book we could use isbn. Here's the example from his approach:

protocol Identifiable {
    associatedtype ID
    static var idKey: WritableKeyPath<Self, ID> { get }
}

This could also prevent having to write some computed properties just for the sake of Identifiable. In situations where the id property could cause a conflict, a developer would now have more options.

I understand this could hurt the learning of Identifiable, as developers will need to learn about KeyPaths in order to start using it. Are there some reasonable defaults we can provide to decrease the knowledge space but still leverage this flexibility?

masters3d · July 15, 2019, 9:27pm

I really like this.

protocol Identifiable {
 associatedtype ID: Hashable
 static var idKey: WritableKeyPath<Self, ID> { get }
}
extension Identifiable {
 func idKey() -> ID {
    return self[keyPath: Self.idKey]
 }
}

struct Person: Identifiable {
 static let idKey = \Person.socialSecurityNumber
 var socialSecurityNumber: String
 var name: String
}
let taylor = Person(socialSecurityNumber: "555-55-5555", name: "Taylor Swift")
print(taylor.idKey().hashValue)

sharplet · July 15, 2019, 9:32pm

Why is the key path approach preferable to this?

struct Record {
  var uuid: UUID
}

extension Record: Identifiable {
  var id: UUID { uuid }
}

This seems much more straightforward and understandable to me than using a key path for indirection.

Karl · July 15, 2019, 10:19pm

If we went with KeyPaths, I would prefer if the property was named primaryKey (or, perhaps more controversially, just the key).

ebg · July 15, 2019, 10:53pm

This proposal is definitely getting fileprivate'd, :( K.I.S.S. and +1 to the existing proposal as is.

zzt4 · July 16, 2019, 12:26am

By using a KeyPath, the id can be named anything the developer chooses and pointed to using a KeyPath. This avoids having two properties like in your example. Should your code use the uuid property or the id property? This could lead to a fragmented code base where sometimes the id property is used and other times the uuid property is used.

Your given example using KeyPaths could be written as below:

struct Record: Identifiable {
  static let idKey = \Record.uuid
  var uuid: UUID
}

jawbroken · July 16, 2019, 1:57am

It doesn't avoid having two properties at all, it just makes the second property a confusing keypath. And, in fact, it forces you to have two properties instead of one in the common case where you're happy to just name your ID id.

I really don't like the concept of having all this indirection and confusion just to avoid settling on a name. Would Hashable be better if it used a keypath to find an arbitrary property instead of just having hashValue? Should every standard library protocol introduce some form of indirection for its properties and functions?
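To make the "two properties" point concrete, here is a minimal sketch, assuming the key-path protocol shape quoted earlier in the thread. The protocol name KeyPathIdentifiable and the Message type are hypothetical, chosen only for illustration. Even a type that already names its identifier id still has to declare a static member that redirects to it:

```swift
// Hypothetical name for the key-path-based protocol discussed in this thread.
protocol KeyPathIdentifiable {
    associatedtype ID: Hashable
    static var idKey: WritableKeyPath<Self, ID> { get }
}

// Illustrative type that is already happy to call its identifier `id`.
struct Message: KeyPathIdentifiable {
    var id: Int
    var body: String

    // Still required: a second, static member that only points back at `id`.
    static var idKey: WritableKeyPath<Message, Int> { \.id }
}

let message = Message(id: 7, body: "hello")
// Generic code would read the identifier through the key path.
let identifier = message[keyPath: Message.idKey]
print(identifier)
```

With a plain id requirement, the static member above would not be needed at all.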

QuinceyMorris · July 16, 2019, 2:24am

After thinking about this further, I'm hugely against this proposal.

The default implementation is a loaded weapon set to backfire. It does not take the lifetime semantics of the id values into account, and is just as likely to be wrong about that instead of right. The id might need to be globally unique during the current program execution, or it might need to be globally unique across executions. (This could be fixed by not having a default implementation.)
Putting Identifiable in the standard library invites code to use it, of course. It's trivially easy to imagine (say) two 3rd-party libraries, each requiring Identifiable conformity for the same objects or values, that impose different conformance requirements (such as different associated types, or incompatible identity rules or lifetimes).

Once there are two pieces of code that impose different conformance requirements, the usability of Identifiable breaks down. If those two pieces of code are different 3rd party libraries, the libraries become irretrievably incompatible.

IMO, the dangers of blessing Identifiable as a unique standard far outweigh the benefits.
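The lifetime concern can be sketched as follows, assuming the default implementation in question is the class-only one that derives id from ObjectIdentifier. The Account class is hypothetical. The sketch shows that this kind of identity follows the instance in memory, not the record it represents:

```swift
// Illustrative model class; the name and field are hypothetical.
final class Account {
    let email: String
    init(email: String) { self.email = email }
}

// Two objects describing the same real-world record...
let loaded = Account(email: "a@example.com")
let reloaded = Account(email: "a@example.com")

// ...have different object identities, because ObjectIdentifier tracks
// the instance, not the data it holds.
let sameRecordSameID = ObjectIdentifier(loaded) == ObjectIdentifier(reloaded)
print(sameRecordSameID) // false

// A second reference to the same instance does share the identity.
let alias = loaded
let sameInstanceSameID = ObjectIdentifier(alias) == ObjectIdentifier(loaded)
print(sameInstanceSameID) // true
```

An identity like this is only meaningful while the object is alive, so it cannot serve as an identifier that must survive a reload or a new program execution.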

xwu · July 16, 2019, 2:32am

This would be an argument against including any protocols in the standard library. Vending a conformance to a protocol you don’t own by a type you don’t own is not supported and is liable to break at any time. It’s not currently forbidden by the compiler, but really at some point it should be at least a warning.

QuinceyMorris · July 16, 2019, 3:36am

I think there's a difference between the proposed Identifiable and (say) the existing Equatable.

Equatable is how you provide your type's notion of equality as uniquely required by the standard library itself. Specifically, it's the behavior of == equality, as opposed to other potential equality-like behaviors such as === or ~=.
Identifiable is how you provide your type's notion of identity as required by a particular 3rd party library (or other code "client"). For example, SwiftUI has been stated as a potential adopter of this protocol. How do you square that particular conformance with one in another library?

My argument isn't against including any protocols in the standard library. It is, perhaps, an argument against including any protocols whose meaning isn't also specified (more or less well enough) by the standard library, and whose usage in that meaning isn't privileged by the standard library against other contenders.

FWIW, I think the proposed Identifiable is more a metaprotocol than a protocol. It's a schema of what protocols in a family of identifiability protocols would look like.

RJ_Clegg · July 16, 2019, 7:06am

+1
I'm all for this to be added to the standard library.

Jean-Daniel · July 16, 2019, 8:28am

Looks like you really don't get what hashValue is. Please take the time to learn what it is and how it works.

You can't use hashValue as an identifier, as there is no guarantee that it will be unique. Many completely unrelated values or records can have the same hashValue, and this is perfectly valid and expected.
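A minimal sketch of such a collision, using a hypothetical Point type whose hash(into:) is deliberately weak. Hashable only requires that equal values produce equal hashes, so this conformance is legal:

```swift
// Illustrative type; the name and fields are hypothetical.
struct Point: Hashable {
    var x: Int
    var y: Int

    // Legal but deliberately weak: only `x` feeds the hasher, so points
    // that differ only in `y` always collide.
    func hash(into hasher: inout Hasher) {
        hasher.combine(x)
    }
}

let a = Point(x: 1, y: 2)
let b = Point(x: 1, y: 3)

print(a == b)                     // false: the values are different
print(a.hashValue == b.hashValue) // true: yet the hash values match
```

Collisions between unrelated values can also happen with well-distributed hashes. The weak hasher here just makes one reproducible.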

Jean-Daniel · July 16, 2019, 8:39am

The associated type should not be an issue, as a properly coded library should not impose the identifier type. For instance, in SwiftUI, you can use whatever you want.

zzt4 · July 16, 2019, 12:49pm

Yes, but there’s only one property on the instance of the struct. The other one is static so it can’t be accessed on an instance. So as far as the instance is concerned there is only ever one property. With the proposed implementation some structures will have two properties: the actual property used for identification (e.g. ssn) and a computed property needed for Identifiable. The unfortunate extra step is why I reached out to the community for defaults that would make this easier to use in the majority case.

CTMacUser · July 16, 2019, 1:36pm

(Haven't read the other responses yet. ... Ooh, I just realized I'm too late.)

It looks like Hashable with a layer of indirection. I can see a need where you want to exploit the Equatable/Hashable ecosystem with a type that shouldn't directly model E/H. But I think the concept is more niche than general/fundamental.
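One way to read "Hashable with a layer of indirection" is sketched below. The HasID protocol is a hypothetical stand-in for the proposed protocol, and Photo and indexByID are illustrative names. The model itself is neither Equatable nor Hashable, yet it can still be used with Dictionary through its identifier:

```swift
// Hypothetical stand-in for the proposed protocol, so the sketch is self-contained.
protocol HasID {
    associatedtype ID: Hashable
    var id: ID { get }
}

// A model that deliberately does not conform to Equatable or Hashable.
struct Photo: HasID {
    let id: Int
    var caption: String
}

// Index any array of identified values by their identifier.
func indexByID<T: HasID>(_ items: [T]) -> [T.ID: T] {
    var result: [T.ID: T] = [:]
    for item in items {
        result[item.id] = item // a later item with the same id replaces an earlier one
    }
    return result
}

let photos = [Photo(id: 1, caption: "Beach"), Photo(id: 2, caption: "Hills")]
let byID = indexByID(photos)
print(byID[2]?.caption ?? "missing")
```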

I know it breaks theming, but could the "ID" associated type be renamed to something like "HashProxy"? The word "ID" is bad for a programming name since it's an acronym (outside of psychology). Yes, I know we have things like "HTML" and "URL," but "ID" just looks weird to me. Hmm, maybe "Identifier"?

Proposal evaluation: +0.5
Problem significance: +0
Feel & direction fit: yes
Other languages: n/a
Effort: quick reading

Ponyboy47 · July 16, 2019, 2:28pm

Given the earlier discussions where it was shown that id, identifier, and identity (the 3 suggested attribute names) all have a high potential for collision somewhere in the wild, I think that using a KeyPath to have the id use a developer-defined property name is a good middle ground. All 3 of the suggested attribute names are used in thousands of places and none of them would be guaranteed to meet the requirements for Identifiable in all of their usages.

As the protocol stands today, many people would be defining id as a computed variable that just points to a different property anyways and so using a KeyPath would not be any different for this group of people. Those who have no id property or whose existing id property already meets the requirements would indeed have to add a new static variable which would make their code nearly identical to the people in the first group. Where the KeyPath route really shines is with those whose id property does not meet the requirements of Identifiable. A KeyPath lets the people in this group use the protocol without having to refactor their existing id property.
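The third group can be sketched like this, assuming the key-path protocol shape from earlier in the thread. KeyPathIdentifiable, Ticket, and its fields are hypothetical names. The existing id property is not unique, so the conformance points at a different property and id stays untouched:

```swift
// Hypothetical name for the key-path-based protocol discussed in this thread.
protocol KeyPathIdentifiable {
    associatedtype ID: Hashable
    static var idKey: WritableKeyPath<Self, ID> { get }
}

// Illustrative legacy type whose `id` is a display code reused across tickets.
struct Ticket: KeyPathIdentifiable {
    var id: String        // existing property, not unique, left as-is
    var serialNumber: Int // the value that actually identifies a ticket

    // The conformance redirects to the unique property instead of `id`.
    static var idKey: WritableKeyPath<Ticket, Int> { \.serialNumber }
}

let first = Ticket(id: "VIP", serialNumber: 1001)
let second = Ticket(id: "VIP", serialNumber: 1002)

print(first.id == second.id) // true: the legacy codes clash
print(first[keyPath: Ticket.idKey] == second[keyPath: Ticket.idKey]) // false
```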

As far as the argument to why not use a KeyPath with Hashable and/or Equatable:

Hashable and Equatable tend not to have any collisions in the wild because they have both been around since the beginning of Swift and are used similarly in many languages. Their requirements are typically not overloaded terms or functions with multiple intentions (how weird would it be to define == that doesn't mean equals?). As such, I think comparing using a KeyPath here to using it in Hashable or Equatable is an apples and oranges comparison. Although it is an interesting idea which may be useful in other areas where coming up with a single clear term to mean something does not necessarily fit or where consensus cannot be reached.

zzt4 · July 16, 2019, 3:16pm

I have discovered a way for both the KeyPath style and the simplicity of the proposal as written to co-exist. This offers developers some extra flexibility. Developers can opt into IdentifierProvider if they want the extra flexibility, or stick with Identifiable if they don’t need it. The consumers of the protocol will only want to use the higher-level IdentifierProvider protocol to support a less-specific type. So there could be some challenges where a library might ask for Identifiable when it actually only requires IdentifierProvider. Maybe this is reason enough to not support having two protocols to solve this problem? I think requiring the extra typealiases might be a better trade off than the issues around using two protocols.

protocol IdentifierProvider {
    associatedtype ID: Hashable
    static var idKey: WritableKeyPath<Self, ID> { get }
}

protocol Identifiable: IdentifierProvider {
    // not sure why re-defining ID works here, without it the compiler complains
    associatedtype ID: Hashable
    var id: ID { get set }
}

extension Identifiable {
    static var idKey: WritableKeyPath<Self, ID> { return \Self.id }
}

Now we can write simple identifiables and KeyPath ones:

struct SimplePerson: Identifiable {
    var id: String
}

struct Person: IdentifierProvider {
    static let idKey = \Person.socialSecurityNumber
    var socialSecurityNumber: String
    var name: String
}

We can re-hash the names at any point; I just want to make sure that KeyPath is part of the discussion.
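A sketch of how a consumer might use the two-protocol design, written against the less specific IdentifierProvider so that both styles work. The protocol shapes follow the post above, with Identifiable renamed to the hypothetical SimpleIdentifiable to avoid clashing with any existing declaration of that name. The uniqued(_:) helper is also hypothetical:

```swift
protocol IdentifierProvider {
    associatedtype ID: Hashable
    static var idKey: WritableKeyPath<Self, ID> { get }
}

// Hypothetical rename of the post's `Identifiable`.
protocol SimpleIdentifiable: IdentifierProvider {
    associatedtype ID: Hashable
    var id: ID { get set }
}

extension SimpleIdentifiable {
    // Default: the key path simply points at `id`.
    static var idKey: WritableKeyPath<Self, ID> { \Self.id }
}

struct SimplePerson: SimpleIdentifiable {
    var id: String
}

struct Person: IdentifierProvider {
    static var idKey: WritableKeyPath<Person, String> { \.socialSecurityNumber }
    var socialSecurityNumber: String
    var name: String
}

// Hypothetical consumer: keeps the first element seen for each identifier.
func uniqued<T: IdentifierProvider>(_ items: [T]) -> [T] {
    var seen = Set<T.ID>()
    return items.filter { seen.insert($0[keyPath: T.idKey]).inserted }
}

let people = [
    Person(socialSecurityNumber: "1", name: "A"),
    Person(socialSecurityNumber: "1", name: "B"),
    Person(socialSecurityNumber: "2", name: "C"),
]
let simple = [SimplePerson(id: "x"), SimplePerson(id: "x")]

print(uniqued(people).map { $0.name })
print(uniqued(simple).count)
```

Because the consumer only asks for IdentifierProvider, it accepts both the simple and the key-path style of conformance.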

Karl · July 16, 2019, 3:29pm

I do feel that this is quite a fundamental, well-grounded concept. From Wikipedia:

An object in object-oriented language is essentially a record that contains procedures specialized to handle that record; and object types are an elaboration of record types. Indeed, in most object-oriented languages, records are just special cases of objects, and are known as plain old data structures (PODSs), to contrast with objects that use OO features.

So, going back to basics, the idea of having data structures to represent a Person or House is quite universal. It doesn't really matter if the thing is a struct or a class - the distinguishing feature is that it holds some model information (as opposed to, say, NSProgress, which is primarily used as a communication channel, or NSManagedObjectContext, which manages state).

Keys
A record may have zero or more keys. A key is a field or set of fields in the record that serves as an identifier. A unique key is often called the primary key, or simply the record key. For example an employee file might contain employee number, name, department, and salary. The employee number will be unique in the organization and would be the primary key.

So the notion of having particular keys with unique values is again quite universal among record-types. This is reflected in database programming, where best practice is usually that every table should have a primary key (there are some exceptions, but they are very rare). It is a property of the data you are modelling, not the object; for example, if your data does not intrinsically have any unique keys, you can't just invent one (see the problems we have discussed about the default implementation for classes for an example).

What's more, the name "key" is a term-of-art that we already use in Swift (e.g. KeyPath). Using a KeyPath in the protocol definition is perhaps a little bit ugly, but that's really the only negative thing I can say about it. There is a certain elegance to it from a terminology perspective.

CTMacUser · July 16, 2019, 3:36pm

Something like a post I made a few months ago?

masters3d · July 16, 2019, 4:29pm

@Jean-Daniel , you know Git uses hashes as the commit ID? I edited the example.