Being hashable lets us compare objects and value types without explicitly declaring equality for every conforming type. Losing the ability to just check the hash to see whether objects and values are equal weakens the case for having it in the standard library.
Checking the hash tells you if objects are different, not if they are equal. Hash equality doesn't imply object equality.
That's why `Hashable` inherits from `Equatable`. You must be able to compare the objects to check whether they are equal when their hashes are equal.
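A minimal sketch of that point, assuming a hypothetical `Point` type whose `hash(into:)` is deliberately weak so that a collision is easy to produce. Nothing here comes from the proposal; it only illustrates why `==` is still needed when hashes match.

```swift
// Hypothetical type for illustration: only `x` feeds the hasher.
struct Point: Hashable {
    var x: Int
    var y: Int

    static func == (lhs: Point, rhs: Point) -> Bool {
        lhs.x == rhs.x && lhs.y == rhs.y
    }

    // Legal but weak: points that share `x` always collide.
    func hash(into hasher: inout Hasher) {
        hasher.combine(x)
    }
}

let a = Point(x: 1, y: 2)
let b = Point(x: 1, y: 3)

// Same hash within one process run, yet the values are not equal.
let sameHash = a.hashValue == b.hashValue
let equal = a == b

// Set keeps both elements because it falls back to == when hashes collide.
let points: Set<Point> = [a, b]
```

Different hashes prove two values are different; equal hashes prove nothing on their own, which is why the `Set` above still holds two elements.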
How about a case where we design an `ID` like this? The `ID` has to be `Comparable` and doesn't have to be `Hashable`. I don't want to think about what the best implementation of `hash(into:)` for the `ID` is just to silence the compiler. Of course, we can add `: Hashable` to the `ID` to synthesize `hash(into:)` automatically. However, when we provide a library that includes the `ID` and users of the library also want to use the `ID` as keys of hash tables, they cannot provide the best implementation of `hash(into:)` for the `ID` in place of the automatically synthesized one.
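As a rough sketch of the situation described above, assume a hypothetical library type `PackedID` that wraps a 64-bit value. The type name, the `rawValue` property, and the bit layout mentioned in the comment are all assumptions for illustration.

```swift
// Hypothetical library type; the layout described below is assumed, not taken from any real ID scheme.
struct PackedID: Comparable {
    // Assumed layout: timestamp in the high bits, sequence number in the low bits.
    let rawValue: UInt64

    static func == (lhs: PackedID, rhs: PackedID) -> Bool {
        lhs.rawValue == rhs.rawValue
    }

    static func < (lhs: PackedID, rhs: PackedID) -> Bool {
        lhs.rawValue < rhs.rawValue
    }
}

// Once the declaring module adds this conformance, client code cannot
// replace hash(into:) with an implementation it considers better.
extension PackedID: Hashable {
    func hash(into hasher: inout Hasher) {
        hasher.combine(rawValue)
    }
}

let ids: Set<PackedID> = [PackedID(rawValue: 2), PackedID(rawValue: 1), PackedID(rawValue: 2)]
let smallest = ids.min()
```

Whether the hashing choice matters in practice depends on the hasher. The sketch only shows that the decision sits with whoever declares the conformance.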
It may seem like a corner case, but we cannot foresee everything. I think it is more prudent to choose the minimal requirement that we really need.
I don't understand. The IDs mentioned in this link are 64-bit integers.
The Instagram link is just a 64-bit number that could trivially be `Hashable`. Sure, you can always say that almost no protocol should inherit from any other, because users can just require conformance to both protocols in a typealias or generic signature (e.g. `Hashable` itself doesn't technically need to imply `Equatable`; you could make everyone write `Equatable & Hashable`), but that puts an extra burden on users. So you need to weigh up the complexity of using the protocol against the complexity of conforming to it. In this case, I don't feel like `Hashable` conformance is a significant burden on top of `Equatable` conformance, especially because both can be synthesised in basically the same way, and it greatly increases the flexibility of the ID.
This will not change. Struct instances themselves will continue to not have identity. What `Identifiable` does is recognize that many struct instances represent a snapshot of the state of an entity which does have a persistent identity. The `id` property correlates the snapshot with the entity and allows it to be distinguished from snapshots of the state of other entities.
This makes no sense whatsoever to me. The identity provided for by `Identifiable` often will not be object identity. Several commenters have suggested dropping the default `where Self: AnyObject` in order to emphasize that this protocol is not strictly about object identity.
Agree
This is an interesting possible direction.
This proposal intentionally does not address this problem. There are a number of ways to approach the problem. All of the ways I'm aware of are compatible with the proposal in its current form.
The reason is that `Hashable` is very commonly useful when working with identity values, and it is difficult to imagine a type that can be used as an identifier for which a `Hashable` conformance is an onerous requirement. Placing the requirement on the protocol eliminates the need to write out the constraint at a lot of usage sites.
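To make the usage-site argument concrete, here is a small sketch comparing the two designs. The protocols `LooseRecord` and `KeyedRecord`, the helpers `indexLoose` and `indexKeyed`, and the `User` type are hypothetical stand-ins, not the proposed declarations.

```swift
// Hypothetical stand-ins for the two designs being discussed.
protocol LooseRecord {
    associatedtype ID: Equatable   // minimal requirement
    var id: ID { get }
}

protocol KeyedRecord {
    associatedtype ID: Hashable    // requirement placed on the protocol
    var id: ID { get }
}

// With the minimal design, each dictionary-based helper repeats the constraint.
func indexLoose<R: LooseRecord>(_ records: [R]) -> [R.ID: R] where R.ID: Hashable {
    Dictionary(records.map { ($0.id, $0) }, uniquingKeysWith: { _, last in last })
}

// With Hashable on the associated type, the where clause disappears.
func indexKeyed<R: KeyedRecord>(_ records: [R]) -> [R.ID: R] {
    Dictionary(records.map { ($0.id, $0) }, uniquingKeysWith: { _, last in last })
}

struct User: LooseRecord, KeyedRecord {
    let id: Int
    var name: String
}

let users = [User(id: 1, name: "Ada"), User(id: 2, name: "Grace")]
let byIDLoose = indexLoose(users)
let byIDKeyed = indexKeyed(users)
```

Both helpers behave identically; the difference is only in how much each signature has to spell out.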
`Comparable` can be useful on identifiers for technical purposes such as tree storage. But identifiers usually do not have any inherent, semantically meaningful notion of ordering. For this reason it is reasonable to omit a `Comparable` conformance on an identifier type if it is known that `Comparable` is not necessary (for use in trees, etc.).
The associated type is essential. Different identifier types are necessary in different contexts. In particular, many people prefer to use strongly typed identifiers which prevent incorrect use of, for example, an `ID<Person>` where an `ID<Company>` is required. I have seen production bugs caused by this kind of accidental misuse.
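One common way to get such strongly typed identifiers is a phantom type parameter. The following is a minimal sketch under that assumption. The names `ID<Entity>`, `rawValue`, `Person`, `Company`, and `findPerson` are hypothetical and not part of the proposal.

```swift
// Hypothetical phantom-typed identifier: `Entity` is never stored,
// it only distinguishes ID<Person> from ID<Company> at compile time.
struct ID<Entity>: Hashable {
    let rawValue: Int
}

struct Person {
    let id: ID<Person>
    var name: String
}

struct Company {
    let id: ID<Company>
    var name: String
}

func findPerson(_ id: ID<Person>, in people: [Person]) -> Person? {
    people.first { $0.id == id }
}

let people = [Person(id: ID(rawValue: 1), name: "Ada")]
let acme = Company(id: ID(rawValue: 1), name: "Acme")

let found = findPerson(ID(rawValue: 1), in: people)
// findPerson(acme.id, in: people) is rejected by the compiler,
// even though both raw values are 1.
```

The raw values can coincide across entity kinds without the identifiers ever being interchangeable.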
The point you raise about existentials is a well known language limitation. There has been enough discussion about lifting it recently that I am optimistic it will happen before too long (even if it takes more time to fully flesh out constraints on existentials). Instead of changing the design of the protocol we should focus on lifting the language limitation so that the protocol would be viable for Combine.
There is no way for a protocol to prevent this. If it were possible to define a protocol requirement `let id: ID`, that would have been considered, but unfortunately it isn't (yet).
As soon as the above mentioned language limitation around existentials is lifted this will no longer be an issue. Is there anything you can do now that would position Combine to adopt the protocol in the future when the language limitation is lifted (or at least leave that door open)?
The code sample you posted above isn't actually relying on existentials - it uses a generic constraint. If that is representative of what you need to do, maybe you could do this:
    // or extension Foo, and possibly private in either case
    func hasBeenSeen<F: Foo>(_ item: F) -> Bool {
        let id = AnyHashable(item.id)
        guard !seenItems.contains(id) else {
            return true
        }
        seenItems.insert(id)
        return false
    }
It's possible that this wouldn't meet your performance requirements or that you really do need an existential elsewhere. But at least in this example you're not facing a hard limit in the type system.
What is the reason this signature chooses to use an existential instead of generics?
This review has been supremely useful in clarifying the semantics of the proposed protocol. It's not enough to say that the protocol is "not strictly" about object identity; as @Karl clarifies, it's in fact strictly not about object identity. Unless I'm again mistaken, the only type for which identity as defined by this protocol would be semantically coincident with object identity would be one where the state that's modeled is machine memory itself.
The reason that I (and I'm guessing others) have misunderstood the proposal is the proposed default implementation. By declaring that the default identity for the purposes of `Identifiable` is the object identifier for all reference types, the proposal yokes the two concepts together. Your statement that "the identity provided for by `Identifiable` often will not be object identity" is entirely a repudiation of that default implementation; the two simply cannot be reconciled.
My opinion is not to add `Comparable` to the requirements of `ID`. I think wanting `ID` to be `Hashable` also serves a technical purpose. We want it because we have `Dictionary` in the standard library, which is implemented using hash tables and is common in Swift now. If it were common in Swift to use dictionaries implemented using trees, as in Haskell, would you propose that `ID` should be required to conform to `Comparable` for convenience?
> I don't understand. The IDs mentioned in this link are 64-bit integers.
> The Instagram link is just a 64-bit number that could trivially be `Hashable`.
What I meant was:
- The IDs have information encoded into 64-bit data.
- The data have different characteristics from 64-bit integers.
- So the IDs have a different optimal implementation for uniformity.
- Just adding `: Hashable` does not necessarily synthesize the optimal one.
- In some cases, like the one I referred to, adding `: Hashable` to silence the compiler prevents users from providing the optimal implementation.
In practice, I think handling the IDs as 64-bit integers and adding `: Hashable` to synthesize `hash(into:)` automatically works well enough. But if `ID` does not have the `Hashable` requirement, we don't have to care about it at all. As I said at the start, I have no concrete cases in mind where the `Hashable` requirement becomes a problem in practice, so I know it is an awkward example. But I am not sure that there are really no cases where adding `Hashable` becomes a problem, because I can't foresee everything.
It is a kind of minimalism. I don't want to make types conform to `Hashable` needlessly. I prefer being minimal because an unnecessary requirement may cause problems in the long term. `Identifiable.ID` in SwiftUI is OK. But the standard library is universal. I want it to be kept minimal.
Thank you all for continuing to try to understand what I mean despite my poor English.
> My opinion is not to add `Comparable` to the requirements of `ID`. I think wanting `ID` to be `Hashable` also serves a technical purpose. We want it because we have `Dictionary` in the standard library, which is implemented using hash tables and is common in Swift now. If it were common in Swift to use dictionaries implemented using trees, as in Haskell, would you propose that `ID` should be required to conform to `Comparable` for convenience?
If `Comparable` were a pervasive constraint used for purely technical reasons in Swift, then that would certainly be a consideration. But it isn’t, so that’s a hypothetical.
One important distinction is that `Hashable` is always about technical concerns, whereas `Comparable` often has semantic meaning in the non-technical domains that are represented by our data.
> It is a kind of minimalism. I don't want to make types conform to `Hashable` needlessly. I prefer being minimal because an unnecessary requirement may cause problems in the long term. `Identifiable.ID` in SwiftUI is OK. But the standard library is universal. I want it to be kept minimal.
From the perspective of minimalism, if you don’t need `Identifiable` you will not be adding that conformance either. If you do need it, there must be a reason. What generic code do you want to write with an `Identifiable` constraint where the `ID` is only required to conform to `Equatable` but not `Hashable` (or `Comparable` if you replace hash tables with trees)?
How about the following example?
    protocol Identifiable {
        associatedtype ID: Equatable
        var id: ID { get }
    }
    struct Table<Value> {
        ...
    }
    // when the table has a primary key
    extension Table where Value: Identifiable, Value.ID: Comparable {
        func selectedValue(wherePrimaryKeyIs key: Value.ID) -> Value? {
            ...
        }
        ...
    }
> One important distinction is that `Hashable` is always about technical concerns, whereas `Comparable` often has semantic meaning in the non-technical domains that are represented by our data.
I agree with it. So `associatedtype ID: Hashable` is permissible for me while `associatedtype ID: Comparable` is not, although I prefer `associatedtype ID: Equatable`, because `Identifiable` semantically means just that values can be identified by their `id`s, and equality of `id`s is enough for those semantics.
> How about the following example?
Your example shows a signature but not an implementation. How do you plan to provide an efficient implementation using only `Equatable`?
I didn't intend to provide an efficient implementation using only `Equatable`. I mean it is good to make it possible to choose an appropriate one, `Hashable`, `Comparable`, or another, depending on the case.
I intended to use trees for an efficient implementation in the case I showed, as implied by the requirement `Value.ID: Comparable`. I chose it because the `Table` type represents a database table, whose indices are usually implemented using some kind of tree.
It is also possible to think of an example of more flexible usage of `Identifiable` for tables (some database systems permit using hash tables for indices instead of trees).
    protocol Identifiable {
        associatedtype ID: Equatable
        var id: ID { get }
    }
    protocol TableProtocol {
        associatedtype Value: Identifiable
        func selectedValue(wherePrimaryKeyIs key: Value.ID) -> Value?
        ...
    }
    struct Table<Value: Identifiable>: TableProtocol where Value.ID: Comparable {
        func selectedValue(wherePrimaryKeyIs key: Value.ID) -> Value? {
            ...
        }
        ...
    }
    struct HashTable<Value: Identifiable>: TableProtocol where Value.ID: Hashable {
        func selectedValue(wherePrimaryKeyIs key: Value.ID) -> Value? {
            ...
        }
        ...
    }
    struct SingletonTable<Value: Identifiable>: TableProtocol {
        private var value: Value
        init(value: Value) {
            self.value = value
        }
        func selectedValue(wherePrimaryKeyIs key: Value.ID) -> Value? {
            guard value.id == key else { return nil }
            return value
        }
        ...
    }
Neither `Hashable` nor `Comparable` is required for the last one, `SingletonTable`.
> I intended to use trees for an efficient implementation in the case I showed, as implied by the requirement `Value.ID: Comparable`. I chose it because the `Table` type represents a database table, whose indices are usually implemented using some kind of tree.
I see, I must have misread your post. I still think the `Hashable` constraint is warranted.
No significant downsides to including it have been articulated. The closest example is a not-necessarily-optimal `Hashable` conformance (usually synthesized by the compiler) that may not be used. The downside to omitting `Hashable` is having to write out an additional constraint at many usage sites. This can make signatures more difficult to understand, especially for programmers who are less familiar with generics.
On balance, I think the benefit of including it outweighs the relatively minimal cost.
    a == b   // value equality
    a === b  // object identity
    a ==== b // record identity
    collectionA.difference(from: collectionB, by: ====)
    collectionA.startsWith(collectionB, by: ====)
The thing about this is that if two live objects have the same reference identity/memory address (by `===`), they must also be considered substitutable (by Equatable's `==`). Swift does not allow any kind of funky aliasing that could lead to two references to the same address being considered different by Equatable's semantics.
In other words, if `a === b`, then `a == b` in all valid programs.
Meanwhile, the purpose of this protocol is that two objects that are not considered substitutable (i.e. `a == b` might be false) may have the same record identity (i.e. `a ==== b` is true).
In other words, if `a ==== b`, then `a == b` may or may not be true.
A record identity operator might be a useful shorthand, but it also has the potential to be confusing given this difference.
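A small sketch of the three relations, using a plain function `sameRecord` in place of the hypothetical `====` operator. The `Contact` and `Account` types and the helper are assumptions for illustration only.

```swift
// A value-type snapshot of an entity; `id` names the entity, not the snapshot.
struct Contact: Equatable {
    let id: Int
    var name: String
}

// Hypothetical helper standing in for the `====` spelling discussed above.
func sameRecord(_ lhs: Contact, _ rhs: Contact) -> Bool {
    lhs.id == rhs.id
}

let before = Contact(id: 7, name: "Ann")
var after = before
after.name = "Anne"

// Two snapshots of the same record that are not substitutable.
let substitutable = before == after
let sameEntity = sameRecord(before, after)

// For a reference type with a well-behaved ==, === implies ==.
final class Account: Equatable {
    let number: Int
    init(number: Int) { self.number = number }
    static func == (lhs: Account, rhs: Account) -> Bool {
        lhs.number == rhs.number
    }
}

let account = Account(number: 1)
let alias = account
let identicalAndEqual = alias === account && alias == account
```

Here `before` and `after` share a record identity while comparing unequal, which is the direction of implication that differs from `===`.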
This is a great addition to the standard library!
I think `id` for the variable name is a practical tradeoff between clarity and brevity. There is so much prior art that it shouldn't be ambiguous, and it would be used quite often in code that uses this protocol.
I do agree, though, that the `associatedtype` should be spelled `Identifier`. The `associatedtype` itself will be referenced several orders of magnitude less often than the `id` property, so it seems less important to prefer brevity.
Perhaps:
    protocol Identifiable {
        associatedtype Identifier: Hashable
        var id: Identifier { get }
    }
> There is one gotcha that seems a bit cagey. The protocol allows the `id` to potentially be re-assigned or re-generated. Consider the following usage:
>
>     struct Contact: Identifiable {
>         var id: Int { generateID() }
>         var name: String
>     }
>
> That would mean that any access to `id` would return a newly generated identifier. This would probably be really bad. Furthermore the protocol allows the var to be assignable too (which has the same failure mode as returning a hash). These objections should not be considered blocking, but more as something that should be considered, imho.
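
The gotcha can be reproduced in a few lines. This is only a sketch: it uses `UUID` from Foundation in place of the `generateID()` helper from the snippet, and the types `UnstableContact` and `StableContact` are hypothetical. They deliberately do not conform to any protocol, since the behaviour comes from the property declaration alone.

```swift
import Foundation

// Computed `id`: a fresh identifier is produced on every access.
struct UnstableContact {
    var id: UUID { UUID() }
    var name: String
}

// Stored `id`: generated once when the value is created.
struct StableContact {
    let id = UUID()
    var name: String
}

let unstable = UnstableContact(name: "Ann")
let stable = StableContact(name: "Ann")

// Reading `id` twice from the same value.
let unstableMatches = unstable.id == unstable.id
let stableMatches = stable.id == stable.id
```

A `{ get }` requirement accepts both declarations, so a protocol alone cannot tell the two apart.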
FWIW, CoreData does this. Create a new instance of an entity, and the `objectID` will be some temporary ID. Save the context, and that object now has a new, permanent `objectID`.
Actually, while testing this out I noticed CoreData has some fascinating behaviour in this area. Check this out:
CoreData example
    import CoreData
    let url = NSURL(fileURLWithPath: NSTemporaryDirectory(), isDirectory: true).appendingPathComponent("testDB.sqlite")!
    print(url)
    let stack = try! CDStack(url: url)
    try! stack.doTest()
    class CDStack {
        let coord: NSPersistentStoreCoordinator
        let moc: NSManagedObjectContext
        init(url: URL) throws {
            let model = NSManagedObjectModel()
            do {
                let ent = NSEntityDescription()
                ent.name = "MyEntity"
                do {
                    let valAttr = NSAttributeDescription()
                    valAttr.attributeType = .floatAttributeType
                    valAttr.name = "myProp"
                    ent.properties = [valAttr]
                }
                model.entities = [ent]
            }
            coord = NSPersistentStoreCoordinator(managedObjectModel: model)
            try coord.addPersistentStore(ofType: NSSQLiteStoreType, configurationName: nil, at: url, options: nil)
            moc = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
            moc.persistentStoreCoordinator = coord
        }
        func doTest() throws {
            func printID(id: NSManagedObjectID) {
                print("ID: \(id.uriRepresentation()) isTemp=\(id.isTemporaryID)")
            }
            let newObj = NSEntityDescription.insertNewObject(forEntityName: "MyEntity", into: moc)
            newObj.setValue(NSNumber(floatLiteral: 3.141), forKey: "myProp")
            // Print the object's ID. Should be temporary.
            let tempID = newObj.objectID
            printID(id: tempID)
            assert(tempID.isTemporaryID)
            // Save the MOC.
            try moc.save()
            print("Saved ✌️")
            // Print the ID we got before the save. Should still be temporary.
            printID(id: tempID)
            assert(tempID.isTemporaryID)
            // Print the object's ID. Should be non-temporary.
            printID(id: newObj.objectID)
            assert(newObj.objectID.isTemporaryID == false)
            assert(tempID != newObj.objectID, "objectID should have changed")
            assert(tempID.isEqual(to: newObj.objectID) == false, "objectID should have changed")
            // But we can still fetch using the old (temporary) ID.
            let fetchedTemp = moc.object(with: tempID)
            let fetchedPerm = moc.object(with: newObj.objectID)
            assert(fetchedTemp.objectID == tempID)
            assert(fetchedPerm.objectID == newObj.objectID)
            // Prints two different objects! '===' returns false!
            print(fetchedTemp, fetchedPerm, fetchedTemp === fetchedPerm)
        }
    }
So after the context save, `tempID` remains unchanged (i.e. essentially an immutable object), but `newObj.objectID` now returns a different (permanent) ID. That said, we can still query the DB using the temporary ID, and if we do that we get some other object (not `newObj`).
I'm sure there are reasons why they did it this way, and I think it shows that there are use-cases for reassigning an object's ID.
We could even go further with the renaming to avoid some confusion:
    protocol IdentifiableContent {
        associatedtype ContentIdentifier: Hashable
        var id: ContentIdentifier { get }
    }
That should make it clear this isn't about object identity.
Hello,
I wish the identifier would not be called `id` or `ID`, but something much longer like `identifier` or `identity`.
The reason why I think `id` is a bad fit for this protocol is that types that will conform to it will often have existing identifiers which are not easy to rename and which will clash with the protocol: `uid`, `uuid`, `fooId`, `_id`. Yeah, real-world plain data objects that map database records or JSON objects can have such properties.
We do not want the protocol to "mess" with those conventions, and introduce confusion.
My point of view is that `identity` is a very good name for our purpose.
I would like to further push for `identity` because this word matches the purpose of the protocol. Identity is what does not change when objects change. It is the one constant quality of an object, stable and distinct from other objects of its kind.