SE-0261: Identifiable Protocol

donny_wals · July 9, 2019, 7:58am

Overall +1, but I think the default implementation could lead to unexpected bugs. It would be better if the conforming type provided its id (or some other name) explicitly.
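A minimal sketch of the kind of surprise this concern points at, assuming the protocol as it later shipped in the Swift 5.1 standard library, where classes get a default id based on ObjectIdentifier. The Account type and its accountNumber property are hypothetical.

```swift
// Hypothetical class that conforms without providing its own `id`.
final class Account: Identifiable {
    let accountNumber: String
    init(accountNumber: String) { self.accountNumber = accountNumber }
    // No explicit `id`: classes fall back to ObjectIdentifier(self).
}

// Two objects loaded for the same underlying record.
let first = Account(accountNumber: "A-1")
let second = Account(accountNumber: "A-1")

// The default identity follows the object, not the record it represents.
let sameDefaultID = first.id == second.id
let matchesObjectIdentifier = first.id == ObjectIdentifier(first)
print(sameDefaultID, matchesObjectIdentifier)  // false true
```

Whether that fallback is a bug depends on what the type models; providing id explicitly avoids having to guess.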

ktraunmueller · July 9, 2019, 10:01am

What is your evaluation of the proposal?

-1. It tries to attach the concept of identity to value types. To me, that's a contradiction and source of confusion. Values can be compared for equality, but they do not have identity. That's a simple, easy to grasp, and fundamental assumption in the split between value and reference types in the Swift type system.

Is the problem being addressed significant enough to warrant a change to Swift?

No-ish. I think this partly comes out of SwiftUI's design choice of representing views using value types, and for (likely) technical reasons, there needs to be a way to update existing view instances, which requires some means of identifying, or distinguishing between, instances of a value type. So this comes out of a very specific use case or design approach of one specific system (SwiftUI).

I dig the collection diffing application, though. Maybe in some more constrained form provide this as an addition to the standard library.

Does this proposal fit well with the feel and direction of Swift?

No. Identity of value types is something that will probably be used with different semantics by different people. I think it's a source of confusion. If you need identity, use reference types.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

n/a

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Quick reading.

sveinhal · July 9, 2019, 10:13am

Although simple to grasp, it is also not true. Many models represent a snapshot of a de-facto identifiable record. This proposal makes it easier to work with that fact, even if it does mean that the programmer needs to reason about the difference between record identity and reference semantics. However, that problem exists independent of the proposed protocol.

hartbit · July 9, 2019, 11:27am

Example:

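import Foundation

// Equatable is needed for the != comparison below.
struct Person: Equatable {
    let id: UUID
    let firstName: String
    let lastName: String
}

let uuid = UUID() // any stable identifier works here
let beforeWedding = Person(id: uuid, firstName: "Jane", lastName: "Doe")
let afterWedding = Person(id: uuid, firstName: "Jane", lastName: "Smith")

beforeWedding != afterWedding        // true: the data differs
beforeWedding.id == afterWedding.id  // true: the identity is the same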

Same person, different data. This protocol allows a generic way to tell the difference.
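To illustrate the "generic way" part, here is a minimal sketch that assumes the Identifiable protocol as it later shipped in the Swift 5.1 standard library. The isSameRecord helper and the stranger value are hypothetical names introduced only for this example.

```swift
import Foundation

struct Person: Identifiable, Equatable {
    let id: UUID
    let firstName: String
    let lastName: String
}

// Hypothetical helper: works for any Identifiable type, not only Person.
func isSameRecord<T: Identifiable>(_ a: T, _ b: T) -> Bool {
    a.id == b.id
}

let uuid = UUID()
let beforeWedding = Person(id: uuid, firstName: "Jane", lastName: "Doe")
let afterWedding = Person(id: uuid, firstName: "Jane", lastName: "Smith")
let stranger = Person(id: UUID(), firstName: "Jane", lastName: "Doe")

print(isSameRecord(beforeWedding, afterWedding))  // true: same person
print(beforeWedding == afterWedding)              // false: different data
print(isSameRecord(beforeWedding, stranger))      // false: different person
```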

ktraunmueller · July 10, 2019, 7:57am

A Person definitely has identity, agreed. But for this reason, a Person is conceptually not a value (type) to me. "Use a reference type instead" would be my approach to this.

Using a different example: it shouldn't be possible to make e.g. a Rect (Point, etc.) Identifiable. That would just go against my mental model of a Rect.

Along the same lines, I don't see why collection diffing for Rects would make sense. Comparing arrays of Rects in terms of equality, that's fine, but saying "this value has moved two slots in the collection" doesn't fit my mental model of values.

I can definitely see Identifiable and collection diffing as a valuable addition to the standard library (the container types, collection protocols and collection algorithms live there as well), but it would need to be somewhat more constrained. As I said, Identifiable Rects should not be possible.
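For types that do model records, the diffing use case can be sketched roughly as follows. This is an illustration only: Contact and the changes(from:to:) helper are hypothetical, the helper is not a standard library API, and it assumes ids are unique within each array.

```swift
struct Contact: Identifiable, Equatable {
    let id: Int
    var name: String
}

// Hypothetical helper: classifies records by id rather than by position.
// Assumes ids are unique; Dictionary(uniqueKeysWithValues:) traps otherwise.
func changes<T: Identifiable & Equatable>(from old: [T], to new: [T])
    -> (inserted: [T.ID], removed: [T.ID], updated: [T.ID]) {
    let oldByID = Dictionary(uniqueKeysWithValues: old.map { ($0.id, $0) })
    let newIDs = Set(new.map { $0.id })

    var inserted: [T.ID] = []
    var updated: [T.ID] = []
    for item in new {
        if let previous = oldByID[item.id] {
            // Same identity, different data: the record was updated.
            if previous != item { updated.append(item.id) }
        } else {
            inserted.append(item.id)
        }
    }
    let removed = old.map { $0.id }.filter { !newIDs.contains($0) }
    return (inserted, removed, updated)
}

let before = [Contact(id: 1, name: "Jane Doe"), Contact(id: 2, name: "Sam Lee")]
let after = [Contact(id: 1, name: "Jane Smith"), Contact(id: 3, name: "Ana Cruz")]

let result = changes(from: before, to: after)
print(result.inserted, result.removed, result.updated)  // [3] [2] [1]
```

With Equatable alone, the rename of contact 1 could only be reported as one removal plus one insertion; the id is what lets it be reported as an update.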

tomguthrie · July 10, 2019, 8:49am

This protocol allows you to model Person as a value type, though, which solves many of the state and threading problems that modelling database values as reference types introduces, for example.

Yes, it doesn't make sense for Rect to be Identifiable, but it also doesn't make sense for it to be ExpressibleByIntegerLiteral, and you could still do that if you wanted to. It already has a constraint of having to provide an id property. Isn't that enough?

ktraunmueller · July 10, 2019, 8:54am

Passing copies of values around introduces different problems (e.g., updating a copy won't update the original value, which can lead to subtle bugs).

That's not a constraint, that's a protocol requirement.

hartbit · July 10, 2019, 9:24am

I have considerably reduced the number of bugs in my database code since I dropped reference types and used value-type copies instead. I won't be going back to reference types anytime soon.

Avi · July 10, 2019, 9:36am

Value semantics lead one to discover the concept of one-source-of-truth. Once you get there, you never go back. Data inconsistency bugs become trivial to diagnose and fix.
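A rough sketch of how value-type copies and a single source of truth can fit together, assuming the Identifiable protocol from the Swift 5.1 standard library. Store, save(_:) and record(withID:) are hypothetical names, not an existing API.

```swift
struct Person: Identifiable {
    let id: Int
    var lastName: String
}

// Hypothetical single source of truth: records are looked up and replaced by id.
struct Store<Record: Identifiable> {
    var records: [Record.ID: Record] = [:]

    mutating func save(_ record: Record) {
        records[record.id] = record
    }

    func record(withID id: Record.ID) -> Record? {
        records[id]
    }
}

var store = Store<Person>()
store.save(Person(id: 1, lastName: "Doe"))

// Editing a copy leaves the stored value untouched...
var copy = store.record(withID: 1)!
copy.lastName = "Smith"
let nameBeforeSave = store.record(withID: 1)?.lastName

// ...until the copy is written back explicitly, matched by its id.
store.save(copy)
let nameAfterSave = store.record(withID: 1)?.lastName

print(nameBeforeSave ?? "-", nameAfterSave ?? "-")  // Doe Smith
```

The stale-copy concern raised earlier in the thread still applies; the id only gives the write-back a well-defined target.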

sveinhal · July 10, 2019, 10:22am

These are tools. You can use them and you can misuse them. That Identifiable has the potential to be used in a way that confounds the nuances of identity is no different from classes having the potential to be used in lieu of value types, or vice versa.

Identifiable has real value for real problems. Use it when it makes sense. But use something else, when it doesn't.

ktraunmueller · July 10, 2019, 10:38am

That's a very good angle on the problem. Taking this pragmatic viewpoint (the standard library provides a set of tools, use them wisely), I'd change my overall evaluation to +1.

frameworklabs · July 10, 2019, 10:53am

For the mental model of identity, it helps me to think about Entity-Component-System frameworks. Components hold some state of an Entity - either as structs or classes. An Entity itself can also be modeled as either struct or class. If it is a struct then it needs a unique value in some domain - maybe an index into the list of all Entities or a UUID. If it is a class, the address of the object might serve that purpose.

If a Component adopts Identifiable, then id should return the identity of the Entity, even if the Component is an instance of a class.
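One way to read this in code, as a sketch only: Entity, Position and Health are hypothetical types, and the example assumes the Identifiable protocol from the Swift 5.1 standard library, where a class that declares its own id does not use the ObjectIdentifier-based default.

```swift
// The Entity is a struct, so it carries an explicit unique value.
struct Entity: Identifiable {
    let id: Int
}

// A struct component: its identity is the identity of the owning Entity.
struct Position: Identifiable {
    let entityID: Entity.ID
    var x: Double
    var y: Double
    var id: Entity.ID { entityID }
}

// A class component: declaring `id` explicitly replaces the
// object-address-based default, so it also reports the Entity's identity.
final class Health: Identifiable {
    let entityID: Entity.ID
    var points: Int
    init(entityID: Entity.ID, points: Int) {
        self.entityID = entityID
        self.points = points
    }
    var id: Entity.ID { entityID }
}

let player = Entity(id: 42)
let position = Position(entityID: player.id, x: 0, y: 0)
let health = Health(entityID: player.id, points: 100)

print(position.id == player.id, health.id == player.id)  // true true
```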

Karl · July 10, 2019, 3:05pm

To reiterate: this has literally nothing to do with value vs. reference types. Structs and enums are great, classes are great, and it's all totally irrelevant to this topic. Even classes can have a separate notion of record identity which transcends their reference identity/memory address (see CoreData, NSManagedObjectID).

Essentially, this conformance communicates that a value is part of a larger, non-trivial dataset, e.g. one contact in a database of contacts. Other kinds of types (like, say, FileManager) don't have a concept of record identity because they are not elements of any meaningful higher-order container like a table or graph.

I am disappointed that the proposal text has not been changed to clarify this. It has led to lots of confusion in this review.

anandabits · July 10, 2019, 10:15pm

How long does Combine need to store the bag of seenItems identifiers? Is it possible for these identifiers to outlive all Combine-owned copies of the value (possibly an object reference) that vended the identifier? Or does Combine always store a copy of the value at least as long as it stores the identifier in seenItems?

I asked earlier but didn't get an answer. What is the reason this API uses an existential instead of a generic constraint?

allevato · July 11, 2019, 3:53pm

I'm coming a bit late to the heavy discussion above about naming conflicts with the id property, but I had some related thoughts that might make the idea more palatable, with a little bit more help from the language.

I agree with the folks who think that it's a non-goal to try to come up with a name for this property that isn't going to collide with someone's existing code. While it's nice if that can be done, the overall design and readability of the protocol shouldn't have to suffer by being made more obfuscated.

The Swift language has an internal attribute that almost lets us have the best of both worlds: @_implements lets you declare that a property, method, or associated type implements a particular requirement of a protocol even if it has a different name (this is similar to C#'s explicit interface implementation concept).

Right now, there's just one problem: if that other name still happens to be an existing declaration on the conforming type, you end up with an ambiguity:

protocol Identifiable {
  var id: String { get }
}

struct Record {
  var id: String { return "Record.id" }
}

extension Record: Identifiable {
  @_implements(Identifiable, id)
  var idForIdentifiable: String { return "Record(Identifiable).id" }
}

let r = Record()
print(r.id)  // Desired: "Record.id", but error below 🙁 

let i: Identifiable = r
print(i.id)  // Desired: "Record(Identifiable).id"

main.swift:15:7: error: ambiguous use of 'id'
print(r.id)
      ^
main.swift:6:7: note: found this candidate
  var id: String { return "Record.id" }
      ^
main.swift:11:7: note: found this candidate
  var idForIdentifiable: String { return "Record(Identifiable).id" }

I would propose two things (which certainly shouldn't be combined with this proposal, but which offer a path that may ease the concerns in the discussion above):

1. @_implements should be made public.
2. Modify the behavior of @_implements to remove the ambiguity; in the example above, if you refer to id on an instance of the concrete type Record, then it would only refer to the concrete type's property and not the renamed protocol requirement.

bzamayo · July 11, 2019, 4:17pm

Yeah, a couple years ago on the mailing list there was also chatter of syntax like var Identifiable.id: Int floating around that would achieve the same result.

There are clearly ways that, long-term, property name collisions could be disambiguated at the language level. I think using id is fine as proposed.

Tino · July 11, 2019, 4:46pm

It's probably not the first occurrence, but here's one mention I remember:

(even that is very outdated, though :-)

I'd favor the name that is most likely to clash with existing code (that's the best one ;-) — and if we would ever get a way to explicitly refer to a protocol member, that could resolve the conflict... but this feature wouldn't be for free: Suddenly, calling id on an object could produce different results, depending on the context.

Update: The syntax might be reusable for the topic "How to unambiguously refer to a symbol defined in an extension in third party module?"

JaapWijnen · July 11, 2019, 6:30pm

It would be cool to consider a design that doesn't require initializing the id value when the Identifiable value is created, e.g. by making id optional. That would also make the proposal a better fit for server-side Swift applications. Vapor, for example, implements something similar in its Fluent package: the object has an optional id property and also implements a requireID() function that throws when id is nil. (IDs are initialized when the object is saved to a database, for example.) This works quite nicely and lets users either unwrap the optional themselves or call a throwing function.

anandabits · July 11, 2019, 6:35pm

The proposal supports conformances where typealias ID = UUID? and similar where the identifier itself is Optional. If you want your identifier to be optional you can do that. It would be undesirable to define the protocol with var id: ID? because then it would not be possible to have a non-Optional id property.
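A minimal sketch of such a conformance, assuming the Identifiable protocol from the Swift 5.1 standard library. Draft, MissingID and this requireID() are hypothetical and only loosely modeled on the Fluent idea mentioned above; this is not Fluent's implementation.

```swift
import Foundation

// `ID` is inferred as Optional<UUID>, which is Hashable because UUID is.
struct Draft: Identifiable {
    var id: UUID?
    var title: String
}

// Hypothetical error thrown when no identifier has been assigned yet.
struct MissingID: Error {}

extension Draft {
    // Hypothetical helper: unwraps the id or throws.
    func requireID() throws -> UUID {
        guard let id = id else { throw MissingID() }
        return id
    }
}

var draft = Draft(id: nil, title: "Untitled")
let idBeforeSave = try? draft.requireID()  // nil, because requireID() throws

draft.id = UUID()                          // e.g. assigned when the record is saved
let idAfterSave = try? draft.requireID()   // the newly assigned UUID
```

Callers can still unwrap draft.id themselves; the throwing helper is just a convenience on top of the optional identifier.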

JaapWijnen · July 11, 2019, 6:41pm

Ah totally overlooked that possibility, awesome.