An Implementation Model for Rational Protocol Conformance Behavior

jrose · June 5, 2020, 1:00am

[Meta: I've been posting a lot and been (rightfully) called out on my negativity in private, so after this one I'm going to hold off on further replies till the weekend at least.]

This is a good point. Here's how I would summarize (what I see as) the intent of our current model, ignoring dynamic casts, associated type inference, and the possibility of two conflicting visible conformances (one of Doug's examples):

1. When a conformance MyType: MyProtocol is declared, the compiler goes through each requirement and asks what function would be called, given what is known at that point in the program:
   ```
   // Illustrative purposes only;
   // the compiler does not actually synthesize a function to do this.
   extension MyType {
     func __invoke_doSomething(with argument: Int) {
       self.doSomething(with: argument)
     }
   }
   ```
   If there are multiple overloads of doSomething(with:), the compiler picks the best one visible, just like it always does. The meaning of "best" is chosen at this point in time, which means that if one of the overloads is in a constrained extension based on a generic parameter of MyType, it won't get picked, because one implementation is picked to work with any MyType, whether it's MyType<Int> or MyType<[MyType<String?>]>. (It can take into account any constraints that are on the actual definition of MyType, perhaps that its generic argument has to be Equatable, as well as any constraints on the extension adding the conformance, if it's a conditional conformance.)

   All of this is the same idea as overload resolution, which does not do dynamic dispatch in Swift. You'll often find me reciting "there are three ways to do type-based dispatch in Swift: invoking a protocol requirement, calling an overridable method on a class, or actually checking the type with a dynamic cast".
2. When a MyType value is passed to a generic function with a MyProtocol constraint or converted to an existential value, the compiler finds "the" conformance of MyType to MyProtocol, and passes that to the function. (The run-time representation of the conformance is the witness table, which contains all the "witnesses" to the requirements chosen in (1), as well as the associated types, and the conformances for any constraints on the associated types.) Note that this process doesn't look at the generic arguments of MyType except to populate the "associated types" section.
3. Later on, when a protocol requirement is invoked (through a generic argument or an existential value), the "witness" found in the table in (2) is called.
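To make the three steps above concrete, here is a minimal sketch of a witness being fixed at the point of conformance. The names Describable, Box, and describeGenerically are made up for illustration and do not come from the thread.

```swift
// Hypothetical protocol and type, for illustration only.
protocol Describable {
  func describe() -> String
}

struct Box<T> {
  var value: T
}

// Available for every Box<T>.
extension Box {
  func describe() -> String { "Box of something" }
}

// Only available when T is Int; a direct call on Box<Int> prefers this one.
extension Box where T == Int {
  func describe() -> String { "Box of Int" }
}

// The witness for describe() is chosen here, once, for every Box<T>.
// The constrained overload is not usable for all T, so it is not picked.
extension Box: Describable {}

// Invokes whatever witness the conformance recorded.
func describeGenerically<D: Describable>(_ value: D) -> String {
  value.describe()
}

let intBox = Box(value: 42)
let direct = intBox.describe()                 // "Box of Int"
let viaProtocol = describeGenerically(intBox)  // "Box of something"
print(direct, viaProtocol)
```

The direct call goes through ordinary overload resolution with full knowledge of Box<Int>, while the generic call only has the witness table, which was populated without looking at the generic argument.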

That's the model as I understand it. You and Doug have already called out a weak point in (2)—"what should happen if two conformances are visible?"—compounded (in SR-12513) by leaky import visibility that lets the compiler find extension members and see conformances from an import that's in another file in the same module. And there's also the weak point in (1)—"requirements are resolved at the point where the conformance is declared"—which caused the problem with DefaultIndices in SR-12881.

(1) and (2) are also the places Doug called out as places to run a "decision procedure": either the witness chosen in (1) would be a compiler-synthesized function that picks the best overload when invoked, or the witness table fetched in (2) could be populated with the best overloads when it is created.
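As a rough illustration of the first option, here is a hand-written stand-in for a witness that runs the decision when invoked. This is only a sketch of the idea, not what the compiler does or what was proposed in detail; Greeter, Wrapper, greetAny, and greetInt are hypothetical names.

```swift
// Hypothetical protocol and type, for illustration only.
protocol Greeter {
  func greet() -> String
}

struct Wrapper<T> {
  var value: T
}

extension Wrapper {
  func greetAny() -> String { "generic wrapper" }
}

extension Wrapper where T == Int {
  func greetInt() -> String { "Int wrapper" }
}

extension Wrapper: Greeter {
  // Hand-written stand-in for a synthesized witness: it checks the
  // concrete type with a dynamic cast every time it is called.
  func greet() -> String {
    if let specialized = self as? Wrapper<Int> {
      return specialized.greetInt()
    }
    return greetAny()
  }
}

func greetGenerically<G: Greeter>(_ greeter: G) -> String {
  greeter.greet()
}

let fromInt = greetGenerically(Wrapper(value: 1))        // "Int wrapper"
let fromString = greetGenerically(Wrapper(value: "hi"))  // "generic wrapper"
print(fromInt, fromString)
```

A real decision procedure would have to rank every visible overload rather than test one hard-coded case, which is where the run-time cost comes from.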

But your original point stands. Why is it hard for users to think about generic dispatching? I'd guess that it's the same reason many users have trouble with this code:

```
extension Collection {
  func whatAmI() { print("Collection") }
}
extension Array {
  func whatAmI() { print("Array") }
}
func whatIsIt<C: Collection>(_ value: C) {
  value.whatAmI()
}

func test() {
  let x = [1, 2, 3]
  x.whatAmI() // Array
  whatIsIt(x) // Collection
}
```

For better or worse, most people's model of what generic code ought to do seems to match up with what C++ actually does: it should pick the best option based on the value you pass. It's only us Swift implementers who can look at that and say "that's overload resolution at run time", and shrink away because we've designed a language where overload resolution is non-trivial.

Changing this particular case to "work" doesn't even require a conformance table; it "just" means checking that the concrete type is an Array. There are many reasons why even this concerns me from a performance and code size perspective. But even putting those aside, is that really the interpretation we want for the code above? For Swift, I don't think it is—it's too ad hoc. Nothing links the various whatAmI operations but their names.

(EDIT: But of course, that's all that links the choices in overload resolution too, so this may not mean much.)
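For reference, the "checking that the concrete type is an Array" version mentioned above can be written by hand today with a dynamic cast. This is a sketch under the assumption that whatAmI returns a String instead of printing (so the result can be inspected); whatIsItDynamically is a made-up name.

```swift
extension Collection {
  func whatAmI() -> String { "Collection" }
}
extension Array {
  func whatAmI() -> String { "Array" }
}

// Hypothetical variant of whatIsIt that checks the concrete type at run time.
func whatIsItDynamically<C: Collection>(_ value: C) -> String {
  // Dynamic cast: succeeds only when C is actually an Array of its elements.
  if let array = value as? [C.Element] {
    return array.whatAmI()  // resolved statically against Array
  }
  return value.whatAmI()    // resolved statically against Collection
}

let fromArray = whatIsItDynamically([1, 2, 3])     // "Array"
let fromSet = whatIsItDynamically(Set([1, 2, 3]))  // "Collection"
print(fromArray, fromSet)
```

Note that the link between the two whatAmI functions is still only their name; the cast just moves the choice from compile time to run time for one hard-coded type.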