Need Help Understanding Protocols and Generics

xwu · June 20, 2020, 8:23pm

Hmm, it is very difficult to figure this out indeed.

As you know, there are three ways to get dynamic dispatch in Swift: classes (using overridable class members), protocols (by implementing protocol requirements), and dynamic casting. In other cases, when two declarations have the same name and would otherwise both be visible in the same scope, one declaration is said to shadow the other. This behavior is quite distinct from dynamic dispatch, as you also know.
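To make the difference between dispatch and shadowing concrete, here is a minimal sketch. The names `Shape`, `Circle`, `name()`, and `describe()` are made up for illustration and are not part of the examples in this thread; `name()` is a protocol requirement, while `describe()` exists only in an extension.

```swift
protocol Shape {
    // A requirement: calls made through `Shape` are dynamically dispatched.
    func name() -> String
}

extension Shape {
    // Default implementation of the requirement.
    func name() -> String { "Shape" }
    // Not a requirement, only an extension member: it can be shadowed,
    // but it is never dynamically dispatched.
    func describe() -> String { "Shape" }
}

struct Circle: Shape {
    func name() -> String { "Circle" }      // implements the requirement
    func describe() -> String { "Circle" }  // shadows the extension member
}

let circle = Circle()
let shape: Shape = circle

print(circle.name())      // "Circle"
print(shape.name())       // "Circle": dispatched through the requirement
print(circle.describe())  // "Circle": the struct's member shadows the extension's
print(shape.describe())   // "Shape": only the extension member is visible via Shape
```

The last two lines are the shadowing case: which `describe()` runs depends only on the static type of the expression, not on the value stored in it.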

The examples you give here mix all three ways of getting dynamic dispatch. The behavior could not be more complicated. Having puzzled over this in detail, I do wonder if Swift is doing something unexpected in terms of how these features interact with each other. But I'm having trouble even thinking about this advanced case without my head hurting; let me show you my work so far:

I'll begin by saying that this is my mental model of how Swift does dynamic dispatch. No doubt it is not 100% accurate, because nothing except "whatever the compiler currently does" is 100% accurate, and that cannot be modeled.

Ultimately, we will have to decide if the overarching system is what we want, but that is about a hundred steps ahead of where we are. First, we need to enunciate what that system is in a way that doesn't refer to its implementation. Then, once we have a consensus on that answer, we need to fix the bugs in the implementation of that system, document it, and find a way to teach it. Once we have done that, then we can perhaps talk intelligently about what we don't like about what we have.

With that in mind, here it goes, an attempt to describe how dynamic dispatch works without reference to the actual implementation details:

Example 1

protocol P { var id: String { get } }
protocol Q: P { }
extension P { var id: String { "P" } }
extension Q { var id: String { "Q" } }
struct S: Q { var id: String { "S" } }

let s = S()
s.id // "S"

Does id need to be dynamically dispatched?
No, because it is a member of a struct.

If not, which implementation is statically chosen?
S.id.

Example 2

// Continued from example 1.
let q = s as Q
q.id // "S"

Does id need to be dynamically dispatched?
Yes, because we have dynamically cast s to Q, and Q.id is a protocol requirement.

If so, via the interface of what type or protocol?
Via P, because that protocol declares the requirement within the hierarchy of Q (to which we have dynamically cast the value).

Via that interface, which implementation is chosen?
S.id, because it is the "most specific" implementation available for P.id for this value.

Example 3

// Continued from example 2.
func f<T: P>(_ x: T) { print(x.id) }
f(s) // "S"

Does id need to be dynamically dispatched?
Yes, because P.id is a protocol requirement.

If so, via the interface of what type or protocol?
Via P, because that protocol declares the requirement within the hierarchy of P, which is the only interface we have given the generic constraint.

Via that interface, which implementation is chosen?
S.id, because it is the "most specific" implementation available for P.id for this T.

Example 4

// Continued from example 3.
protocol Q1: P { var id: String { get } }
extension Q1 { var id: String { "Q1" } }
struct T: Q1 { }

let q1 = T() as Q1
q1.id // "Q1"

Does id need to be dynamically dispatched?
Yes (see example 2).

If so, via the interface of what type or protocol?
Via P, because that is the protocol that declares the requirement within the hierarchy of Q1. Note that Q1 does not define a new, distinct requirement; it merely restates it.

Via that interface, which implementation is chosen?
Q1.id, because it is the "most specific" default implementation available for P.id.
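One way to probe the claim that a restated requirement is not a new, distinct one is to go through generic functions constrained to each protocol in turn. This is only a sketch with hypothetical names (`Root`, `Restating`, `Value`, `viaRoot`, `viaRestating`) standing in for P, Q1, and T above; if the restated requirement were distinct, the two calls could plausibly disagree.

```swift
protocol Root { var id: String { get } }
// Restates the requirement from Root without adding anything new.
protocol Restating: Root { var id: String { get } }

extension Root { var id: String { "Root" } }
extension Restating { var id: String { "Restating" } }

// No implementation of its own, so a default implementation is used.
struct Value: Restating { }

// Each function sees the value only through the named protocol.
func viaRoot<T: Root>(_ x: T) -> String { x.id }
func viaRestating<T: Restating>(_ x: T) -> String { x.id }

print(viaRoot(Value()))       // "Restating"
print(viaRestating(Value()))  // "Restating"
```

Both calls end up at the same default implementation, which fits the model that there is a single requirement with a single "most specific" implementation.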

Example 5

// Continued from example 4.
struct Generic<U>: P { }
extension Generic: Q where U: Equatable {
  var id: String { "Generic: Q where U: Equatable" }
}

let g = Generic<Int>()
g.id // "Generic: Q where U: Equatable"

Does id need to be dynamically dispatched?
No (see example 1).

If not, which implementation is statically chosen?
Generic<Int>.id, which is implemented in the extension Generic: Q where U: Equatable.

Example 6

// Continued from example 5.
// Recall the definition of `f`:
//   func f<T: P>(_ x: T) { print(x.id) }

f(g) // "P"

Does id need to be dynamically dispatched?
Yes (see example 3).

If so, via the interface of what type or protocol?
Via P (see example 3).

Via that interface, which implementation is chosen?
P.id, because it is the "most specific" implementation available for P.id for this T.

Note that Swift requires every T == Generic<U> to behave identically. We can think of it as dynamic dispatch looking only "one level deep" when it comes to generics: it can look through T to see Generic<U>, but it doesn't look through U to see Int.
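The "one level deep" behavior can be reproduced in a smaller setting. The sketch below uses hypothetical names (`Named`, `Box`, `viaGeneric`) and, as a simplification, a plain constrained extension rather than the conditional conformance to Q used in example 5.

```swift
protocol Named { var name: String { get } }
extension Named { var name: String { "default" } }

// The conformance is unconditional, so it must work for every Value.
struct Box<Value>: Named { }

// Available only when Value is Equatable, so it cannot serve as the
// implementation of Named.name for all Box<Value>.
extension Box where Value: Equatable {
    var name: String { "equatable box" }
}

func viaGeneric<T: Named>(_ x: T) -> String { x.name }

let box = Box<Int>()
print(box.name)         // "equatable box": chosen statically for Box<Int>
print(viaGeneric(box))  // "default": the one implementation shared by every Box<Value>
```

Inside `viaGeneric`, the only thing known about `T` is its conformance to `Named`, and that conformance was fixed once for all `Box<Value>`, without looking through to `Int`.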

Example 7

// Continued from example 6.
class C1: P { var id: String { "C1" } }
class D1: C1 { override var id: String { "D1" } }

let d1 = D1()
d1.id // "D1"

Does id need to be dynamically dispatched?
Yes, because it is an overridable member of a (non-final) class.

If so, via the interface of what type or protocol?
Via C1, because that is the (non-final) class that defines the overridable member. D1 does not define a new, distinct overridable member; it overrides a superclass member.

Via that interface, which implementation is chosen?
D1.id, because it is an override of C1.id.

Example 8

// Continued from example 7.
let c1 = d1 as C1
c1.id // "D1"

Does id need to be dynamically dispatched?
Yes, because we have dynamically cast d1 to C1, and C1.id is an overridable member of a (non-final) class.

If so, via the interface of what type or protocol?
Via C1, because that is the (non-final) class that defines the overridable member within the hierarchy of C1 (to which we have dynamically cast the instance).

Via that interface, which implementation is chosen?
D1.id, because it is an override of C1.id for this instance.

Example 9

// Continued from example 8.
class C: P { }
class D: C { var id: String { "D" } }

let d = D()
d.id // "D"

Does id need to be dynamically dispatched?
Yes, because it is an overridable member of a (non-final) class.

If so, via the interface of what type or protocol?
Via D, because that is the (non-final) class that defines the overridable member (see example 10).

Via that interface, which implementation is chosen?
D.id, because there is no more specific override.

Example 10

// Continued from example 9.
let c = d as C
c.id // "P"

Does id need to be dynamically dispatched?
Yes, because we have dynamically cast d to C, and C.id is a (theoretically) overridable member of a (non-final) class, even though it is impossible for subclasses actually to override it. So far as D is concerned, we're allowed to declare D.id because C.id can't be marked final, but D.id can only shadow but not override C.id because D.id can't be marked override. (See examples below for additional discussion.)

If so, via the interface of what type or protocol?
Via C, for the reasons given above.

Via that interface, which implementation is chosen?
C.id, which is to say P.id. For the reasons given above, D.id shadows but cannot override C.id.
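For contrast, here is a sketch of what happens when the base class declares the member itself, so that the subclass can genuinely override it. The names `Identified`, `Base`, `Derived`, and `viaGeneric` are hypothetical and chosen to avoid clashing with P, C, and D above.

```swift
protocol Identified { var id: String { get } }
extension Identified { var id: String { "Identified" } }

// Base declares `id` as a class member, so it is really overridable.
class Base: Identified {
    var id: String { "Base" }
}

// `override` is accepted here because there is a superclass member to override.
class Derived: Base {
    override var id: String { "Derived" }
}

func viaGeneric<T: Identified>(_ x: T) -> String { x.id }

let derived = Derived()
print((derived as Base).id)        // "Derived"
print((derived as Identified).id)  // "Derived"
print(viaGeneric(derived))         // "Derived"
```

In this variant the protocol requirement is implemented by a class member, so class dispatch takes over from there and the subclass's override is found in every case, unlike the shadowing situation in example 10.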

Example 11

// Continued from example 10.
// Recall the definition of `f`:
//   func f<T: P>(_ x: T) { print(x.id) }
f(d) // "P"

Does id need to be dynamically dispatched?
Yes, because P.id is a protocol requirement (see example 3).

If so, via the interface of what type or protocol?
Via P, because that is the protocol that declares the requirement within the hierarchy of P.

Via that interface, which implementation is chosen?
P.id, because it is the "most specific" default implementation available for P.id. For the reasons given in example 10, D.id shadows but is not an implementation of P.id.

Example 12

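// Continued from example 11.
// Note: as declared in example 9, D conforms only to P (via C). For the call
// below to compile, D must also conform to Q; this assumes an added
// `extension D: Q { }`, which is not shown in the earlier examples.
func fcq<T: C & Q>(_ x: T) { print(x.id) }
fcq(d) // "Q"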

Does id need to be dynamically dispatched?
Yes, because C.id is an overridable member of a non-final class, even though it is impossible for subclasses actually to override it. (See example 10.)

If so, via the interface of what type or protocol?
Via C, because that is the non-final class with the overridable member.

Via that interface, which implementation is chosen?
Q.id, because it is "more specific" than P.id, which would otherwise be the implementation for C.id.

Example 13

// Continued from example 12.
func fq<T: Q>(_ x: T) { print(x.id) }
fq(d) // "P"

Does id need to be dynamically dispatched?
Yes, but now because Q.id is a protocol requirement.

If so, via the interface of what type or protocol?
Via P, because that is the protocol that declares the requirement within the hierarchy of Q.

Via that interface, which implementation is chosen?
C.id, which is to say the default implementation P.id. Although it is non-overridable and shadowed by D.id, it is ranked "more specific" than Q.id because the concrete type inherits it via subclassing, whereas Q.id is only a default implementation.

This is starting to get strange...

Example 14

// Continued from example **1**.
// Recall that we've defined the following protocols:
//   protocol P { var id: String { get } }
//   protocol Q: P { }
// ...and default implementations:
//   extension P { var id: String { "P" } }
//   extension Q { var id: String { "Q" } }

class C { }
class D: C, Q { var id: String { "D" } }
(D() as (AnyObject & Q)).id // "D"

Does id need to be dynamically dispatched?
Yes, because we have dynamically cast an instance to AnyObject & Q, and without a concrete implementation of (AnyObject & Q).id, Swift looks to the dynamically dispatched protocol requirement.

If so, via the interface of what type or protocol?
Via P, because that is the protocol that declares the requirement within the hierarchy of Q.

Via that interface, which implementation is chosen?
D.id, because it is the "most specific" implementation of P.id.

Example 15

// Continued from example **1**.
// Recall that we've defined the following protocols:
//   protocol P { var id: String { get } }
//   protocol Q: P { }
// ...and default implementations:
//   extension P { var id: String { "P" } }
//   extension Q { var id: String { "Q" } }

class C: P { } // Note the difference here from example 14.
class D: C, Q { var id: String { "D" } }
(D() as (AnyObject & Q)).id // "P"
(D() as Q).id // "P"

Does id need to be dynamically dispatched?
Yes, because we have dynamically cast an instance to AnyObject & Q or Q, and without a concrete implementation, Swift looks to the dynamically dispatched protocol requirement.

If so, via the interface of what type or protocol?
Via P, because that is the protocol that declares the requirement within the hierarchy of Q.

Via that interface, which implementation is chosen?
P.id (see example 13).

As discussed in example 4, Q.id is not a new requirement distinct from P.id, and D can conform to P in only one way because types can conform to protocols in only one way. That one way is inherited from C, even though D implements its own conformance to Q. Meanwhile, D.id can shadow C.id for the same reason as in example 10.

Without inheritance in the mix, Swift does consider Q.id as "more specific" than P.id:

class C2: Q { }
(C2() as Q).id // "Q"

struct S2: Q { } // Named S2 because S is already declared in example 1.
(S2() as Q).id // "Q"

Example 16

// Continued from example 15.
(D() as (C & Q)).id // "Q"

Does id need to be dynamically dispatched?
Yes, because we have dynamically cast an instance to C & Q and C.id is an overridable member of a non-final class, even though it is impossible for subclasses actually to override it.

If so, via the interface of what type or protocol?
Via C (see examples 10 and 12).

Via that interface, which implementation is chosen?
Q.id (see examples 10 and 12 for reasoning).

Example 17

// Continued from example **4**.
// Recall that we've defined the following protocols:
//   protocol P { var id: String { get } }
//   protocol Q: P { }
//   protocol Q1: P { var id: String { get } }
// ...and default implementations:
//   extension P { var id: String { "P" } }
//   extension Q { var id: String { "Q" } }
//   extension Q1 { var id: String { "Q1" } }

class C: P { }
class D: C, Q, Q1 { var id: String { "D" } }
(D() as (C & Q)).id // "Q"

Does id need to be dynamically dispatched?
Yes (see example 16).

If so, via the interface of what type or protocol?
Via C (see example 16).

Via that interface, which implementation is chosen?
Q.id (see example 16).

Example 18

// Continued from example 17.
(D() as (C & Q1)).id // "P"

Does id need to be dynamically dispatched?
Yes (see example 16).

If so, via the interface of what type or protocol?
Via C (see example 16).

Via that interface, which implementation is chosen?
?!?!?!