An Implementation Model for Rational Protocol Conformance Behavior

mattrips · June 5, 2020, 6:40am

Doug, thank you for your time and patience. My first pass was a bit sloppy.

I now see in your example that getBSet() and getCSet() inherently provide disambiguation. So, there is no error at that stage. I agree that the error should occur upon use of == with two different types.

Right. There is no error unless both B and C are imported. If both are imported, the error is triggered when the function ambiguously refers to Set<X>.

Why do you say there is only a single type X? We get two different types of Set<X>, "because they use different conformances of X: Hashable." So, shouldn't the two different X: Hashable also be considered as two different types? And, then, when the program attempts to instantiate an X without disambiguation, shouldn't that result in a compile-time error?

Thanks again. Good stuff.

Douglas_Gregor · June 5, 2020, 6:51am

There is a single type X, defined in A. It has a unique identity that is unchanged by it's conformances.

We have two different types spelled Set<X> in the source code because one is Set<X where X:Hashable comes from module B> and the other is Set<X where X: Hashable comes from module C>. One X, two Set<X>, because required conformances are part of type identity for uses of generic types.

X is not ambiguous. Trying to form the type Set<X> where two different conformances of X to Hashable are visible should be a compile-time error.

Doug

mattrips · June 5, 2020, 7:17am

Hmm. In Jordan's example, why shouldn't the Array version of whatAmI() be afforded the most "specialized" treatment?

Yuck.

When I further amend the example by removing the requirement that protocol P have a whatAmI() method, the output becomes:

// Array<P>
// P

Is there a semantic model underlying why we have different behavior depending upon whether or not a protocol-provided implementation is a requirement of the protocol?

To me it feels natural that a change in the concrete type could change behavior. But, I'm troubled by the idea that a function could produce different behavior depending upon whether or not it happens to be a protocol requirement.

mattrips · June 5, 2020, 7:27am

Got it. Thank you.

Can the existence of the duplicate Hashable conformances reasonably be detected at compile time at the point at which an X is instantiated, for instance as part of constructing the witness table for X? If so, should the conflicting conformances be considered an error?

Jens · June 5, 2020, 9:15am

Just to add my reaction as a humble user. For this particular example, my mental model of Swift's generics (which took a while to develop) allows me to not get surprised or confused. This example is in my opinion not showcasing any of the really-hard-to-understand parts. I include the example here again for completeness:

protocol P {
  func whatAmI()
}

extension P                      { func whatAmI() { print("P")        }  }
extension Array: P               { func whatAmI() { print("Array")    }  }
extension Array where Element: P { func whatAmI() { print("Array<P>") }  }
extension Int: P                 { func whatAmI() { print("Int")      }  }

func whatIsIt<C: P>(_ value: C) {
  value.whatAmI()
}

func test() {
  let x = [1, 2, 3]
  x.whatAmI() // Array<P>
  whatIsIt(x) // Array
}
test()

By looking at this program I can predict (and not get surprised) by the output. The generic function whatIsIt has a typeparameter C which is constrained only by C: P which means that the compiler knows nothing about C except what it knows about P. That is, it knows about P's requirement whatAmI() and can call that required method (as implemented by various particular Cs), but it does so without knowing anything about eg Element or Array, it cannot get any information from any if its call sites, it's just its signature that matters.

The only way to make whatIsIt print "Array<P>" too (which I guess might be what some would expect?) would be to rewrite the example in a way that lets the compiler know more/enough about C. As the example is currently written, it is clear that within the body of whatIsIt, the compiler has no idea about Element: P or that C can sometimes be an Array. It only knows that some C conforms to P and if that C happens to implement P's requirement rather than use a default implementation, it will call C's implementation.

That said, there are many other things I get confused by in Swift's generics, like this example which @dabrahams linked to in the OP. (Note that I do not have a clear expectation about what the intended/best/fixed behavior should be in that example, especially so when taking @Nevin's "fun fact" into account.)

But perhaps my confusion there is just caused by the same rules I claim to understand here, I haven't thought about it enough to know if that might be the case.

filip-sakel · June 5, 2020, 9:58am

@Douglas_Gregor said “A protocol requirement is meant to be a point of dynamic dispatch“ and I completely agree. Also in you example the method that is declared conditionally when the Array’s Element conforms to P, I don’t see why the final output would be correct. For example, we could write:

protocol P { func whatIAm() }

...

extension Array: P {
    func whatIAm() { print(“Array”) }
    
    func whatIAm() where Element: P 
        { print(“Array<P>”) }
}

When written this way, I think the user’s expectation would be that since the second method is more specific it will be called instead of the first one, especially when considering that whatIAm() is a protocol requirement.

Jens · June 5, 2020, 10:34am

That will not compile:

protocol P { func whatAmI() }

extension Array: P {
  func whatIAm() { print("Array") }

  func whatIAm() where Element: P { // ERROR: 'where' clause cannot be attached to a non-generic declaration
    print("Array<P>")
  }
}

dabrahams · June 5, 2020, 3:06pm

Thanks for all the replies, everybody! I've had a busy end-of-week, but I'll try to respond in the next few days.

filip-sakel · June 5, 2020, 5:36pm

Jens:

That will not compile:

protocol P { func whatAmI() }

extension Array: P {
  func whatIAm() { print("Array") }

  func whatIAm() where Element: P { // ERROR: 'where' clause cannot be attached to a non-generic declaration
    print("Array<P>")
  }
}

Oh sorry, I was compiling on 5.3. Nonetheless, It's the same as what you wrote. The point I wanted to make is that the expectation of the user when seeing the code written this way - I think - would be that the more specialised method (the second one) be called.

Jens · June 5, 2020, 8:11pm

Ah, I didn't know that was supported in 5.3, Here is the complete program with your modification, which compiles with Swift 5.3:

protocol Q { func whatAmI() }

extension Int: Q {
  func whatAmI() { print("Int") }
}

extension Array: Q {
  func whatAmI() { print("Array") }
  func whatAmI() where Element: Q { print("Array<P>") }
}

func whatIsIt<C: Q>(_ value: C) { value.whatAmI() }

func test() {
  let x = [1, 2, 3]
  x.whatAmI() // Array<P>
  whatIsIt(x) // Array
}
test()

The result is the same, and FWIW it's still what I'd expect from that code, for the same reasons as before:

(P being Q in this modified example.)

I can see the point, and I remember having to go through a phase of "relearning" (because I was so used with C++ templates). The thing is that in the context of whatIsIt<C: Q>(…), C cannot get more specialized than that; the compiler's knowledge stops at C: Q, and Q says nothing about C having an Element.

mattrips · June 6, 2020, 5:55am

This is so funny. My reaction of "Yuck" was because I would want to see a result of either P or Array<P>, but definitely not the middle ground of Array.

When you say, "it does so without knowing anything about eg Element or Array, it cannot get any information from any if its call sites, it's just its signature that matters," I don't understand what you mean. If that were so, how does whatIsIt know enough to call Array's implementation of whatAmI, but not enough to call Array<P>'s implementation of whatAmI?

All of this is getting a bit deep in the weeds. That may be my fault. I'd like to steer back towards the central point of the pitch:

dabrahams · June 6, 2020, 6:51am

Then I suppose we could at least do what I've proposed for constrained generics, e.g. by tucking the information into (a copy of) the first witness table. If there's truly no way to extend the information passed to unconstrained generics, it would mean…

I hope this is irrelevant because of what I've learned below about distinct types

making the semantics of dynamic downcasting a value of generic parameter type to a protocol existentials inconsistent with those of generic dispatching in some cases. We'd have the option to limit the inconsistency to cases where the generic parameter is unconstrained. Happy to elaborate more if this isn't obvious to you. I'm still hoping there's a way out of this jam, but I don't think it's fatal to the idea.

I must admit I'm a bit confused as to why several people have posted in this thread about addressing that bug via compiler changes, since it has a fix in the library. If it's because you think the existing standard library code should work as written, then that's news(!) to me and we're into new territory in two dimensions:

We'd have an opportunity to describe the rules that make that code work
We'd be changing the semantics of existing code

Others will disagree I know, but I'm not at all opposed to #2. In fact think it is necessary in some areas of the generics system (whether for SR-12881 remains to be seen). I'm an enthusiastic supporter of #1, which, it seems to me, is obviously a precondition for #2. Also before we move on to #2 I'd like to talk about how the rules of #1 line up with the solutions to other problems in the generics system that seem closely related to me. In particular:

Problems with algorithm specialization and conditional conformances.
Problems with conformances in multiple modules (edit: good news—I see this one might almost be considered fixed if someone would just document how it's supposed to work!)
The inability of anyone other than the owner of a protocol to extend the set of specializable algorithms that operate on models of the protocol.

I believe that it's possible to resolve all three of those within the implementation model I'm pitching here, which is part of why I pitched it. If you'd rather talk about it in terms of programming models, that would be great. As I noted at the beginning of the thread I've had some trouble making headway with that tack, but I'm happy to try again.

Thanks for asking, but I'm a bit bewildered by the question. I filed SR-12881 to report that library components aren't working properly. As I'm sure you know, I've filed plenty of bugs about the generics system recently, but that isn't one of 'em. Mostly though, I don't have any grasp on how what you've described generalizes or what rules would govern the semantic change going into effect. Obviously, I'm not interested in a compiler change that just affects that one example, and I'm pretty sure that's not what you mean to propose, but please can you describe the programming model or the implementation model in a way that tells me more than just how you expect the compiler to deal with one or two pieces of code?

I'm aware.

Let's make this concrete:
// Module A
public struct X { }

// Module B
import A
public extension X: Hashable  { ... }

public func getBSet() -> Set<X> { ... }

// Module C
import A
public extension X: Hashable  { ... }

public func getCSet() -> Set<X> { ... }
The types Set<X> produced by modules B and C are distinct, because they use different conformances of X: Hashable . The Swift runtime will keep them distinct, with different type metadata pointers (each of which includes the appropriate witness table).

You mean, it already does that? That's wonderful! I suppose one can square this with Jordan's assertion that type metadata is unique for a given type because they're considered different types, but that's actually just what I'd like to see happen with scoped conformances and basically would allow the conformance table I've pitched in my implementation model to be stashed in the value witness table.

I hate to ask, but is any of this (and the implications for program semantics) written down somewhere?

The Swift compiler has known problems with this if you import both B and C into a fourth module, e.g.,
// Module D
import A
import B
import C

func test() {
  let setX1 = getBSet()
  let setX2 = getCSet()
  if setX1 == setX2 { ... }  // SHOULD be an compile-time error, probably won't be, might crash in runtime
}
PR #31895 is a step toward fixing this, but this is important: having multiple retroactive conformances affects type identity.

I'm totally fine with that (not the runtime crash, but that multiple overlapping conformances affects type identity).

@jrose makes an excellent point about as? being tricky, which is what I brought up the above:

jrose:

But I really think the "irrational behavior" is on the part of as? , not conformances, and I don't think that all generic functions should pay a price so that as? gets more sensible behavior.

Let's say I have code like this:
func test(any: Any) {
  if let setOfX = any as? Set<X> { ... }
}
And somehow I end up feeding the results of getBSet() and getCSet() into it. What do we expect to happen? Does your answer change depending on which module defines the test(any:) function?

Great question! This sounds tricky at first but the more I think it through, the more obvious the answer becomes: the cast fails (produces nil) unless the type metadata for Set<X> at the point of the cast matches the type metadata pointer for any, i.e. unless the X used to form any conforms to Hashable the same way that an X instance would if it were created locally. Simply put, if the type of any and the local meaning of Set<X> are distinct, the cast fails.

Another example:
func testHashable(any: Any) {
  if let h = any as? Hashable { /* assume Hashable doesn't have a Self requirement */ }
}
Now arrange for an instance of X to get passed in here. What do we expect to happen? Does your answer change depending on which module defines the testHashable(any:) function?

This one is easy; I expect the semantics laid out in the scoped conformances pitch. The cast succeeds iff a conformance of X to Hashable was visible in the scope where X was first bound to a generic parameter or the X instance was converted to an existential. The Hashable existential instance behaves according to that conformance.

Yes please! I just want to start by getting to a common language for describing the semantics we want. Having previously failed to have the discussion from an abstract programming model POV, I pitched an implementation model here that

I could easily understand
Seemed consistent with your stated intent
Was flexible enough to express the semantics I want, if deployed in one particular way
Seemed implementable (and apparently it is?) so would not immediately draw fire on that score (#FAIL!)

No surprise; I didn't lay out any semantics in this pitch, because every time I try to describe the semantics I want—and I promise I'm not pointing any fingers here—it feels to me like someone more familiar than me with the current implementation either:

declares the semantics unimplementable and out of the question, or
proceeds from the current implementation as a given in a way that closes off discussion of possible semantics.

By pitching a possible implementation model, I was trying to set up a basis for the discussion that wouldn't immediately lead to a dead end.

Sorry I couldn't answer the first question; I just don't have the context I need to do so yet. I think I answered the other questions pretty solidly, though. But my actual goal here was to lay the groundwork for any kind of productive discussion of semantics at all. In the thread you referenced, I asked you many direct questions to try to get a handle on your design intent. When the reply tersely declined to address most of them I decided that maybe I needed to try a different angle. I'm very glad that we seem to be engaging productively on this topic now!

Amen to that, brother! I've just been trying to unlock the semantic decisions from the brain of the person I presume to have made them. And small examples are definitely necessary, but as a means of illustrating semantics that can be described more generally.

mattrips · June 6, 2020, 7:41am

Looking at the overall protocol system, the behavior of non-required methods declared in protocol extensions is the source of much confusion about which available version of a method will be called in a given situation.

For instance, see Paul Hudson's discussion of this aspect of the protocols system. After noting the surprise that it causes, he explains:

By declaring a [required] method in the protocol you’re allowing conforming types to override it; if you don’t [i.e., if you declare the method only in an extension of the protocol], they can’t [override that method] unless they have a specific type.

The bracketed portions are my additions to provide context.

Unless I'm mistaken, the intention is: a protocol's non-required method is not overridden by a conforming type when that type is used as an existential. That behavior is a feature. Hudson explains that the purpose of the feature is to provide uniform, non-overrideable behavior for types that conform to a protocol.

However, that non-overriding feature exists only when the conforming type is treated as an existential type. If the conforming type is used as itself, it is able to override the method otherwise supplied by the protocol extension.

All of this contributes towards what is a pretty complicated mental model as to which version of a given method will be called in a given context.

Unfortunately, the conformance behavior of non-required methods declared in protocol extensions is not formally documented (AFAIK). In fact, there isn't even a formal name for these non-required protocol methods (AFAIK).

Also, please see this proposal regarding the user having more control over how this aspect of protocol conformance is handled.

There are a variety of other macro aspects of the system that have similar impacts on the mental model, such as the interplay between the class-subclass hierarchy and protocol conformance. Perhaps someone else could offer a discussion of that aspect of the matter.

Jens · June 6, 2020, 8:09am

I agree that the system is complicated. It took me a lot of time, questions and experimentation to build a (mostly) working mental model, and I often see people on these forums struggle with the same issues as I once had. I also agree that the system (its intended behavior) really needs to be documented and explainable in a way that is accessible, terse and complete.

anandabits · June 6, 2020, 2:00pm

This is close, but not correct. Generic code constrained in terms of the protocol will also see the non-overridable protocol extension. Existentials are not involved in this in any way.

dabrahams · June 6, 2020, 4:45pm

Think hard: are you sure there aren't any other such tiny details you'd like to ignore? Although a mental model that fails to account for all those things is not of much use to me, you do mention a few things that I think I can usefully engage with:

The mindset that normal overload resolution can be used to decide protocol requirement satisfaction leads inevitably to many of the problems we have today. It leaves us in a terrible bind:

We don't want to do it at runtime based on full dynamic type information, because
- It would mean encoding way too much information into binaries that is currently known only during compilation.
- It would consider way too many candidates for each requirement, some of whom might not have been intended to satisfy requirements.
- It therefore produces hard-to-control/reason-about nonuniformity in how a given piece of generic code works
- It admits ambiguity at runtime into the system (though I think that door is somewhat open already for other reasons¹) that, using overloading as our model, leaves us in the unenviable position of trying to express a compilation error at runtime
But if we do it at compile-time, we end up with
- Inconsistency between what a protocol requirement does when the static type is known and when it is not that at least appears to be different from the inconsistency you'd expect due to simple static overload resolution.
- The inability to express basic ideas from generic programming

IMO we need rules for requirement satisfaction that

Don't depend on knowing everything about how a particular syntactic construct would statically dispatch based on the fully-elaborated static types.
Can be efficiently encoded into binaries (unless we find some way to ensure most generic code is monomorphizable).
May have a lot in common with overload resolution rules, but stand on their own
Come with a strategy for runtime resolution/prevention of ambiguity, because AFAICS it is otherwise inevitable.

It would probably help a lot to have a way to be explicit about which definitions can satisfy which protocol requirements (what I think you mean by “link[ing]… operations with their names”), so we don't have to think in terms of open-ended overload resolution.

Saying it's the same as the reason for something else does little to actually answer the question, and I think your example gives those of us struggling with how things work too little credit for understanding the basic mechanics of protocols. In that case, there are no protocol requirements involved whatsoever; it's pure overloading, so it's not hard to see why the code acts as it does.

Well, that's not my mental model. C++ does generic dispatch via unrestricted static overload resolution in the presence of all the knowledge the compiler has including the fully elaborated static types. Overload resolution in Swift is still less complicated than in C++ (at least I hope so; hard to tell for sure because Swift's rules aren't documented!), so I don't think that really explains why we shrink away. The reasons we don't do generic dispatch via overload resolution in Swift are:

From an implementation POV, unlike for C++, the compilation process has ended when Swift would need to do it, and we can't reasonably encode all that data in the binary.
From a usability POV, as you say, it's too ad-hoc, and thus hard to reason about. This is part of what makes C++ hard to use.

dabrahams · June 6, 2020, 6:11pm

I'm not sure it's really sound (enough) to treat differently-conforming Xs as the same type, because you can't assume the importance of the conformance has been captured in any enclosing generic type's generic parameter constraints. I'll try to make an example:

// Module A
extension Array where Element: Equatable {
   /// Returns true iff `self` contains any duplicate elements.
   /// - Complexity: O(n^2) where n == `count`
   func containsDuplicates() -> Bool { ... }
}

struct X {}

// Module B
import A
extension X: Equatable { 
  static func == (a: X, b: X) -> Bool {…}
}
/// Some Xs with no duplicate values, per equality as defined here.
public let uniqueXs: [X] = Array(Set( … ))

// Module C
import A
extension X: Equatable { 
  static func == (a: X, b: X) -> Bool {…} // Different equality
}

extension Array where Element: Equatable {
   /// - Requires: `!self.containsDuplicates()`
   func doSomething() { precondition(!self.containsDuplicates()) }
}

// Module D
import A
import B
import C

print(B.uniqueXs.containsDuplicates()) // true?
B.uniqueXs.doSomething()               // crash?

mattrips · June 6, 2020, 7:36pm

@Douglas_Gregor, it seems to me that:

(1) It should be an error for a type, in any given visible scope, to conform twice to the same protocol requirement. It already is an error if it occurs twice in the same module, regardless of whether the type actually is used. In the context of two imported modules with colliding conformances for a given type, the error would not occur until the type is used in the current scope.

(2) A user may avoid that error through disambiguation (e.g., prefixing with a module name).

(3) When a user resorts to (2), the disambiguation effectively (practically speaking) results in two different types being present in the visible scope (and the program).

Douglas_Gregor · June 6, 2020, 7:40pm

No, we cannot know about the duplicate conformances in the general case until we query for a specific conformance. And I’m the implementation Today, a type doesn’t carry its own conformances along with it; they are in a separate place where you can look up (eg) the witness table that makes X conform to Hashable.

Douglas_Gregor · June 6, 2020, 7:44pm

I agree with (1)-(3). It’s mostly what we have today, modulo copious bugs.

Today you can “disambiguate” for (2) only crudely, by hiding the duplication from the compiler (eg, in the implementation details of two modules that don’t import each other).

One still has to decide what “as?” does when duplicates exist and are only discovered at runtime.