An Implementation Model for Rational Protocol Conformance Behavior

Many recent threads and bug reports have focused on the intended semantic model for protocol conformances in Swift, among them:

When Swift implementers are faced with questions about the intended programming model for generics, they commonly respond with a focus on how generics are implemented, which, as the latter discussion shows, is not always compatible with the intended semantics. Furthermore, it makes a poor foundation for user (and especially my) understanding.

That said, having failed to clearly nail down the intended semantics independent of implementation concerns, I'd like to take a different tack here, by proposing a backward-compatible implementation model that supports the intended semantics as I understand them.

But what am I pitching, exactly? It's not necessarily that we must adopt my implementation technique for Swift, though I think it might be a big improvement. The most important thing is to form an agreement about whether the semantics implied by the technique represent the intention of the language. If so, we can then discuss whether there are other ways to achieve the desired result.

According to @Douglas_Gregor, the intention is that, at the moment a concrete type is bound to a generic parameter, we can determine all of the type's conformances to any protocols based entirely on declarations visible in the scope where the binding occurs. The idea is to capture all those conformances in a conformance table that can be used to quickly look up the witness table for any protocol modeled by the type. Protocol requirements and conformances (for existential downcasts) are then looked up by finding the relevant witness table in the conformance table. A pointer to the conformance table for any instance will be stored in a new value witness table entry.

This scheme is naturally compatible with scoped conformances and conflicting conformances in different modules: binding a type to a generic parameter (or creating an existential) in scopes with different conformances visible for that type simply results in distinct conformance tables.

12 Likes

I hate to be the downer twice, but it's simply not feasible to capture all of a type's visible conformances at the point of calling a generic function. Look at Array: it already conforms to eighteen protocols with just the frameworks in Apple's SDKs. Having the compiler to build a table with dozens of additional members that may or may not be used is a waste of binary size (the conformance table for every concrete type passed to a generic function) and launch time (resolving references to conformances in other dylibs), and I suspect there'd be a run-time cost too but I can't pin it down at the moment.

More importantly, however, while we can certainly add things to the value witness table if it's valuable to do so, the value witness table is attached to the type metadata, and type metadata is unique. You cannot have an Array<Int> from one part of the program with different type metadata than an Array<Int> from another part of the program, and only the type metadata and requested conformances are passed to a generic function. Changing this behavior (say, by passing a conformance table with every generic parameter list) is in theory possible but would be an ABI break.


I think there's just a general problem here regarding mixing dynamic casts with retroactive conformances. If you don't have retroactive conformances, there are only ever two places to search for conformances: "next to" the type, and "next to" the protocol. However, Objective-C has unconstrained retroactive conformances, and so Swift got them too, even though Swift's conformance model is different than Objective-C's.* And there are plenty of uses for retroactive conformances (which were pointed out to me when I tried and failed to restrict them). But I remain convinced that the answer is going to be changing the behavior of dynamic casts rather than the behavior of conformances.

I'm being all gloomy without proposing alternatives, but I don't have any great ones. Explicitly listing the "optional conformances" that would be checked dynamically is really no different than overloading, although it's a little more succint. Another alternative would be to change casting behavior so that it builds a conformance table at the location of the cast, and only checks those conformances (in addition to the ones in the type's module and the protocol's module). But I don't think that has any of the behavior you want.

This is just a hard problem. as? exists as a global search for conformances, and changing that will almost certainly break existing apps even if we wish it wouldn't. And we didn't even touch existentials in this discussion, which really can't take a context-specific conformance table either (especially for classes going through an AnyObject representation). But I really think the "irrational behavior" is on the part of as?, not conformances, and I don't think that all generic functions should pay a price so that as? gets more sensible behavior.

* In Objective-C, a conformance is "just" a promise to implement certain methods, which are looked up by selector like all other Objective-C methods. So retroactive conformances are no more dangerous a feature than adding your own methods to an existing class, though I suppose you're more likely to collide with someone else's method, in which case "last one loaded wins".

6 Likes

I understand the central point of your post to be an exploration of whether that statement of intention is the true intent.

In furtherance of that exploration, I'd like to use the example of SR-12881:

extension Collection {
  func generic_distance(from i: Index, to j: Index) -> Int {
    distance(from: i, to: j)
  }
}

extension RandomAccessCollection {
  var isRandomAccess: Bool { true }
}

let r: DefaultIndices<ClosedRange<Int>> = (0...10).indices
assert(r.isRandomAccess)
print(r.distance(from: r.endIndex, to: r.startIndex))
print(r.generic_distance(from: r.endIndex, to: r.startIndex)) // trap

Here, the indices instance property of Array<Int> vends a concrete instance of DefaultIndices<Element: Collection> with its type parameter, Element, bound to ClosedRange<Int>. If I understand your statement of intent correctly, at the moment of instantiation of the concrete instance of DefaultIndices, we should determine all of the protocol conformances of the instance and should do so based on whatever is visible within the current scope. Correct?

Because Int conforms to Strideable and SignedInteger, ClosedRange<Int> conforms to RandomAccessCollection. And, because ClosedRange<Int> conforms to RandomAccessCollection and is the Element type for this instance of DefaultIndices, this instance of DefaultIndices conforms to RandomAccessCollection. Also, because RandomAccessCollection itself conforms to BidirectionalCollection and, in turn, that protocol conforms to Collection, this instance of DefaultIndices also conforms to BidirectionalCollection and Collection.

Does your statement of intention stop with determining the protocols to which the given type conforms? Or does your statement of intention continue forward to the step of determining which implementation of a protocol's properties and methods would be dispatched for the type? I assume you intend the latter, but I may be mistaken.

If I understand your statement of intent correctly, at the moment of instantiation of the concrete instance of DefaultIndices, we would determine which implementations of the various applicable protocol required/provided properties and methods would be used by the type, based on what is visible from the current scope. Correct?

Here, the example focuses on the distance(from:to:) instance method. Each of the Collection, BidirectionalCollection and RandomAccessCollection protocols provides its own default implementation of this method. Based on the rule of specialization, the implementation provided by RandomAccessCollection would be selected. So, when distance(from: r.endIndex, to: r.startIndex) is called on this instance of DefaultIndices, the RandomAccessCollection implementation of that method is called.

Where the example turns to the generic_distance(from:to:) method, things get interesting. In the visible scope, that method is implemented only by the Collection protocol. Since this instance of DefaultIndices conforms to Collection, if the method is called on this type, Collection's implementation of the method will be used.

When Collection's implementation of generic_distance(from:to:) is called via this type, the method in turn calls to the distance(from:to:) method. One might think that that internal call would/should call to RandomAccessCollection's implementation of the distance(from:to:) method. But, it does not do so. Instead, it calls to Collection's implementation of the distance(from:to:) method. Leaving aside "why" that behavior happens, I would characterize the behavior as surprising, and not documented. I'm not sure whether the behavior is consistent or not with your statement of intent.

The "fix" offered for SR-12881 proposes a workaround of sorts that will provide the expected behavior. That workaround involves providing DefaultIndices with its own implementation of the distance(from:to:) method as part of its conformance to Collection. Again, I'm not sure whether that behavior is consistent or not with your statement of intent. Again, let's leave aside the "why" of how that workaround achieves the expected behavior.

Rather than enumerate all the reasons I think you may be mistaken about feasibility—which I'll get to eventually—I'll just say that you've missed the real point of my post, which @mattrips seems to appreciate. The point is to understand the intended semantics of the system in terms of an implementable (and not necessarily “feasibly implementable”) model. The one putative obstacle to implementability you mention has to do with the uniqueness of type metadata, which I was aware of already. It's not a problem, IIUC: the value witness table that's actually passed to the generic function or encoded in the existential doesn't need to be identical to the one referred to in the type metadata.

I get the point: we’d like to come up with desired semantics without pinning down an actual implementation first. I may have lost it in the middle of everything, but the point was that these semantics are not implementable without breaking ABI, and even then there would be significant costs. Therefore these are not my desired semantics for Swift; though they may be appropriate for another language, Swift’s generics should not have this kind of overhead associated with them.

To clarify this point, a generic function is not passed a type’s value witness table; it is passed its metadata.

EDIT: Even if it were completely free I’m not sure it’s a good model, because it increases the number of circumstances where code does something different if you move it somewhere else. But I think that it’s valid to say “this is not a good model for Swift specifically” even without talking about whether it’s a good model in the abstract; those two can be true simultaneously.

Okay, that was not the impression I got from the implementer I was consulting with, but surely there's someplace we can put a conformance table. Can we not store a pointer to it wherever @Douglas_Gregor was talking about putting his decision procedure ("in some runtime hooks we have during witness table realization")?

It seems to me that if there is absolutely no ABI place to tuck additional information into generic function call, there is no hope for a sane resolution of conflicting conformances in multiple modules without breaking the ABI.

Totally agree that that is a key part of the question.

Would very much like to learn your thoughts on what would be a good semantic model for Swift.

Ah, I think somewhere along the way you've gotten the witness table (the run-time structure that represents a conformance) mixed up with the value witness table (the run-time structure that describes basic operations on a type). (No, I don't like that they're so similar in name either.) For each generic parameter, a generic function (or type) is given both the generic argument's type metadata and the witness table (conformance) for each protocol constraint on the parameter. The type metadata is globally unique and the conformance is specific to the protocol, so neither of them are a great place to stash additional information about the call site.

2 Likes

The presence of multiple modules in a scope or the existence of invisible and/or hard-to-see factors (e.g., see the fix for SR-12881) already seem to affect method dispatching in hard-to-predict / hard-to-describe ways. Doesn't changing the scope within which code exists already (potentially) change how it functions?

Apologies for the serial response...

I admit I feel burned by this due to my past efforts on retroactive conformances, which is why most of my ideas nowadays tend towards just declaring outright that dynamic casts won't find any new types of conformance that we want to introduce. I do think there's room to improve around protocol hierarchies, but that since people don't usually conform to different layers of protocol hierarchies in different modules that it'd be simpler to come up with some kind of rule around that, whether it's formalizing patterns like withContiguousStorageIfAvailable or only searching in certain modules (based on already available information like a known conformance location, not call site) instead of doing a global search.

Yeah, it's true, we already do have that "problem" ("feature"?). I feel like collecting all visible conformances from a scope would magnify it because they're never explicitly named, whereas today with overload resolution / protocol witness checking you know which names are being looked up. But maybe it only feels that way to me as a compiler engineer, and a normal user of Swift wouldn't notice or care about the difference.

No, I wasn't confused about anything other than how the VWT is reached from inside a generic function. The VWT I know to be extensible and also present even for unconstrained generic parameters, so it seemed a good central place to stash additional information. But if it's only accessible through the type metadata, that's not going to work and there has to be another place to put the information.

1 Like

I'm talking about the witness table itself, i.e., the mapping the protocol requirements to specific witnesses that implement those requirements. At the point where (say) Array is defined to conform to Collection, we can record additional information about what other potential witnesses existed (but didn't apply) and choose among them.

I believe there is enough leeway there to address SR-12881 in a reasonable manner, if we capture---at the point of DefaultIndices conformance to Collection---the potential set of distance(from:to) witnesses for that conformance (one in the extension of Collection, another in the extension of BidirectionalCollection, a third in RandomAccessCollection) and delay the decision of which witness to use until we have a specific concrete generic argument type (e.g., DefaultIndices<ClosedRange<Int>>. That can probably be made to line up with what overload resolution would do for a call to DefaultIndices<ClosedRange<Int>>(...).distance(from:to:) when we have fully concrete types at type-checking time.

Are those the semantics you want?

Note that, due to retroactive conformances, there may be more than one witness table for a given conformance of a type X to a protocol P. Those witness tables may have different witnesses (of course), and both may be used at run time. This can result in having two types that would be spelled the same way in source code actually be distinct because they use different conformances. Let's make this concrete:

// Module A
public struct X { }

// Module B
import A
public extension X: Hashable  { ... }

public func getBSet() -> Set<X> { ... }

// Module C
import A
public extension X: Hashable  { ... }

public func getCSet() -> Set<X> { ... }

The types Set<X> produced by modules B and C are distinct, because they use different conformances of X: Hashable. The Swift runtime will keep them distinct, with different type metadata pointers (each of which includes the appropriate witness table).

The Swift compiler has known problems with this if you import both B and C into a fourth module, e.g.,

// Module D
import A
import B
import C

func test() {
  let setX1 = getBSet()
  let setX2 = getCSet()
  if setX1 == setX2 { ... }  // SHOULD be an compile-time error, probably won't be, might crash in runtime
}

PR #31895 is a step toward fixing this, but this is important: having multiple retroactive conformances affects type identity.

@jrose makes an excellent point about as? being tricky, which is what I brought up the above:

Let's say I have code like this:

func test(any: Any) {
  if let setOfX = any as? Set<X> { ... }
}

And somehow I end up feeding the results of getBSet() and getCSet() into it. What do we expect to happen? Does your answer change depending on which module defines the test(any:) function?

Another example:

func testHashable(any: Any) {
  if let h = any as? Hashable { /* assume Hashable doesn't have a Self requirement */ }
}

Now arrange for an instance of X to get passed in here. What do we expect to happen? Does your answer change depending on which module defines the testHashable(any:) function?

As those discussions and this one have shown, implementation concerns weigh heavily here, because a lot of what people abstractly want are not implementable in an efficient and/or backwards-compatible manner. That doesn't mean we should drive the discussion from the implementation, though. We have to know what actually semantics we want, and then the push-pull of design and implementability can kick in.

I'm unable to map from your statement of intention and implementation sketch to the semantics you want. Answering my bolded questions above would help me understand the goal, then I can tell you what is achievable w.r.t. implementation. You answered some similar questions in Ergonomics: generic types conforming "in more than one way" - #72 by dabrahams, but we're in a different thread and these kinds of semantic decisions haven't been recorded/explained anywhere. They really need to be, because I don't think anyone can understand the intended semantics without these kinds of small examples to reason through.

Doug

13 Likes

You're still seriously missing the point here. This implementation model doesn't necessarily imply any semantic change, except maybe in cases that are currently completely unspecified and broken today. What gets put in the conformance table can easily be the exact witness tables that are used for generic programs today.

What the implementation model does is:

  • makes it possible to resolve some problems that most of us agree are wrong and/or undesired without running into considerations that are purely based on what can be done within today's implementation.
  • provides a workable mental model for users to think about generic dispatching, which I claim we currently do not have.

For reference purposes and additional food for thought, readers of this thread might also revisit the 2018-era discussion at Retroactive Conformances v. Swift-in-the-OS.

1 Like

Thanks, @mattrips. I probably should have called it out more explicitly. If you're not worn out from reading, Backwards-deployable Conformances also talks about how to tackles some of the problems in Retroactive Conformances vs. Swift-in-the-OS, but focusing on libraries shipped with Apple OSs. It's less relevant to the current discussion, but I don't want to leave the impression that there's been no progress on the issues I/we raised originally.

1 Like

[Meta: I've been posting a lot and been (rightfully) called out on my negativity in private, so after this one I'm going to hold off on further replies till the weekend at least.]

This is a good point. Here's how I would summarize (what as I see as) the intent of our current model, ignoring dynamic casts, associated type inference, and the possibility of two conflicting visible conformances (one of Doug's examples):

  1. When a conformance MyType: MyProtocol is declared, the compiler goes through each requirement and asks what function would be called, given what is known at that point in the program:

    // Illustrative purposes only;
    // the compiler does not actually synthesize a function to do this.
    extension MyType {
      func __invoke_doSomething(with argument: Int) {
        self.doSomething(with: argument)
      }
    }
    

    If there are multiple overloads of doSomething(with:), the compiler picks the best one visible, just like it always does. The meaning of "best" is chosen at this point in time, which means that if one of the overloads is in a constrained extension based on a generic parameter of MyType, it won't get picked, because one implementation is picked to work with any MyType, whether it's MyType<Int> or MyType<[MyType<String?>]>. (It can take into account any constraints that are on the actual definition of MyType—perhaps that its argument has to be Equatable—as well as any constraints on the extension adding the conformance, if it's a conditional conformance.)

    All of this is the same (idea) as overload resolution, which does not do dynamic dispatch in Swift. You'll often find me reciting "there are three ways to do type-based dispatch in Swift: invoking a protocol requirement, calling an overridable method on a class, or actually checking the type with a dynamic cast".

  2. When a MyType value is passed to a generic function with a MyProtocol constraint or converted to an existential value, the compiler finds "the" conformance of MyType to MyProtocol, and passes that to the function. (The run-time representation of the conformance is the witness table, which contains all the "witnesses" to the requirements chosen in (1), as well as the associated types, and the conformances for any constraints on the associated types.) Note that this process doesn't look at the generic arguments of MyType except to populate the "associated types" section.

  3. Later on, when a protocol requirement is invoked (through a generic argument or an existential value), the "witness" found in the table in (2) is called.

That's the model as I understand it. You and Doug have already called out a weak point in (2)—"what should happen if two conformances are visible?"—compounded (in SR-12513) by leaky import visibility that lets the compiler find extension members and see conformances from an import that's in another file in the same module. And there's also the weak point in (1)—"requirements are resolved at the point where the conformance is declared"—which caused the problem with DefaultIndices in SR-12881.

(1) and (2) are also the places Doug called out as places to run a "decision procedure"—either the witness chosen in (1) would be a compiler-synthesized function that picked the best overload when invoked, or the fetching of the witness table in (2) could pick the best overload when created.


But your original point stands. Why is it hard for users to think about generic dispatching? I'd guess that it's the same reason many users have trouble with this code:

extension Collection {
  func whatAmI() { print("Collection") }
}
extension Array {
  func whatAmI() { print("Array") }
}
func whatIsIt<C: Collection>(_ value: C) {
  value.whatAmI()
}

func test() {
  let x = [1, 2, 3]
  x.whatAmI() // Array
  whatIsIt(x) // Collection
}

For better or worse, most people's model of what generic code ought to do seems to match up with what C++ actually does: it should pick the best option based on the value you pass. It's only us Swift implementers who can look at that and say "that's overload resolution at run time", and shrink away because we've designed a language where overload resolution is non-trivial.

Changing this particular case to "work" doesn't even require a conformance table; it "just" means checking that the concrete type is an Array. There are many reasons why even this concerns me from a performance and code size perspective. But even putting those aside, is that really the interpretation we want for the code above? For Swift, I don't think it is—it's too ad hoc. Nothing links the various whatAmI operations but their names.

(EDIT: But of course, that's all that links the choices in overload resolution too, so this may not mean much.)

5 Likes

I found your example compelling, but then thought about how it utilizes a non-required protocol method. Let's consider how using a required protocol method would change the result.

protocol P {
  func whatAmI()
}
extension P {
  func whatAmI() { print("P") }
}
extension Array: P {
  func whatAmI() { print("Array") }
}
func whatIsIt<C: P>(_ value: C) {
  value.whatAmI()
}

func test() {
  let x = [1, 2, 3]
  x.whatAmI() // Array
  whatIsIt(x) // Array
}

Our two examples, set side by side, reflect a fundamental difference in behavior. I suppose "many users have trouble" with your example due to this difference in behavior. But, I don't want to get this conversation hung up on subtleties of how things currently work.

I want to stay focussed on the question you pose about how we think it ought to work: "Is that really the interpretation we want?" I suppose my answer is, I could go either way on the interpretation. But, whatever the interpretation, I think it ought to apply the same way to both your example and my example.

1 Like

Having read a prior thread on the same subject, I kept thinking, and continue to think: Why isn't this a compile-time error?

Now, I understand your intent to be that this is and should be a compile-time error, but the compiler has a difficult time diagnosing it. If it is an error, do we need to consider it any further for purposes of talking about what the model should be?

I think this should be a compile-time error.

If there are two modules in scope that both supply a type called X, I believe it should be a compile-time error to use X without disambiguating. My answer doesn't change depending upon which module defines the function. If the reference to type X is ambiguous, it is an error.

In all of these cases, shouldn't the user be required to disambiguate the type-name collision at the point of use by prefixing the typename with the applicable module name?

1 Like

It depends on which "this" you're referring to:

  • It is not wrong for either B or C to declare a conformance of X to Hashable. Retroactive conformances are allowed in the language, but they're tricky. No compile-time error here.
  • A program may very well import both modules B and C, perhaps in separate modules that don't even know about each other. You can't really produce a compile-time error without seeing B and C in the same place, and you probably don't want to: B might be some hidden dependency, and bringing it into your program shouldn't cause your program to be wrong just by its existence.
  • The module D in the example should produce a compile-time error somewhere. The correct place, in my opinion, is on the use of ==: the == for Set only works when the two Set types are the same, and these Set<X> types are different because they involve different X: Hashable conformances.

For me, it depends on what modules have been imported. If we only have

import A

func test(any: Any) {
  if let setOfX = any as? Set<X> { ... }
}

then I agree that this must be a compile-time error, because X does not conform to Hashable.

However, I would say that

import A
import B

func test(any: Any) {
  if let setOfX = any as? Set<X> { ... }
}

is correct. Moreover, it should match the type produced by getBSet() but not by getCSet().

There's a similar case for the function if defined in a module that imports A and C.

However, I expect a compile-time error if both conformances of X to Hashable are visible:

import A
import B
import C

func test(any: Any) {
  if let setOfX = any as? Set<X> { ... }  // error: multiple conformances of X to Hashable
}

There is only a single type X here, which is defined in module A. If modules B and C are both included in your program, then at runtime we have to decide what any as? Hashable should do, given that (for example) the two X: Hashable conformances might have different hash values. There are a few options here:

  1. Crash the program, because we've hit an ambiguity and ambiguities are fatal at compile-time
  2. Arbitrarily select one of the conformances; optionally warn the user what we did
  3. Try to rank the conformances in some manner to give a well-defined answer, e.g., by using some information about the module in which the as? occurs
  4. Refuse to choose a specific conformance, and return nil instead so the program can continue

Doug

6 Likes

In your example, when Array conforms to P, the Array version of whatAmI() is the most specialized already, so you get the right answer. Add this to your example and it will better parallel @jrose's:

extension Array where Element: P {
  func whatAmI() { print("Array<P>") }
}
extension Int: P {
  func whatAmI() { print("Int") }
}

The output will be:

Array<P>
Array

That's the one that bothers me most. A protocol requirement is meant to be a point of dynamic dispatch, where you jump to a version of the code that's been customized for the concrete type... but we don't actually get all of the customization we want (or expect).

I'm less bothered by @jrose's example because I don't feel like a call to a function that is neither an overridable method in a class or a protocol requirement should allow different behavior with different types at runtime.

I appreciate that we're gathering a corpus of examples and opinions. One needs those to evaluate designs.

Doug

4 Likes