An Implementation Model for Rational Protocol Conformance Behavior

If I define an algorithm in an extension on Collection in module A, can I add a new specialization of that algorithm in module B?

We're at the point where talking in the abstract is not really helpful. I know what algorithm specialization is in the abstract, and Swift's overloading allows you to express algorithm specialization when working entirely with concrete types, but I don't know what you're thinking for expressing algorithm specialization in a manner that meets your expectations.

Doug

If I define an algorithm in an extension on Collection in module A, can I add a new specialization of that algorithm in module B?

Yes, being able to publicly vend a generic algorithm that can be composed with other generic algorithms and specialized for a specific model of the protocol it operates on, from the module where that model is defined, is important.
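Concretely, here's a rough sketch of the kind of thing I mean, in today's Swift (SortedArray and countOccurrences are just illustrative names):

// Module A: a generic algorithm, vended publicly.
extension Collection where Element: Equatable {
    public func countOccurrences(of element: Element) -> Int {
        reduce(into: 0) { if $1 == element { $0 += 1 } }
    }
}

// Module B: a model of Collection, plus a specialization of the algorithm.
public struct SortedArray<Element: Comparable>: Collection {
    private var storage: [Element]
    public init(_ elements: [Element]) { storage = elements.sorted() }
    public var startIndex: Int { storage.startIndex }
    public var endIndex: Int { storage.endIndex }
    public func index(after i: Int) -> Int { storage.index(after: i) }
    public subscript(position: Int) -> Element { storage[position] }

    // Specialization that exploits sortedness (illustrative, not tuned).
    public func countOccurrences(of element: Element) -> Int {
        storage.drop { $0 < element }.prefix { $0 == element }.count
    }
}

let values = SortedArray([1, 2, 2, 3])
print(values.countOccurrences(of: 2))  // 2, via the specialization (chosen statically)

func viaGeneric<C: Collection>(_ c: C, _ x: C.Element) -> Int where C.Element: Equatable {
    // Only the Collection extension is visible here, so this always calls
    // module A's implementation, even when C is SortedArray.
    c.countOccurrences(of: x)
}
print(viaGeneric(values, 2))           // 2, but via module A's generic algorithm

The specialization can be written today, but it participates only where the concrete type is statically visible at the call site; that's the gap I want closed.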

FWIW, this capability can be mapped directly onto support for retroactive protocol refinement, without raising any of the dynamic-casting burdens that the Manifesto cites as the only obstacle to that feature. In any case, I'm trying to say this capability is a long-term goal of mine that, conceivably, might not even look like protocol extensions: maybe it's some kind of multimethod. I don't think it affects the work we do in this thread much, except that I'll keep my eyes open for anything that definitively rules out solving the problem eventually.

We're at the point where talking in the abstract is not really helpful. I know what algorithm specialization is in the abstract, and Swift's overloading allows you to express algorithm specialization when working entirely with concrete types, but I don't know what you're thinking for expressing algorithm specialization in a manner that meets your expectations.

I was trying to define algorithm specialization for the context of this discussion about generic dispatching. But if that bugs you, forget about it. I believe my reply to you contains concrete, non-abstract answers to everything you've asked me, so if there are unanswered questions please re-ask them.

Hopefully Apple will announce their iClone project along with their new iPhones this fall.

Hey, @Douglas_Gregor!

I've prepared this gist to demonstrate something I keep asserting but I can't tell whether you agree with: given the lack of monomorphizability, some ambiguities in witness selection cannot possibly be detected until runtime if you want to change semantics in the way we've been discussing.
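Here's a minimal sketch (not the gist itself) of the shape of the problem in today's Swift:

protocol P {}
extension P { func describe() -> String { "plain P" } }
extension P where Self: Hashable { func describe() -> String { "Hashable refinement" } }
extension P where Self: Comparable { func describe() -> String { "Comparable refinement" } }

struct S: P, Hashable, Comparable {
    static func < (a: S, b: S) -> Bool { false }
}

func throughGeneric<T: P>(_ x: T) -> String {
    // Today this statically resolves to the unconstrained overload, because T's
    // other conformances aren't visible here. If witness selection instead
    // followed the dynamic type, S would satisfy both refinements, and that
    // ambiguity could only be diagnosed at runtime, once T = S is known.
    x.describe()
}

print(throughGeneric(S())) // "plain P" today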

So... what is the current consensus? Fixing this by the 6.0 release? :relaxed:
Because I've just had an idea: add a notion of context to the process of archetype construction (that is, the current module, a.k.a. the most recent refinement).

//module S
protocol A { var uo: Int { get } }
extension A { var uo: Int { 42 } }
struct Alpha: A {}
//this has archetype S.Alpha with witness from S.A

//module T
import S
extension Alpha: A { var uo: Int { 69 } }
//this has archetype T.Alpha with witness from T.Alpha
//Yes, a single redeclaration in a different module should be allowed

From here, we dance: the compiler builds the internal code from the external modules, which then becomes a separate realm (nothing in the public domain can touch it); then it selects the witnesses (redeclarations) in the current root (in the example above, module T); and after that it descends the import graph, picking the witnesses of the most recent refinement from the external modules for any missing conformances.
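To illustrate how that rule might play out, consider a third, hypothetical module that imports both (this is of course not current Swift behavior):

//module U
import S
import T

let a = Alpha()
a.uo
//no redeclaration in the current root (U), so the compiler descends the import
//graph and takes the most recent refinement it can reach: T's witness, so 69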
What do you think about that kind of monomorphizing solution?

Check out this, btw.

As you are probably well aware, a conformance is abstractly a function that takes a type and returns a set of witnesses. If we want to be able to make these witness choice decisions deterministically, then we need to somehow incorporate the relevant conditional conformances into the inputs to that conformance function. Today, in a conformance like your example struct X<T, U>: P, the only input to the conformance is the conforming type X<T, U>, which by itself carries no conformance information about T or U, so in the most general case, there's too little information to see the conditional conformances without doing global lookup (and thereby introducing nondeterminism due to the shared mutable nature of the global conformance set).

If we had a mechanism for making the relevant conditional conformances be direct inputs to the conformance, that could provide a way out of the problem. One way to do this might be to have a formal concept of optional generic constraints; these could work by telling the type system to grab a protocol conformance for the constraint if it has it, or pass null otherwise. That would let you declare your type as (stealing ?Protocol syntax from Rust as a strawman):

struct X<T: ?Equatable, U: ?CustomStringConvertible> { ... }

That would let us know that, any time we instantiate X, we capture the conditional conformances for the generic arguments when we have them. Instantiations would notionally lower like this:

let xis = X<Int, String>.self // swift_getGenericMetadata(X, Int, String, `Int: Equatable`, `String: CustomStringConvertible`)

struct Foo {}
struct Bar {}

let xfb = X<Foo, Bar>.self // swift_getGenericMetadata(X, Foo, Bar, nullptr, nullptr)

We would probably also want to raise errors or warnings for non-concrete generic arguments that don't forward the optional requirements, to ensure that the information gets plumbed through:

func lossyGeneric<A, B>() -> X<A, B> // error: instantiation of X<A, B> loses optional requirements on T = A and U = B
func losslessGeneric<T: ?Equatable, U: ?CustomStringConvertible>() -> X<T, U> // ok

With the optional requirements captured into the metadata, checks of whether T conforms to Equatable, or U conforms to CustomStringConvertible, can be answered using local data, and unlike as? Protocol checks today, these local checks can be made to behave deterministically and reliably specialized away when we monomorphize. That would also enable your example case of choosing a P.foo witness based on the conformances of T and U to be done deterministically.
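A strawman continuation of the same sketch, with made-up query syntax for consulting the captured conformance:

extension X: P {
    func foo() {
        // Hypothetical: `T is ?Equatable` asks whether the captured optional
        // T: Equatable conformance is non-null. It reads X's own metadata, so
        // it's deterministic, and it folds to a constant when X<Int, String>
        // is monomorphized.
        if T is ?Equatable {
            // Equatable-aware fast path
        } else {
            // conservative generic fallback
        }
    }
}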

OTOH, if we treat optional requirements the way we do requirements today, that would mean new optional requirements can't be added retroactively to types, and couldn't even be added by the API author without breaking ABI, which wouldn't be very expressive. I can however see us having more flexibility with optional constraints in protocols. A protocol could use optional constraints to capture conditional conformance information into its conforming types, which would be appropriate for capability towers like *Collection, where it is common to want to deterministically query the higher capabilities of a collection. If the protocol were declared:

protocol Collection: ?BidirectionalCollection, ?RandomAccessCollection /*etc.*/ { ... }

that would direct the compiler to capture the BidirectionalCollection and RandomAccessCollection conformances for a type when available, and make them available for deterministic capability checking on any type that conforms to Collection. The situation for extension, both proactively and retroactively, is a bit more promising too. Protocols can already add new requirements resiliently, so if the standard library expands the Collection hierarchy, it can also extend Collection with optional constraints to capture the extended hierarchy. The story for retroactive expansion is a bit sadder within the confines of the language today, but if we had the ability to extend protocols with new dispatched requirements, that could also conceivably be used to introduce new optional requirements, and give a third party library the ability to expand the Collection hierarchy as well.
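As a small strawman in the same spirit (the property name here is made up): with ?BidirectionalCollection captured by every Collection conformance, a capability check like this could be answered from the conformance's own captured data rather than through the global tables. The spelling below compiles today, but today it is the global-lookup version:

extension Collection {
    var supportsEfficientBackwardTraversal: Bool {
        self is any BidirectionalCollection
    }
}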


Yep.

If we want to be able to make these witness choice decisions deterministically, then we need to somehow incorporate the relevant conditional conformances into the inputs to that conformance function.

If you said, “if we want to make the witness choices based on conditional conformances of X…” I'd agree. I think determinism is totally desirable, but also orthogonal. Am I missing something?

Today, in a conformance like your example struct X<T, U>: P, the only input to the conformance is the conforming type X<T, U>, which by itself carries no conformance information about T or U, so in the most general case, there's too little information to see the conditional conformances

Agreed so far.

without doing global lookup

When you say “without doing global lookup” I think I know what you mean: that lookup can't proceed from any data structure that is currently passed as a parameter into the context where the decision must be made at runtime. Is that right?

One of the questions I keep asking in this thread is whether there is any way of expanding the information passed into these contexts, or whether we have in fact frozen that limitation into the ABI. It seems to me that, because conformances can already differ across module contexts, the witness table for T: P (for any type T and protocol P) can't be assumed to be unique across the whole program, and, so long as there is any constraint at all on the generic, witness tables themselves could serve as such a vehicle. Why couldn't that work?

(and thereby introducing nondeterminism due to the shared mutable nature of the global conformance set).

This part seems a little too significant to pass off in a parenthesized aside. How do you imagine this becomes nondeterministic? Are you thinking of conformances coming in from dynamically loaded shared libraries?

I've got a few problems with this approach. To begin with, I find the notion of an “optional constraint” as absurd and meaningless as that of an “optional requirement,” which you'll remember we had to cope with for interoperability reasons, but relegated to Objective-C protocols to keep them from weakening the language. These things don't appear to constrain anything. You could say, “OK, we'll pick a better name,” but I think the poor name comes from the fact that the meaning of the feature is hard to describe and understand.

Next, the approach seems like it's entirely directed at mechanics of the language implementation that are—and should remain—below the level of the programming model. To understand what this does, and why it might be needed, one has to think about conditional conformances of generic types, how witness tables are generated, etc.

Most importantly, it seems to make the programming model much worse. The situation today is that simple, valid code appears to have a straightforward meaning but doesn't. Adding the ability to complicate code with these ?-“constraints” doesn't fix that: the same simple code would have the same surprising and hard-to-describe meaning. Your proposal adds another wrinkle to all the things the authors and consumers of these generics need to consider.

The feature doesn't lead me to any obvious understanding of how to use the language to express the classic ideas of generic programming. I would certainly find ways to get the system to do what I wanted, and would probably even be able to identify some idioms and programming techniques, but I can almost guarantee that you'd loathe the results even more than you abhor the hackery of C++ type-based metaprogramming :wink:.

Lastly, as I've remarked in another thread that I can't find at the moment (but where IIRC you didn't challenge my assertion), I don't think advance declaration of which conformances may be relevant to witness selection should be necessary: the potential relevance of a conformance can be determined from visible declarations, just as overload sets are resolved based on which declarations are visible.

The problem with having to do this where Collection is declared is that one doesn't always know the shape of these “capability towers” at the point where their roots are discovered. It's very similar to the reason people want to be able to retroactively add requirements to a protocol: the set of interesting customizable operations is not known at the point where the protocol is declared.


I think we can semantically separate function availability into two posets: scopes and constraints. Conformances and functions are each associated with a scope and a constraint, and the scope-constraint pair of a function call determines which implementation to use.


Let's use 'scope to denote a scope. The scopes of protocols, conformances, and functions are tied to where they are declared. Say we have:

// Module Base
public protocol Foo {
  func bar()
}

extension Int: Foo { // Int:Foo'Base
  func bar() { ... } // Int.bar'Base
}

// Module A
import Base // 'Base < 'A

In this example, the Int: Foo conformance and Int.bar both have their scope set to 'Base. The set of all scopes then admits a partial ordering. I'm not defining the ordering here, but the import relation is a good candidate. In this example, 'Base < 'A.

We could separate 'A into 'internalA and 'publicA. For the sake of simplicity, I'll omit that for now.


Let's use '(constraint) to denote a constraint. Constraints easily form a poset under the "more specific than" relation. For example:

           '(T == Int)
              ^    ^
             /      \
'(T: Comparable)   '(T: Hashable)
            ^        ^
             \      /
         '(T: Equatable)

So '(T: Equatable) < '(T: Comparable), etc.
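Swift's overload ranking already follows this ordering whenever everything is statically visible; a minimal sketch (hypothetical names):

func pick<T: Equatable>(_: T) -> String { "'(T: Equatable)" }
func pick<T: Comparable>(_: T) -> String { "'(T: Comparable)" }
func pick(_: Int) -> String { "'(T == Int)" }

print(pick(7)) // prints "'(T == Int)": the maximal applicable constraint wins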


Now, let's call a pair of a scope and a constraint a visibility. I'll use ''visibility1 to denote a visibility. We also define a partial order over visibilities using the product order on Scope x Constraints. That is, ''v1 < ''v2 iff:

  • The scope of ''v1 ≤ the scope of ''v2, and
  • The constraint of ''v1 ≤ the constraint of ''v2 (with ''v1 ≠ ''v2).

Now that we've defined visibility, every function call is associated with a visibility:

A.foo() // A.foo''caller()

Then, we check all foo''candidate implementations where ''candidate < ''caller and choose the maximal one according to the partial ordering of the visibility.

To demonstrate, consider:

// Base
protocol Foo {
  func foo()
}
extension Int: Foo {
  // foo'Base'(Self == Int) or foo''Base
  func foo() { ... }
}

// Module A
import Base

func test() {
  // foo'A'(Self == Int) or foo''caller
  Int().foo()
}

In this case, assuming ''Base < ''caller, foo''Base is a candidate for the call. If there were another function with visibility ''another such that ''Base < ''another < ''caller, we would use that function instead.
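To make that concrete, a hypothetical sketch (this redeclaration isn't expressible today; it assumes the scoped-conformance feature under discussion):

// Module A (hypothetical: module A supplies its own witness for Int.foo)
extension Int: Foo {
  // foo'A'(Self == Int), i.e. foo''A, with ''Base < ''A < ''caller
  func foo() { ... }
}

func test() {
  Int().foo() // now resolves to foo''A, the maximal candidate below ''caller
}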


The important part is that we orthogonalize the visibility of a function into scope and constraint, which can be dealt with separately. There are still a few gaps in the model:

  • I haven't specified the partial ordering of the scopes. I believe that, currently, we could say that each file has three scopes:

    • 'file: scopes for fileprivate declarations,
    • 'internalModule: scopes for internal declarations, and
    • 'publicModule: scopes for public declarations, including protocol conformances

    with the import rule saying that:

    'publicImportee < 'publicImporter
    

    If we're to have scoped conformance, we can put protocol conformances in 'internalModule, or even 'file instead of 'publicModule.

  • I haven't said how the visibility of a function call is constructed. This is where we fill in the information from this discussion. For example, with the as? operator, we could say that it queries a runtime, application-wide lookup:


    a as? Protocol
    // Succeed if 'runtime'(Self: Protocol) < 'runtime'(Self == concreteA), break ties arbitrarily
    

    or that it queries the scope where a was created, in which case we'd also need to attach scopes to each type/generic parameter:

    func foo<A'generic>(a: A'generic) {
      ...
      a as? FooProtocol
      // Succeed if `'generic'(Self: FooProtocol) < 'generic'(Self == concreteA)`, break ties arbitrarily.
    }


I think the model is flexible enough, given the gaps we can fill in, to fit essentially any conformance model that's realistically implementable. It should also be easy to explain, since (Swift) programmers are already familiar with the constraint and scope partial orderings: it is relatively common to break ties between functions with different constraints.

If anything, I think attaching a visibility to each function (function''v1) and having a partial ordering over visibilities (''v1 < ''v2) would make the discussion go more smoothly, even if visibility turns out not to be a Cartesian product of scope and constraint.


Bump @Joe_Groff. I asked a lot of questions in that post and would be grateful if you could address them!

Thanks