Unlock Existential Types for All Protocols

anthonylatsis · November 7, 2020, 4:56pm

Sorry I wasn't clear enough. Allow me to paraphrase instead of breaking down my previous reply: in practice, what exactly would the understandability/explainability barrier refer to, were we to lift the restriction now? I expect this to be something other than the well-known "protocol cannot conform to protocol" error that has been around forever, or the fact that oftentimes a portion of the interface would still be unavailable (regardless of any new syntax), not by design as with mutating, but because we lack a mechanism like path-dependent types that would allow us to invoke them in a type-safe manner.

dabrahams · November 7, 2020, 6:34pm

It's all outlined in my posts from the thread I referred to, and others have contributed useful insights as well. Please take the time to read that thread. Nothing substantial has changed since then and nobody has really addressed the concerns.

Moximillian · November 7, 2020, 8:24pm

What also has not changed is the question: is it better to have specific errors for specific problems, each of which can be refined over time to explain its situation better, or to keep getting hit by the wall of confusion that is the current "protocol cannot conform to protocol" error?

In my opinion the current situation is actively harmful to anyone learning about generics and existentials.

owenv · November 7, 2020, 8:54pm

I think this is a very important point. We've more or less exhausted everybody's ideas on how to improve the diagnostics situation for existential restrictions in the current language design. Lifting restrictions has the potential to substantially improve error messaging because the remaining restrictions could be explained as consequences of the language design that can be derived from first principles, instead of side effects of the compiler implementation.

Consider this example from the implementation PR:

protocol P1 {
   associatedtype Q
   func returnAssoc() -> Q
}

struct S1: P1 {
   typealias Q = Int
   func returnAssoc() -> Q { 0 }
}

let p1: P1 = S1()
let x = p1.returnAssoc()

In Swift 5.3, this gives two errors:

error: protocol 'P1' can only be used as a generic constraint because it has Self or associated type requirements
error: member 'returnAssoc' cannot be used on value of protocol type 'P1'; use a generic constraint instead

Both are so vague that in practice nobody understands them without prior knowledge of advanced language concepts. When we've tried to increase the specificity and educational value of these diagnostics, the language design has worked against us because it's hard to provide a clear, logical explanation for implementation-driven restrictions without revealing details of the compiler implementation to end users.

The current implementation of this proposal changes the error to "member 'returnAssoc' cannot be used on value of protocol type 'P1'; use a generic constraint instead". This is more specific and I'd argue it's an improvement on its own, but IMO we could easily improve on this and emit something like "member 'returnAssoc' cannot be used on value of protocol type 'P1'; because it returns associated type 'Q', it may only be used on a concrete conforming type". The "because" is key here; we now have a specific, unambiguous reason the user's code is invalid, and it's one that can easily be explained in more detail by supporting documentation.
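To make "use a generic constraint instead" concrete, here is a minimal sketch of the rewrite that diagnostic points at. `P1` and `S1` repeat the declarations from the example above; the helper name `callReturnAssoc` is made up for illustration.

```swift
protocol P1 {
    associatedtype Q
    func returnAssoc() -> Q
}

struct S1: P1 {
    typealias Q = Int
    func returnAssoc() -> Q { 0 }
}

// Hypothetical helper: T is a concrete conforming type at every call site,
// so the compiler knows exactly which Q is being returned.
func callReturnAssoc<T: P1>(_ value: T) -> T.Q {
    value.returnAssoc()
}

let x = callReturnAssoc(S1())  // x is an Int here because S1.Q == Int
```

Inside the generic function the associated type has a name (`T.Q`), which is precisely what is missing when the value is only known as the protocol type `P1`.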

xwu · November 7, 2020, 9:35pm

This is a key insight here--the part about "in the current language design."

However, I don't think the valiant effort that follows gets us very far, because lifting the restrictions as pitched here would build on rather than alleviate what I think is the main barrier for users: namely, that the diagnostic isn't merely insufficiently explanatory but rather the rule to be explained is inexplicable.

This is a key distinction. We can make diagnostic messages ever longer, write reams of explanatory notes on the situation, but the basic situation remains:

There is a type named P and a protocol named P, and they are spelled the same but they are not the same. This is the status quo for protocols without associated types, so the pitch would change nothing here merely by extending it to PATs.
With restrictions lifted, however, now we have to contend with requirements such as returnAssoc on the protocol P. The reason you can't invoke returnAssoc in the example above is that the type P doesn't actually have a member returnAssoc.

Yes, there's the part where we can explain why you can't use a member with a return value of associated type, and yes that's important. But that's not the major challenge here. We get tied in knots because we're talking about a non-member "member," because of course returnAssoc is a "member" of P because it's plainly visible as such. Thus, the task of explaining the design to the user will always somehow entail justifying why P.returnAssoc simultaneously exists and does not exist. Accepting this requires suspending one's basic instinct to reject the absurd.
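The "exists and does not exist" effect can already be seen today with a protocol that has no associated types. The sketch below uses a static requirement to show it; the names `S`, `id`, and `idOf` are illustrative and not taken from the thread.

```swift
protocol P {
    static var id: String { get }
}

struct S: P {
    static var id: String { "S" }
}

// Through a generic constraint, `id` is a member of every conforming type.
func idOf<T: P>(_ type: T.Type) -> String {
    T.id
}

let viaConstraint = idOf(S.self)

// The type spelled `P` (the existential) has no such member, even though
// the protocol spelled `P` plainly declares it:
// let viaExistential = P.id   // rejected by the compiler

// The member is only reachable through the dynamic type of a boxed value.
let boxed: P = S()
let viaDynamicType = type(of: boxed).id
```

The commented-out line is the shape of the problem being described: the requirement is visible in the protocol declaration, yet the type with the same spelling does not offer it.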

The purpose of proposing a spelling such as any P is nowhere near as ambitious as what @dabrahams is aiming for here, which I understand to be a self-contained explanation in Swift's own syntax of the underlying first principles behind limitations using PATs. No, the point of any P is much more humble, yet indispensable in my view: without attempting to make a self-explanatory design, it merely makes two different things look different so that the design is at least explicable.

anthonylatsis · November 7, 2020, 9:51pm

Putting it this way makes it sound like a genuine dead end, when in reality the ability to invoke these members is just a missing feature.

xwu · November 7, 2020, 10:14pm

For the user, that's a distinction without a difference.

anthonylatsis · November 7, 2020, 10:56pm

Perhaps, but the true reason we cannot use them remains the same whether or not the existential is made separate. any P would only make the "protocol cannot conform to protocol" part explicable.

xAlien95 · November 7, 2020, 10:57pm

anthonylatsis:

anandabits:
I think we want path-dependent types here.
protocol P {
  associatedtype B: Class
  var b: B { get }
  func takesB(_: B)
}

func takesP(p1: P, p2: P) {
  let b1 = p1.b 
  let b2 = p2.b
  p1.takesB(b1) // ok
  p2.takesB(b1) // error: b1 may be of the wrong type
  p2.takesB(b2) // ok
  p1.takesB(b2) // error: b2 may be of the wrong type
}
As an aside, I imagine we could get this done by covariantly erasing to an opaque type.

I would greatly appreciate having existentials' Self and associated types erased (or hidden) to opaque types. Opaque types are self-explanatory for inexperienced users (I guess, I hope) and have good diagnostics too. You immediately understand why p2.takesB(b1) isn't valid, since the following error is emitted:

Cannot convert value of type '(some P).B' (associated type of protocol 'P')
to expected argument type '(some P).B' (associated type of protocol 'P')

which gives the user the impression that "some P" and "some P" may be different types.
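A minimal sketch of that behavior with opaque result types as they exist today. The conforming types `IntBox` and `StringBox` and the factory functions are hypothetical, and the `Class` bound from the quoted example is dropped to keep the snippet self-contained.

```swift
protocol P {
    associatedtype B
    var b: B { get }
    func takesB(_: B)
}

struct IntBox: P {
    var b: Int { 1 }
    func takesB(_: Int) {}
}

struct StringBox: P {
    var b: String { "x" }
    func takesB(_: String) {}
}

// Each function has its own opaque result type, even though both are
// spelled `some P`.
func makeFirst() -> some P { IntBox() }
func makeSecond() -> some P { StringBox() }

let p1 = makeFirst()
let p2 = makeSecond()

p1.takesB(p1.b)  // ok: both sides refer to the same opaque type
p2.takesB(p2.b)  // ok

// p2.takesB(p1.b)  // rejected: the two `(some P).B` types are unrelated
```

The two accepted calls correspond to the "ok" lines in the quoted path-dependent example, and the commented-out call corresponds to its "may be of the wrong type" lines.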

dabrahams · November 8, 2020, 2:50am

Your points about the situation being inexplicable are well-taken, if slightly overstated. But I would agree that it's certainly not something that can be explained in any diagnostic of reasonable length.

That's not very fair; I've already indicated in this thread that I don't consider it to be an explanation.

I do think my proposed syntax would make it almost impossible to teach people about existentials without also giving them an explanation for an “inexplicable[-in-diagnostics]” result, as the Swift book does. The any P syntax would not discourage anyone from presenting existentials without calling attention to the link between constraints and missing API. I also think the syntax I proposed will make it much harder for people to forget about that link when actually using existentials.

dabrahams · November 8, 2020, 3:02am

By mentioning “protocol cannot conform to protocol” I think you may be proving one of my points: lots of people who want generalized existentials are confused about what the feature actually provides and imagine it will solve problems that it will not. Protocol self-conformance is not fixed by generalized existentials.

I want self-conforming existential/protocols too, but to make that work we need a completely different feature: the user needs to be able to declare that the protocol self-conforms and the compiler needs to be able to prevent you from adding requirements that are incompatible with self-conformance. And I'd be perfectly happy if that category of existential types was spelled by the unadorned protocol name.
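For reference, the standard library already has one built-in case of a self-conforming existential: `Error`. The sketch below shows what self-conformance buys in that case; `ParseFailure` and `wrap` are made-up names.

```swift
struct ParseFailure: Error {}

// A generic function constrained to Error.
func wrap<E: Error>(_ error: E) -> Result<Int, E> {
    .failure(error)
}

// The existential `Error` is accepted as the generic argument because the
// existential type itself conforms to the `Error` protocol.
let boxed: Error = ParseFailure()
let result: Result<Int, Error> = wrap(boxed)  // E == Error
```

This works because `Error` has no requirements that would be incompatible with self-conformance, which is the kind of guarantee the paragraph above asks the compiler to enforce for user-declared protocols.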

We're all in agreement about that. We disagree about whether “simply lifting the restriction” results in a better, and less confusing, world for Swift programmers. I'm simply not content to open the floodgates to all of the issues I've raised about generalized existentials and hope that someday, somehow, improving diagnostics will lead to a good result. That's not a plan. Once you make a fundamental mistake in language design it is usually extremely difficult to walk it back.

If someone can lay out a design that actually deals with the problems (or give reasons why the problems are just in my imagination), even if not all of it can be implemented right away, then we have something to talk about.

Karl · November 8, 2020, 3:14am

A missing feature caused by a broken model, IMO. We can apparently access members of existentials, even though the existential doesn’t actually have any members and just forwards to the contained object.

To actually fix this, we need to stop accessing members via this forwarding mechanism, unbox the object and access its members directly. Overall this would lead to a much simpler and more easily understood model.

Yes, we can bend the existing model a bit further, but I don’t think we should. I suspect the remaining limitations will still leave us with something that is frustratingly complex and incomplete.

Karl · November 8, 2020, 3:22am

Can you explain why you want this?

AFAIK the only reason for self-conformance is to make existentials play nicely with generics, and there are much better options for that which don’t require new kinds of protocols or impose restrictions on what they may include.

dabrahams · November 8, 2020, 3:28am

That's why. When you have a protocol and a bunch of models, and need to add another model that works via type erasure, it's very inconvenient that the existential doesn't work directly.

Karl:

and there are much better options for that which don’t require new kinds of protocols or impose restrictions on what they may include.

I'm listening…

Karl · November 8, 2020, 3:49am

Unboxing. So if you have a function:

protocol MyProto {}

func foo<T: MyProto>(_ value: T) {}

Currently, when you try to call that with an existential of type MyProto, the compiler will attempt to bind T == MyProto (the existential), and will fail:

let existential: MyProto = ...
foo(existential) // T == MyProto. Error: MyProto does not conform to MyProto.

Instead, the contained value should be unboxed (either manually or by the compiler) to some local generic type, and that is what should be used instead:

let existential: MyProto = ...
unbox(existential) { <X: MyProto>(unboxedValue: X) in
  foo(unboxedValue) // T == X. X: MyProto is true, all is good.
}

Inside the scope, X is a real generic type. I don't see any reason why we couldn't enable the full expressiveness of generic code for X.
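The `unbox` syntax above is hypothetical, but a similar effect can be approximated in the language as it stands by routing the call through a protocol extension, where `Self` is bound to the concrete type stored in the box. A minimal sketch, assuming an illustrative conforming type `A` and a helper named `callFoo`:

```swift
protocol MyProto {}

struct A: MyProto {}

// Returns the name of the type T was bound to, so the binding is observable.
func foo<T: MyProto>(_ value: T) -> String {
    String(describing: T.self)
}

extension MyProto {
    // Hypothetical helper: inside a protocol extension, `Self` is the
    // concrete conforming type, so `foo` is called with T == Self.
    func callFoo() -> String {
        foo(self)
    }
}

let existential: MyProto = A()
let boundType = existential.callFoo()  // T is bound to A, not to MyProto
```

The drawback of this workaround is that every generic function needs its own forwarding helper, which is part of why a first-class unboxing construct is being asked for.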

For X to be meaningful outside this local scope, we may need to introduce something that represents "an unknown type satisfying some constraints" (note that this is different to existentials, which represent "any type which satisfies some constraints").

To illustrate the difference, consider an Array<MyProto> - each element is an independent existential, and may have a different type to its neighboring elements. There is currently no way in the language to express that all elements have the same type -- and that even though we don't know which specific type they are, we know that it conforms to MyProto.
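A small sketch of that contrast, with illustrative types `A` and `B` and a made-up function `labels(of:)`. A generic parameter can state "all the same type" inside the function's scope, but there is no way to name that unknown type on the array value itself.

```swift
protocol MyProto {
    var label: String { get }
}

struct A: MyProto { let label = "A" }
struct B: MyProto { let label = "B" }

// Each element is an independent existential box; neighbours may differ.
let mixed: [MyProto] = [A(), B()]

// Within this function every element is known to share one type T.
func labels<T: MyProto>(of values: [T]) -> [String] {
    values.map { $0.label }
}

let same = labels(of: [A(), A()])  // T == A

// labels(of: mixed)  // rejected: the existential MyProto cannot be used as T
```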

Alejandro · November 8, 2020, 4:21am

Karl:

Instead, the compiler should unbox the contained value to some local generic type, and use that instead:
let existential: MyProto = ...
unbox(existential) { <X: MyProto>(unboxedValue: X) in
  foo(unboxedValue) // T == X. X: MyProto is true, all is good.
}

I haven't been following this thread in detail, but I just saw this and wanted to comment: is there any particular reason you want the compiler to do this rather than having us open the existential manually? I.e.

let existential: MyProto = ...
let <X: MyProto> openedMyProto = existential
foo(openedMyProto)

It would certainly be less magical, but I view that as a good thing because now it's clear after opening the existential that the function takes a generic parameter conforming to the protocol vs. an existential.

Karl · November 8, 2020, 4:32am

No - you’re right, I probably shouldn’t call the unboxing "automatic". I don’t see any reason it couldn’t be manual, and less magic could very well lead to a clearer model.

I updated the post to be less opinionated about who does the unboxing :)

dabrahams · November 8, 2020, 5:29am

Well, that's interesting for some purposes, but it doesn't solve my problem. I need an actual model of the protocol, e.g. that I can use to fulfill an associated type requirement.

protocol P0 {
  associatedtype A: P1
}

The existential type P1 can't be used as the A of a P0 model unless it actually conforms to P1. Just goes to show: the number of wrinkles in the things people hope to get from existentials is pretty impressive. To me it looks like a Pandora's box, and I'd really like someone to sort out all the monsters, give them names, and describe a workable programming model for them before we open it further.
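The usual workaround today is a hand-written type-erasing wrapper that does conform. The sketch below assumes `P1` has a single illustrative requirement `name()`; `AnyP1`, `Leaf`, and `Model` are made-up names.

```swift
protocol P1 {
    func name() -> String
}

protocol P0 {
    associatedtype A: P1
    var a: A { get }
}

// Hand-written wrapper: a concrete struct that stores the existential and
// forwards each requirement, so it is a real model of P1.
struct AnyP1: P1 {
    private let base: P1
    init(_ base: P1) { self.base = base }
    func name() -> String { base.name() }
}

struct Leaf: P1 {
    func name() -> String { "leaf" }
}

// Accepted: AnyP1 conforms to P1, so it can serve as A.
struct Model: P0 {
    let a: AnyP1
}

// struct Rejected: P0 { let a: P1 }  // rejected: the existential P1 is not a model of P1

let model = Model(a: AnyP1(Leaf()))
```

The forwarding boilerplate in `AnyP1` is what a self-conforming existential would make unnecessary.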

dabrahams · November 8, 2020, 5:34am

What feature do you imagine can make that work without violating type soundness?

anandabits · November 8, 2020, 1:40pm

Without any type system enhancements, we could say returnAssoc can be invoked and the result treated as Any.

If path-dependent types are added then the result would have type p1.Q. This would allow the result to be used as input to methods on p1 that accept Q.
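The first option can be approximated by hand to see what the caller would get back. In this sketch the base protocol `ErasedP1` and the method `returnAssocErased` are hypothetical; they stand in for what the compiler would do if it erased `Q` to `Any` at the call site.

```swift
// Hypothetical non-generic base protocol, usable as an existential.
protocol ErasedP1 {
    func returnAssocErased() -> Any
}

protocol P1: ErasedP1 {
    associatedtype Q
    func returnAssoc() -> Q
}

extension P1 {
    // Default implementation: forward to the real requirement and drop
    // the static type of the result.
    func returnAssocErased() -> Any { returnAssoc() }
}

struct S1: P1 {
    func returnAssoc() -> Int { 0 }
}

let p1: ErasedP1 = S1()
let x = p1.returnAssocErased()  // statically Any; a cast is needed to use it
let asInt = x as? Int
```

The result is usable only after a dynamic cast, and it cannot be passed back into a method on `p1` that expects `Q`. That second capability is what the path-dependent `p1.Q` would add.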