Lifting the "Self or associated type" constraint on existentials

Slava_Pestov · November 26, 2018, 7:33pm

You can call .init() on an existential metatype of type P.Type and it will return an instance of P.

Joe_Groff · November 26, 2018, 7:34pm

Right, but not on P.Protocol to construct a value of type P independent of some specific implementation.
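A minimal sketch of the distinction drawn in the two posts above. The protocol `Greeter` and the types `English` and `French` are hypothetical names invented for illustration; only the `P.Type` versus `P.Protocol` distinction comes from the discussion.

```swift
// Hypothetical protocol and conforming types, for illustration only.
protocol Greeter {
    init()
    func greet() -> String
}

struct English: Greeter {
    init() {}
    func greet() -> String { return "Hello" }
}

struct French: Greeter {
    init() {}
    func greet() -> String { return "Bonjour" }
}

// Values of the existential metatype Greeter.Type: each one is the
// metatype of some concrete conforming type.
let kinds: [Greeter.Type] = [English.self, French.self]

// init() is dispatched to whichever concrete type each metatype names,
// and the result is an instance typed as Greeter.
let greetings = kinds.map { $0.init().greet() }

// Greeter.self has type Greeter.Protocol. It names the protocol itself,
// so there is no concrete implementation to construct from it.
let protocolItself: Greeter.Protocol = Greeter.self
```

The array of metatypes is only there to show that the concrete type is chosen at run time; a single `Greeter.Type` value behaves the same way.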

dabrahams · November 26, 2018, 7:56pm

Maybe it's just my overly-literal brain, but that statement confuses me. I didn't claim that the proposal made any of the three items listed worse, nor were those three items “points” I was trying to make. I suggested three things that should be made explicit. If what you're really saying is that today those things are implicit and they would stay that way under the proposal, then we agree.

I also claim keeping those things implicit while opening up the space of existential usage would have some negative effects, which I'm not going to repeat here. Further, if we were to talk about my items as points that could possibly be made worse by a language change, then in fact your proposal would make #3 worse, by expanding the set of a protocol's APIs that an available existential could not support.

Right now, no existential conforms to its protocol (discounting AnyObject or some @objc protocols), and that doesn't change. To be resilient, it will have to be opt-in when we do support it, addressing #1. #2 seems like the same point as #1 to me; maybe you can clarify the distinction to me.

#2 and #1 are only the same if you consider conformance to P to be part of the declared API of P itself. To me that isn't obviously the case given our syntax. I'd expect to see that spelled something like:

protocol P : Self {}

But I would be happy to consider a design that didn't make a distinction between #1 and #2, since #3 and #1 together cover the entire space of possibilities.

As for #3, even with the existing restriction, thanks to protocol extensions, it's already the case that an existential's API can diverge from the conforming type's.

Yes, and I claim we should take that possibility away or force it to be explicit where that happens, even if it means breaking source.

In the fullness of time, when we support opening existentials, I also think that distinction will disappear—fundamentally, the only operation an existential supports is opening it and manipulating the underlying dynamic value; once the value is opened, the entire protocol API is available relative to the dynamic type of that value. Like I said, I'm all for making existentials more explicit.

Except that I'm sure you are not proposing to require explicitly opening every existential before using any of its APIs. The basic problems remain: some operations will be available implicitly, and others will only be available when the existential is explicitly opened, and exactly which will be based on obscure rules of type soundness. When you reach a point where an existential must be opened, your code may not even be structured so that the type to open the existential as is available to you, because the language has provided a way to code yourself into a very attractive, but deep, and almost invisible, hole.
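The remark quoted above, that protocol extensions already let an existential's API diverge from the conforming type's, can be seen in a small sketch. `Shape`, `Circle`, and the member names are hypothetical; the protocol has no Self or associated type requirements, so this should compile under the existing restriction.

```swift
protocol Shape {
    // A requirement: dispatched dynamically through the existential.
    func name() -> String
}

extension Shape {
    // Not a requirement: resolved statically from the static type.
    func summary() -> String { return "generic shape" }
}

struct Circle: Shape {
    func name() -> String { return "circle" }
    // Shadows the extension method, but only when the static type is Circle.
    func summary() -> String { return "round shape" }
}

let concrete = Circle()
let erased: Shape = concrete

let viaConcrete = concrete.summary()     // Circle's own method
let viaExistential = erased.summary()    // the protocol extension's method
let requirement = erased.name()          // still reaches Circle's implementation
```

Nothing in the declaration of `Shape` signals that `summary()` will behave differently through the existential, which is the kind of implicit divergence being argued about.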

Moximillian · November 26, 2018, 8:48pm

I don't know which category this would fall under, but in the interest of providing examples, here's one long and rocky journey in trying to further restrict the types defined in a protocol: Journey into PAT-land

gregtitus · November 26, 2018, 10:15pm

Purely from a diagnostic point of view:

I think that the wall is in the wrong place and actively prevents any kind of learning about what the actual limitations are and why. Indeed, your explanation using an example protocol (here) is exactly the kind of diagnosis that we should provide and aren't able to currently. "I can't call init, here's why", "I can't call a function taking an associated type argument, here's why" are the very next steps in learning about this wrinkle of the type system. And since that wrinkle can't ever be entirely ironed out, providing a path to learning about it is helpful disclosure.

Any such diagnostics would also usually be just a few additional lines of written code away. Right now you get an error at declaring func foo(_ p: P), but the very next thing you are generally going to do as a programmer is to try to use p in the body of that function. So I just don't see the deep well or the big redesign that is going to be saved. Instead, I see at least a few people experimenting in Playgrounds and actually comprehending for themselves the limitations implicit in existentials, and a great many more people being able to understand Stack Overflow (et al) answers that include these sorts of examples because they can be shown "see, it's impossible to call f here, because..."

In short, we enhance learning by providing a steepening slope with sign posts rather than a poorly labeled wall, and the additional journey up the slope really isn't that far nor is it a wasted journey.

dabrahams · November 27, 2018, 1:05am

gregtitus:

I think that the wall is in the wrong place and actively prevents any kind of learning about what the actual limitations are and why. Indeed, your explanation using an example protocol (here) is exactly the kind of diagnosis that we should provide and aren't able to currently. "I can't call init, here's why", "I can't call a function taking an associated type argument, here's why" are the very next steps in learning about this wrinkle of the type system. And since that wrinkle can't ever be entirely ironed out, providing a path to learning about it is helpful disclosure.

Any such diagnostics would also usually be just a few additional lines of written code away. Right now you get an error at declaring func foo(_ p: P), but the very next thing you are generally going to do as a programmer is to try to use p in the body of that function. So I just don't see the deep well or the big redesign that is going to be saved.

I get your point about the diagnostics revealing what's going on, and it's a good one. The current wall is definitely not the best service we can provide. I note, however, that with an explicit declaration of intent as I have proposed, if you had declared the protocol to have a self-conforming existential type, or an existential with matching API, or to have specific APIs available on its existential type, you'd get an equally descriptive error for each declared requirement or extension that conflicts with the intent. That is an earlier error, to be handled by the author of the protocol, who is already, in most cases, going to be more sophisticated and capable of fixing the problem than the protocol's clients will be.

Now, I think your foo example is a bit…selective in its exploration of the scenario. For parameters of existential type, if the compiler were to simply prevent foo from using any of the protocol's APIs at all, that would give us essentially the capabilities we have today, but with better diagnostics. The problem with what's being proposed is that it will let foo use some fraction of the protocol's APIs, and that fraction is implicit.

When you write a function signature like foo's, that's a two-sided contract. It says something about what types can be passed by clients, but it also says something about what APIs will be available on the parameters for use by foo and its future revisions, without breaking any source code. Under the proposal, though, the API being claimed for future use by foo will not be apparent to the author of foo or to its maintainers without a type soundness analysis.

The big hole only becomes apparent after foo is being used by other code and its maintainer discovers she needs access to some of the implicitly omitted API of P on the existential. But now she can't change the API of foo without breaking some other code.

The other reason I find the example selective is that it doesn't consider that it will be very tempting to create and return instances of existential types. Today they're syntactically lighter than the proposed syntax for an opaque result type, and there's nothing at all in the code of the function returning the existential that is likely to exercise the API of the existential; usually it will be handling the whole concrete type and letting implicit conversion create the existential instance on return. It's simply not apparent anywhere in code what API is being vended for the result, and the current syntax strongly implies an API that will not match reality.

I can easily imagine how fastidious programmers will work around these problems: they'll write a type-erasing wrapper for the existential that vends the API they expect to be available on the existential by forwarding to its stored property, and compile this wrapper into their tests. They don't have to actually use the wrapper, but if it compiles, they now have a reliable in-code reference for the existential type's available API. This extralinguistic type checking is reminiscent of a horrible thing all C++ generic programmers “should,” but almost never actually, do: create and use concept archetypes. We'd also have to declare this availability on the existential type in the documentation of protocol requirements and extensions. I consider it a huge red flag that people like me would be driven to do these things when we could just declare the same things in source code.
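A rough sketch of the workaround described above: a wrapper that exists only so that the compiler checks which members are usable through the existential. `Account`, `AccountExistentialAPI`, and `Checking` are hypothetical names, and the protocol deliberately has no Self or associated type requirements so that the sketch compiles under the existing restriction; the post's concern is about protocols where that would no longer be true.

```swift
protocol Account {
    var balance: Int { get }
    func statement() -> String
}

// Hypothetical "API check" wrapper. It forwards exactly the members we
// expect to be usable on a value of existential type Account. If this
// struct compiles, those members really are available on the existential.
struct AccountExistentialAPI {
    var base: Account

    var balance: Int { return base.balance }
    func statement() -> String { return base.statement() }
}

// A conforming type, used only to exercise the wrapper.
struct Checking: Account {
    var balance: Int
    func statement() -> String { return "Balance: \(balance)" }
}

let checked = AccountExistentialAPI(base: Checking(balance: 42))
let line = checked.statement()
```

The wrapper never has to be used in production code; compiling it alongside the tests is enough to turn the implicit availability rules into something that fails loudly when a member stops being usable.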

hartbit · November 27, 2018, 6:08am

This seems like the right approach to me, instead of generating errors when trying to use those members.

Jon_Hull · November 27, 2018, 8:17am

I agree for the most part, but I disagree that we should keep a strong dividing line between PATs and other protocols. I don't see the missing functions as confusing as long as the diagnostics are clearly written and they state what information needs to be provided to successfully use the function.

I would rather see some simple syntax for easy (partial) type erasure. For example, if we allowed type aliases with the same name, but different numbers of generic parameters, then we could define a partially type erased P as such:

typealias P<A> = P where P.A == A  //We can use f on P<A>

Notice that this also gives us a way to define default parameters:

typealias Result<T> = Result<T,Swift.Error>

dabrahams:

As for the fundamental reasons for this difference, it's what I said in the talk: capturing an instance of type T as an existential type erases type information, and in particular, type relationships. If you work through a couple of examples with a protocol like:
protocol P {
   init()
   associatedtype A
   func f(_: A) -> A
}
you'll see that the most specific common supertype of types conforming to P has no usable API. The compiler can't provide a working init() because it has no way to know which subtype to create. I suppose it could provide a trapping init(). But it can't even provide a trapping f because that would have to take a type that is a subtype of every conforming type's A and return a type that is a supertype of every conforming type's A. In a world where Never was a true bottom type, that could be func f(_: Never) -> Any, but of course even that doesn't satisfy the requirements for f, which say it has matching parameter and return types.

Let's look at what the proper error messages would be in these cases, and the ideal way that the programmer could fix them.

If you tried to call func f(_: A) -> A on a variable of type P, I think the error message should have the gist of "P.A is unknown in this case. Variable myP must be cast to a type where P.A is known before f can be called".

If we were able to define either type aliases or variables where we stipulate A, then we are able to call f again:

var myP: P<A> = ...
let x = myP.f(3) //No problem here because we have defined what A is

init() is different because we really need an exact type to know what to call, and the error message should say something like: "Unable to call init on P without knowing which instance type to create. Use init on a specific type conforming to P instead"

That said, in the fullness of time, I would really like to be able to define a factory init on P that is used for P(). This would have an explicit annotation, of course, specifying that it is a factory method/init...
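For comparison, here is a sketch of how both members become usable today once the concrete type is known, using generics rather than the hypothetical `P<A>` syntax suggested above. The protocol `P` is the one quoted in this post; `Doubler`, `apply`, and `make` are names invented for illustration.

```swift
protocol P {
    init()
    associatedtype A
    func f(_: A) -> A
}

// Hypothetical conforming type; A is inferred to be Int.
struct Doubler: P {
    init() {}
    func f(_ x: Int) -> Int { return x * 2 }
}

// With a generic parameter, T.A is a known type inside the function,
// so calling f is type-safe.
func apply<T: P>(_ p: T, to value: T.A) -> T.A {
    return p.f(value)
}

// init() is usable once a specific conforming type has been named.
func make<T: P>(_ type: T.Type) -> T {
    return type.init()
}

let x = apply(make(Doubler.self), to: 3)
```

The two suggested error messages correspond to the two pieces of information the generic versions supply: `apply` knows what `A` is, and `make` knows which type to instantiate.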

Joe_Groff · November 27, 2018, 5:49pm

I think all this talk of the existential having an API is fundamentally misleading. An existential doesn't have an API of its own—the only primitive operation is to open the existential and perform operations on the underlying value inside. I think that, once you internalize that, the consequences for what you can do with the existential can be reasoned about from there. It's not that certain members do or don't exist on the existential type, it's whether it's type-safe to invoke them. If you try to invoke a method with arguments of Self type, or of non-same-type-constrained associated types, by passing in unrelated existential values, then (if we diagnose it right) it's understandably a type error, because the dynamic types inside the existentials aren't known to match. That understanding also generalizes well to when we have more expressive existential-opening operations in the future that allow the dynamic types of existentials to be reasoned about beyond the scope of individual method invocations.
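One way to see "opening" with the language as it stands is a protocol extension method: inside it, `Self` is bound to the dynamic type of the value stored in the existential. This is only a sketch of the idea, not the more expressive opening operations mentioned above; `Resettable`, `Counter`, and `fresh()` are hypothetical names.

```swift
protocol Resettable {
    init()
    var value: Int { get }
}

extension Resettable {
    // When called on an existential, the value is opened and Self is the
    // dynamic type inside it, so Self.init() builds the same concrete type.
    func fresh() -> Resettable {
        return Self.init()
    }
}

struct Counter: Resettable {
    var value: Int
    init() { value = 0 }
    init(value: Int) { self.value = value }
}

let boxed: Resettable = Counter(value: 5)
let reset = boxed.fresh()   // a new Counter, created via the dynamic type
```

The call site never names `Counter`, yet `fresh()` constructs one, because the operation runs against the underlying value rather than against the existential as such.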

Karl · November 27, 2018, 7:17pm

This. Maybe I don't understand the type-theory behind it well enough, but the whole idea of existentials being their own "thing" with their own protocol conformances (including conformance to the protocol they represent) is bizarre to me.

Throughout this discussion I've been thinking that the ultimate answer to all of these problems would be path-dependent types. A function takes a parameter of type "any P", and Self/associated types are relative to whatever type is ultimately boxed inside that existential, and you work with it on that level of abstraction.

Of course, that's not a trivial thing. The type inside of an existential may change, so we'd need to figure out a solution for that:

var c: Collection where Element == Int
c = [1,2,3]
let idx = c.startIndex // type: c.Index
c = Set([4,5,6])
c[idx] // uh-oh!
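The standard library's type eraser `AnyCollection` gives a rough present-day analogue of this hazard, with the mismatch surfacing at run time rather than in the type system. This is a sketch under that assumption; the mismatched subscript is left commented out because it is expected to fail when executed.

```swift
var c = AnyCollection([1, 2, 3])
let idx = c.startIndex          // an AnyIndex wrapping an Array index
let first = c[idx]              // fine: index and collection match

c = AnyCollection(Set([4, 5, 6]))
// c[idx]                       // an array-derived index used on a
//                              // set-backed collection: expected to fail
//                              // at run time, so it is not executed here

// Re-deriving the index from the current collection is safe.
let fresh = c.startIndex
let element = c[fresh]
```

A path-dependent design would aim to reject the commented-out line at compile time, since `idx` would have type `c.Index` for the collection that `c` held when the index was formed.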

Joe_Groff · November 27, 2018, 7:29pm

Yeah, it's my hope that that's eventually where we land with existentials. The handling of changing dynamic types can be seen as a flow-sensitive typing problem, similar to definite initialization—once you've formed a value that's dependent on the dynamic type of c, the type becomes fixed, so you wouldn't be allowed to reassign it to anything that wasn't of c.Self type after that point (or, alternatively, any types dependent on the original dynamic type could be invalid to use after you do so).

dabrahams · November 27, 2018, 7:35pm

I'm afraid your post misses my point. I am not at all suggesting that we should “keep a strong dividing line between PATs and other protocols.” I am saying that if we remove the line, we should deal with problems that are currently prevented or mitigated by having the line in place.

Also, while diagnostics are important, they are by no means the whole story. It should be possible to look at an API and its documentation and understand how to use it correctly, without having to use it wrong and get a slap from the compiler. There is already too much implicit in the way protocols are declared and documentation is generated for them. It should be easy to see which requirements a conforming type really needs to implement; likewise it should be easy to tell how to legally use the protocol's existential type.

Joe_Groff:

I think all this talk of the existential having an API is fundamentally misleading. An existential doesn't have an API of its own—the only primitive operation is to open the existential and perform operations on the underlying value inside. I think that, once you internalize that, the consequences for what you can do with the existential can be reasoned about from there. It's not that certain members do or don't exist on the existential type, it's whether it's type-safe to invoke them. If you try to invoke a method with arguments of Self type, or of non-same-type-constrained associated types, by passing in unrelated existential values, then (if we diagnose it right) it's understandably a type error, because the dynamic types inside the existentials aren't known to match.

Joe, I think you are falling victim here to your own extraordinary capacity for abstraction, in the same way that you personally don't need much of an algorithm library because you can open code most of them correctly from first principles. Yes, one can reason through “the consequences for what you can do with an existential” but to most of us, that isn't going to be immediately evident by looking at the protocol, and for those of us who can manage it, puzzling through type safety consequences to derive the usable API is going to be a drag on useful development activity.

Saying “it's not that certain members do or don't exist on the existential type, it's whether it's type-safe to invoke them” strikes me as a distinction without a difference. Would you prefer to say that certain members are “unavailable” on the existential? I think that's language I've used repeatedly in this thread. If you want to draw an analogy to other parts of the language, it's immediately obvious from looking at a declaration which members are available on a constant instance or rvalue. You can't call the ones marked mutating. The same should be true of protocol members and the members available on existential instances.
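The analogy with `mutating` can be made concrete with a short sketch. `Tally` is a hypothetical type; the rejected call is left as a comment so that the rest compiles.

```swift
struct Tally {
    var count = 0

    // Marked mutating: the declaration itself says this member is
    // unavailable on constants.
    mutating func increment() { count += 1 }

    // Nonmutating: available on both variables and constants.
    func doubled() -> Int { return count * 2 }
}

var variable = Tally()
variable.increment()            // allowed: `variable` is mutable

let constant = Tally()
// constant.increment()         // rejected at compile time: `constant` is a let
let d = constant.doubled()      // nonmutating members remain available
```

The point of the comparison is that a reader can tell which members a constant supports by reading the declaration alone, without attempting the call and reading the diagnostic.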

anandabits · November 27, 2018, 7:57pm

As I mentioned previously, in the world of generalized existentials there is not just one existential type for a protocol. There will often be an unbounded number of existential types with different constraints. I can't imagine how you intend to declare clearly which members are available on all of these potential existential types.

Generalized existentials are an extremely powerful feature that includes a nontrivial amount of essential complexity. Learning how to work with them will necessarily involve some effort. I don't see any way around this.

I don't think this means we should not support the feature or be afraid that users won't be able to use it effectively (one thing I love about Swift is that it has mostly avoided that kind of paternalism). I think it means we need to put more effort than usual into educating people about this feature, how to use it well, and what pitfalls or traps they need to be aware of. This is especially necessary as generalized existentials are a relatively sophisticated feature for which I don't think there is a clear point of reference in the previous experience of most programmers.

xwu · November 27, 2018, 8:10pm

I would be very alarmed if this and @Joe_Groff’s model were to be the desired eventual design of Swift existentials as exposed to end users. To say that something spelled as a type isn’t a type, has no members, or is in fact an innumerable multitude of types just doesn’t pass the smell test for usability. I would go so far as to argue that if such a feature cannot be exposed without such a complicated model for the type system then it probably doesn’t belong in a general-use programming language.

anandabits · November 27, 2018, 8:13pm

Perhaps you misunderstood what I intend here. What I intend to communicate is that Collection, Collection where Element == String, Collection where Index == Int, Collection where Element: Foo, etc are all different existential types derived from the Collection protocol.

If we don't allow this level of generality we are greatly selling ourselves short and unnecessarily restricting the expressivity of our type system.

Joe_Groff · November 27, 2018, 8:14pm

I'm not sure where you get that from the discussion; who's saying "something spelled as a type isn't a type"? Each of the "innumerable multitude of types" Matthew alludes to would have a distinct spelling; I don't think anyone's suggesting we implicitly infer arbitrary bounds for existentials.

Joe_Groff · November 27, 2018, 8:24pm

I fear that all this talk about future directions and complications is causing the discussion to spiral out of control. Let's get back to what's being proposed. I agree that existentials are confusing, and probably deserve to be deemphasized, but they exist today, we aren't going to get rid of them, and we aren't going to break source compatibility. The existing restrictions on existentials don't accurately reflect where the complications are, and don't prevent users from confronting the complexities when protocol extensions are involved; they don't really save anyone from staring into the abyss. Meanwhile, the existing restrictions threaten to impose more complexity on other parts of the language, and impede users from doing reasonable things they want to be able to do. That's why I think it's worthwhile to lift the restriction, no matter where the language may go in the future.

gwendal.roue · November 27, 2018, 8:33pm

I no longer have the same view of f(x: P), thanks to this discussion. It is useful, if only for the amount of information about existentials available here! This forum is great.

nuclearace · November 27, 2018, 8:33pm

I think what we really need is an implementation to toy around with. And in the fullness of time, maybe we should also prototype some possible solutions to the various issues raised here. But I feel that without something concrete, we’re just talking about what-ifs and possible futures.

xwu · November 27, 2018, 9:01pm

Joe_Groff:

I fear that all this talk about future directions and complications is causing the discussion to spiral out of control. Let's get back to what's being proposed. I agree that existentials are confusing, and probably deserve to be deemphasized, but they exist today, we aren't going to get rid of them, and we aren't going to break source compatibility. The existing restrictions on existentials don't accurately reflect where the complications are, and don't prevent users from confronting the complexities when protocol extensions are involved; they don't really save anyone from staring into the abyss. Meanwhile, the existing restrictions threaten to impose more complexity on other parts of the language, and impede users from doing reasonable things they want to be able to do. That's why I think it's worthwhile to lift the restriction, no matter where the language may go in the future.

I think looking at the future direction is helpful here because, if @dabrahams's suggestions were to be pursued, it is entirely possible to lay out a source-compatible path to get there:

1. Allow all existentials to be spelled P that have no Self or associated type requirements (status quo)
2. Allow all existentials to be spelled Any<P> (strawman syntax) without the "Self or associated type" restriction, but with a limited set of members (or no members) unless opened
3. Allow any protocol with Self or associated type requirements to be annotated as @existential (strawman syntax) to opt into its corresponding existential being spelled P, essentially retroactively making the status quo a case of an implicit annotation (this kind of evolution is precedented in how we handle implicitly derived Equatable conformance for enums, for instance)

This sort of evolution would be precluded forever if we simply lifted the "Self or associated type" restriction today without planning ahead.