Lifting the "Self or associated type" constraint on existentials

I 100% agree that associated types are not “to blame”—that's why I've been saying we shouldn't merely lift the restriction without imposing some others. One of the major problems with existentials is that you can easily fall off a cliff where the existential type no longer exists/has an API matching the declared protocol. Although you've proposed to remove the first problem, you'd make the second problem worse, and I think it's easily as bad as the first problem. If one had to explicitly declare:

  1. whether the existential should conform to the protocol
  2. whether the existential's API must match that of the protocol
  3. in the case the existential's API needn't match that of the protocol, which parts of the protocol's API should be available on the existential

and the compiler would forbid requirements and extensions that violate those declarations, then we'd have a sane world where “blame” was always correctly attributed.

In at least the second and third cases, I think to avoid confusion the existential should not have the same spelling as the unadorned protocol name.

I don't think this makes anything worse regarding any of your three points. Right now, no existential conforms to its protocol (discounting AnyObject or some @objc protocols), and that doesn't change. To be resilient, it will have to be opt-in when we do support it, addressing #1. #2 seems like the same point as #1 to me; maybe you can clarify the distinction to me. As for #3, even with the existing restriction, thanks to protocol extensions, it's already the case that an existential's API can diverge from the conforming type's. In the fullness of time, when we support opening existentials, I also think that distinction will disappear—fundamentally, the only operation an existential supports is opening it and manipulating the underlying dynamic value; once the value is opened, the entire protocol API is available relative to the dynamic type of that value. Like I said, I'm all for making existentials more explicit.

2 Likes

I appreciate the argument you’re making and am still thinking about it, but...

In a language with fully generalized existentials and protocol extensions in any module it seems like declaring this fully would get complex and verbose pretty quickly. Remember that there is not just one existential, but for many protocols an arbitrary number of them with different constraints.

This approach also seems likely to lead to unnecessarily limited existentials - APIs that are perfectly valid to expose given the constraints on the existential that are not exposed because the declaration making them available is not present (and depending on details, maybe cannot be added retroactively by a 3rd party).

Overall, my instinct is that while this model might be more explicit about what is going on, it is unlikely to increase clarity. It introduces distinctions that add complexity to the semantics as well as the surface of the language. I’m not sure this would be a net gain for anyone and could be a net loss, especially if it makes the language feel verbose or finicky.

It seems to me that there is an analogy with type inference here. It can be confusing at times but overall it is a huge net win to let the compiler do some work for us. Clarity is usually improved by relying on inference. I think the same is likely to be true in the case of inferring the API that is available on an existential, at least in most cases.

However, it is worth noticing that in the case of type inference we always have the option to be explicit by adding type annotations. I wonder if a similar model of optional annotations might be useful to add clarity of intent as well as get more localized and precise feedback from the compiler when that intent doesn’t type check.

2 Likes

You can call .init() on an existential metatype of type P.Type and it will return an instance of P.

Right, but not on P.Protocol to construct a value of type P independent of some specific implementation.
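To make the distinction concrete, here is a small sketch. The names `Shape`, `Circle`, `Square`, and `makeShape` are hypothetical and not from the thread. A value of type `Shape.Type` carries some concrete conforming type at run time, so `init()` has an implementation to dispatch to; the protocol's own metatype does not.

```swift
protocol Shape {
    init()
    var name: String { get }
}

struct Circle: Shape {
    init() {}
    var name: String { "circle" }
}

struct Square: Shape {
    init() {}
    var name: String { "square" }
}

// `kind` holds a concrete conforming type at run time,
// so `init()` is dispatched to that type's initializer.
func makeShape(of kind: Shape.Type) -> Shape {
    return kind.init()
}

// By contrast, `Shape.self` has type `Shape.Protocol`. It names the
// protocol itself, so there is no implementation of `init()` to call:
// let s = Shape.self.init()   // does not compile
```

Calling `makeShape(of: Circle.self)` yields a `Shape` whose dynamic type is `Circle`.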

4 Likes

Maybe it's just my overly-literal brain, but that statement confuses me. I didn't claim that the proposal made any of the three items listed worse, nor were those three items “points” I was trying to make. I suggested three things that should be made explicit. If what you're really saying is that today those things are implicit and they would stay that way under the proposal, then we agree.

I also claim keeping those things implicit while opening up the space of existential usage would have some negative effects, which I'm not going to repeat here. Further, if we were to talk about my items as points that could possibly be made worse by a language change, then in fact your proposal would make #3 worse, by expanding the set of a protocol's APIs that an available existential could not support.

Right now, no existential conforms to its protocol (discounting AnyObject or some @objc protocols), and that doesn't change. To be resilient, it will have to be opt-in when we do support it, addressing #1. #2 seems like the same point as #1 to me; maybe you can clarify the distinction to me.

#2 and #1 are only the same if you consider conformance to P to be part of the declared API of P itself. To me that isn't obviously the case given our syntax. I'd expect to see that spelled something like:

protocol P : Self {}

But I would be happy to consider a design that didn't make a distinction between #1 and #2, since #3 and #1 together cover the entire space of possibilities.

As for #3, even with the existing restriction, thanks to protocol extensions, it's already the case that an existential's API can diverge from the conforming type's.

Yes, and I claim we should take that possibility away or force it to be explicit where that happens, even if it means breaking source.

In the fullness of time, when we support opening existentials, I also think that distinction will disappear—fundamentally, the only operation an existential supports is opening it and manipulating the underlying dynamic value; once the value is opened, the entire protocol API is available relative to the dynamic type of that value. Like I said, I'm all for making existentials more explicit.

Except that I'm sure you are not proposing to require explicitly opening every existential before using any of its APIs. The basic problems remain: some operations will be available implicitly, and others will only be available when the existential is explicitly opened, and exactly which will be based on obscure rules of type soundness. When you reach a point where an existential must be opened, your code may not even be structured so that the type to open the existential as is available to you, because the language has provided a way to code yourself into a very attractive, but deep, and almost invisible, hole.

1 Like

Don't know which category this would fall under, but in the interest of providing examples, here's one long and rocky journey in trying to further restrict the types defined in a protocol: Journey into PAT-land

1 Like

Purely from a diagnostic point of view:

I think that the wall is in the wrong place and actively prevents any kind of learning about what the actual limitations are and why. Indeed, your explanation using an example protocol (here) is exactly the kind of diagnosis that we should provide and aren't able to currently. "I can't call init, here's why", "I can't call a function taking an associated type argument, here's why" are the very next steps in learning about this wrinkle of the type system. And since that wrinkle can't ever be entirely ironed out, providing a path to learning about it is helpful disclosure.

Any such diagnostics would also usually be just a few additional lines of written code away. Right now you get an error at declaring func foo(_ p: P), but the very next thing you are generally going to do as a programmer is to try to use p in the body of that function. So I just don't see the deep well or the big redesign that is going to be saved. Instead, I see at least a few people experimenting in Playgrounds and actually comprehending for themselves the limitations implicit in existentials, and a great many more people being able to understand Stack Overflow (et al) answers that include these sorts of examples because they can be shown "see, it's impossible to call f here, because..."

In short, we enhance learning by providing a steepening slope with sign posts rather than a poorly labeled wall, and the additional journey up the slope really isn't that far nor is it a wasted journey.

13 Likes

I get your point about the diagnostics revealing what's going on, and it's a good one. The current wall is definitely not the best service we can provide. I note, however, that with an explicit declaration of intent as I have proposed, if you had declared the protocol to have a self-conforming existential type, or an existential with matching API, or to have specific APIs available on its existential type, you'd get an equally descriptive error for each declared requirement or extension that conflicts with the intent. That would be an earlier error, to be handled by the author of the protocol, who in most cases is already going to be more sophisticated and capable of fixing the problem than the protocol's clients will be.

Now, I think your foo example is a bit…selective in its exploration of the scenario. For parameters of existential type, if the compiler were to simply prevent foo from using any of the protocol's APIs at all, that would give us essentially the capabilities we have today, but with better diagnostics. The problem with what's being proposed is that it will let foo use some fraction of the protocol's APIs, and that fraction is implicit.

When you write a function signature like foo's, that's a two-sided contract. It says something about what types can be passed by clients, but it also says something about what APIs will be available on the parameters for use by foo and its future revisions, without breaking any source code. Under the proposal, though, the API being claimed for future use by foo will not be apparent to the author of foo or to its maintainers without a type soundness analysis.

The big hole only becomes apparent after foo is being used by other code and its maintainer discovers she needs access to some of the implicitly omitted API of P on the existential. But now she can't change the API of foo without breaking some other code.

The other reason I find the example selective is that it doesn't consider that it will be very tempting to create and return instances of existential types. Today they're syntactically lighter than the proposed syntax for an opaque result type, and there's nothing at all in the code of the function returning the existential that is likely to exercise the API of the existential; usually it will be handling the whole concrete type and letting implicit conversion create the existential instance on return. It's simply not apparent anywhere in code what API is being vended for the result, and the current syntax strongly implies an API that will not match reality.

I can easily imagine how fastidious programmers will work around these problems: they'll write a type-erasing wrapper for the existential that vends the API they expect to be available on the existential by forwarding to its stored property, and compile this wrapper into their tests. They don't have to actually use the wrapper, but if it compiles, they now have a reliable in-code reference for the existential type's available API. This extralinguistic type checking is reminiscent of a horrible thing all C++ generic programmers “should,” but almost never actually, do: create and use concept archetypes. We'd also have to declare this availability on the existential type in the documentation of protocol requirements and extensions. I consider it a huge red flag that people like me would be driven to do these things when we could just declare the same things in source code.
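A minimal sketch of such a wrapper, assuming a hypothetical protocol `Drawable` (none of these names come from the thread). The wrapper exists only so the compiler checks that each forwarded member really is usable on the existential.

```swift
protocol Drawable {
    var area: Double { get }
    func describe() -> String
}

// Compile-time check: if every forwarding member below type-checks,
// those members are usable on a value of existential type `Drawable`.
struct DrawableAPICheck {
    var base: Drawable   // the stored existential

    var area: Double { base.area }
    func describe() -> String { base.describe() }
}

// A sample conforming type, used only to exercise the wrapper.
struct UnitSquare: Drawable {
    var area: Double { 1.0 }
    func describe() -> String { "unit square" }
}
```

A requirement that mentions `Self` in a parameter position could not be forwarded this way, which is exactly the kind of gap the wrapper is meant to surface at compile time.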

2 Likes

This seems like the right approach to me, instead of generating errors when trying to use those members.

I agree for the most part, but I disagree that we should keep a strong dividing line between PATs and other protocols. I don't see the missing functions as confusing as long as the diagnostics are clearly written and they state what information needs to be provided to successfully use the function.

I would rather see some simple syntax for easy (partial) type erasure. For example, if we allowed type aliases with the same name, but different numbers of generic parameters, then we could define a partially type erased P as such:

typealias P<A> = P where P.A == A  //We can use f on P<A>

Notice that this also gives us a way to define default parameters:

typealias Result<T> = Result<T,Swift.Error>

Let's look at what the proper error messages would be in these cases, and the ideal way that the programmer could fix them.

If you tried to call func f(_: A) -> A on a variable of type P, I think the error message should have the gist of "P.A is unknown in this case. Variable myP must be cast to a type where P.A is known before f can be called".

If we were able to define either type aliases or variables where we stipulate A, then we are able to call f again:

var myP: P<A> = ...
let x = myP.f(3) //No problem here because we have defined what A is
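
The `P<A>` spelling above is hypothetical syntax. As a rough point of comparison, a generic constraint is one way Swift already lets you state what `A` is so that `f` becomes callable. In this sketch, `Doubler` and `applyF` are made-up names for illustration.

```swift
protocol P {
    associatedtype A
    func f(_ x: A) -> A
}

// `A` is inferred to be `Int` from the signature of `f`.
struct Doubler: P {
    func f(_ x: Int) -> Int { x * 2 }
}

// The `where` clause pins `A` to `Int`, so `f` can be called
// with an `Int` argument and is known to return an `Int`.
func applyF<T: P>(_ value: T, to x: Int) -> Int where T.A == Int {
    return value.f(x)
}
```

Here `applyF(Doubler(), to: 3)` type-checks because the constraint supplies the information that the bare `P` is missing.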

init() is different because we really need an exact type to know what to call, and the error message should say something like: "Unable to call init on P without knowing which instance type to create. Use init on a specific type conforming to P instead"

That said, in the fullness of time, I would really like to be able to define a factory init on P that is used for P(). This would have an explicit annotation, of course, specifying that it is a factory method/init...

4 Likes

I think all this talk of the existential having an API is fundamentally misleading. An existential doesn't have an API of its own—the only primitive operation is to open the existential and perform operations on the underlying value inside. I think that, once you internalize that, the consequences for what you can do with the existential can be reasoned about from there. It's not that certain members do or don't exist on the existential type, it's whether it's type-safe to invoke them. If you try to invoke a method with arguments of Self type, or of non-same-type-constrained associated types, by passing in unrelated existential values, then (if we diagnose it right) it's understandably a type error, because the dynamic types inside the existentials aren't known to match. That understanding also generalizes well to when we have more expressive existential-opening operations in the future that allow the dynamic types of existentials to be reasoned about beyond the scope of individual method invocations.
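
A sketch of that view, assuming a Swift 5.7 or later toolchain, where `any Animal` is accepted for a protocol with a `Self` requirement and an existential argument is opened implicitly when passed to a generic parameter. `Animal`, `Dog`, `matchesItself`, and `check` are hypothetical names.

```swift
protocol Animal {
    var species: String { get }
    func isSameSpecies(as other: Self) -> Bool
}

struct Dog: Animal {
    let species = "dog"
    func isSameSpecies(as other: Dog) -> Bool { species == other.species }
}

// Inside this generic function, `T` names the dynamic type of the
// argument, so a `Self`-typed parameter can be satisfied.
func matchesItself<T: Animal>(_ animal: T) -> Bool {
    return animal.isSameSpecies(as: animal)
}

func check(_ pet: any Animal) -> Bool {
    // Calling the member directly on the existential is rejected,
    // because the dynamic types of receiver and argument are not
    // known to match:
    // pet.isSameSpecies(as: pet)

    // Passing `pet` to a generic function opens the existential.
    return matchesItself(pet)
}
```

The member is not missing from `pet`; it only becomes type-safe to call once the underlying type has a name, here `T`.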

12 Likes

This. Maybe I don't understand the type-theory behind it well enough, but the whole idea of existentials being their own "thing" with their own protocol conformances (including conformance to the protocol they represent) is bizarre to me.

Throughout this discussion I've been thinking that the ultimate answer to all of these problems would be path-dependent types. A function takes a parameter of type "any P", and Self/associated types are relative to whatever type is ultimately boxed inside that existential, and you work with it on that level of abstraction.

Of course, that's not a trivial thing. The type inside of an existential may change, so we'd need to figure out a solution for that:

var c: Collection where Element == Int
c = [1,2,3]
let idx = c.startIndex // type: c.Index
c = Set([4,5,6])
c[idx] // uh-oh!

7 Likes

Yeah, it's my hope that that's eventually where we land with existentials. The handling of changing dynamic types can be seen as a flow-sensitive typing problem, similar to definite initialization—once you've formed a value that's dependent on the dynamic type of c, the type becomes fixed, so you wouldn't be allowed to reassign it to anything that wasn't of c.Self type after that point (or, alternatively, any types dependent on the original dynamic type could be invalid to use after you do so).
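
For comparison, the index mismatch cannot arise inside a generic function, because the collection type is fixed for the whole scope. This is only a sketch with a hypothetical helper named `firstElement`; it does not model the flow-sensitive rule described above.

```swift
// Within this function, `C` is a single fixed type, so `idx` (of type
// `C.Index`) is always valid for subscripting `c`.
func firstElement<C: Collection>(of c: C) -> Int? where C.Element == Int {
    if c.isEmpty { return nil }
    let idx = c.startIndex   // type: C.Index
    return c[idx]
}
```

The same function accepts an `Array<Int>` or a `Set<Int>`, but never mixes the index of one with the storage of the other.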

1 Like

I'm afraid your post misses my point. I am not at all suggesting that we should “keep a strong dividing line between PATs and other protocols.” I am saying that if we remove the line, we should deal with problems that are currently prevented or mitigated by having the line in place.

Also, while diagnostics are important, they are by no means the whole story. It should be possible to look at an API and its documentation and understand how to use it correctly, without having to use it wrong and get a slap from the compiler. There is already too much implicit in the way protocols are declared and documentation is generated for them. It should be easy to see which requirements a conforming type really needs to implement; likewise it should be easy to tell how to legally use the protocol's existential type.

Joe, I think you are falling victim here to your own extraordinary capacity for abstraction, in the same way that you personally don't need much of an algorithm library because you can open code most of them correctly from first principles. Yes, one can reason through “the consequences for what you can do with an existential” but to most of us, that isn't going to be immediately evident by looking at the protocol, and for those of us who can manage it, puzzling through type safety consequences to derive the usable API is going to be a drag on useful development activity.

Saying “it's not that certain members do or don't exist on the existential type, it's whether it's type-safe to invoke them” strikes me as a distinction without a difference. Would you prefer to say that certain members are “unavailable” on the existential? I think that's language I've used repeatedly in this thread. If you want to draw an analogy to other parts of the language, it's immediately obvious from looking at a declaration which members are available on a constant instance or rvalue. You can't call the ones marked mutating. The same should be true of protocol members and the members available on existential instances.
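
The `mutating` analogy in code, as a minimal sketch with a hypothetical `Counter` type: the declaration alone tells you which member is unavailable on a constant.

```swift
struct Counter {
    var value = 0

    // Marked `mutating`, so it is visibly unavailable on `let` constants.
    mutating func increment() { value += 1 }
}

func countTwice() -> Int {
    var counter = Counter()   // a `var`, so `increment()` is available
    counter.increment()
    counter.increment()
    return counter.value
}

// let fixed = Counter()
// fixed.increment()   // does not compile: `fixed` is a `let` constant
```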

3 Likes

As I mentioned previously, in the world of generalized existentials there is not just one existential type for a protocol. There will often be an unbounded number of existential types with different constraints. I can't imagine how you intend to declare clearly which members are available on all of these potential existential types.

Generalized existentials are an extremely powerful feature that includes a nontrivial amount of essential complexity. Learning how to work with them will necessarily involve some effort. I don't see any way around this.

I don't think this means we should not support the feature or be afraid that users won't be able to use it effectively (one thing I love about Swift is that it has mostly avoided that kind of paternalism). I think it means we need to put more effort than usual into educating people about this feature, how to use it well, and what pitfalls or traps they need to be aware of. This is especially necessary as generalized existentials are a relatively sophisticated feature for which I don't think there is a clear point of reference in the previous experience of most programmers.

4 Likes

I would be very alarmed if this and @Joe_Groff’s model were to be the desired eventual design of Swift existentials as exposed to end users. To say that something spelled as a type isn’t a type, has no members, or is in fact an innumerable multitude of types just doesn’t pass the smell test for usability. I would go so far as to argue that if such a feature cannot be exposed without such a complicated model for the type system then it probably doesn’t belong in a general-use programming language.

Perhaps you misunderstood what I intend here. What I intend to communicate is that Collection, Collection where Element == String, Collection where Index == Int, Collection where Element: Foo, etc are all different existential types derived from the Collection protocol.

If we don't allow this level of generality we are greatly selling ourselves short and unnecessarily restricting the expressivity of our type system.

4 Likes

I'm not sure where you get that from the discussion; who's saying "something spelled as a type isn't a type"? Each of the "innumerable multitude of types" Matthew alludes to would have a distinct spelling; I don't think anyone's suggesting we implicitly infer arbitrary bounds for existentials.

I fear that all this talk about future directions and complications is causing the discussion to spiral out of control. Let's get back to what's being proposed. I agree that existentials are confusing, and probably deserve to be deemphasized, but they exist today, we aren't going to get rid of them, and we aren't going to break source compatibility. The existing restrictions on existentials don't accurately reflect where the complications are, and don't prevent users from confronting the complexities when protocol extensions are involved; they don't really save anyone from staring into the abyss. Meanwhile, the existing restrictions threaten to impose more complexity on other parts of the language, and impede users from doing reasonable things they want to be able to do. That's why I think it's worthwhile to lift the restriction, no matter where the language may go in the future.

22 Likes