SE-0309: Unlock existential types for all protocols

I find myself somewhat confused while reading the discussions about the variances in Swift's types. Is there a general guide/rule somewhere that I can read up on what positions are considered by the type constructor as covariant, invariant, and contravariant?

1 Like

These four should serve the purpose:

  • Tuple types are covariant in their element types.
  • Generic types are invariant in their generic parameters except for a select few Standard Library types (Optional, Array, Set and Dictionary), which are covariant.
  • inout types are invariant in their underlying type.
  • Function types are contravariant in their parameter types and covariant in their result type.

Unlike with conversions, note that the part about Optional, Array, Set and Dictionary being covariant does not fully apply in our case (the covariant relations are listed at the end of the Proposed Solution).
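
For anyone who wants to see the generic-parameter rule in terms of ordinary conversions, here is a minimal sketch. The Animal, Dog and Box names are made up for illustration; only the Standard Library behaviour comes from the list above.

```swift
// Hypothetical class hierarchy used only for illustration.
class Animal {}
final class Dog: Animal {}

// Hypothetical user-defined generic type. Like any generic type outside the
// special-cased Standard Library ones, it is invariant in T.
struct Box<T> {
    var value: T
}

let dogs: [Dog] = [Dog(), Dog()]
let animals: [Animal] = dogs                      // Array: covariant in Element

let maybeDog: Dog? = Dog()
let maybeAnimal: Animal? = maybeDog               // Optional: covariant in Wrapped

let dogsByName: [String: Dog] = ["rex": Dog()]
let animalsByName: [String: Animal] = dogsByName  // Dictionary: covariant in Value

let dogBox = Box(value: Dog())
// let animalBox: Box<Animal> = dogBox            // does not compile: Box is invariant in T
let animalBox = Box<Animal>(value: dogBox.value)  // an explicit re-wrap is needed instead
```

The commented-out line is the interesting one: the compiler has no built-in conversion from Box<Dog> to Box<Animal>, so the value has to be unwrapped and wrapped again by hand.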

5 Likes

Was the omission of Set from the corresponding proposal text deliberate?

Yes, covariant type erasure in generic parameter position can cause us to run into the self-conformance problem when there is a conformance requirement.

2 Likes

Thanks!

Is there an explanation or rationale behind these 4 subtyping rules as well? Sorry if I'm asking too much. I want to understand the rules instead of just memorising them.

1 Like

There's a nice article by Mike Ash that talks about the basics - mikeash.com: Friday Q&A 2015-11-20: Covariance and Contravariance and there are some forum posts about it w.r.t. generics here and here.

6 Likes

Subtyping rules are generally determined via the Liskov substitution principle, and then variance comes in for compound types to describe how subtyping in a component can relate to subtyping in the compound.

Following the substitution principle, for a function type f2 to be a subtype of a function type f1, the parameter types of f2 must be supertypes of those of f1 and the result type of f2 must be a subtype of that of f1. Tuple types are essentially fixed-length lists of element types, and we can prove by induction that subtyping relations in components are propagated out to the tuple. A similar reasoning can be applied to the few Standard Library collections that are covariant by means of hardcoded built-in conversions. Other generic types are invariant in their generic parameters because the compiler does not know how to convert from one specialization to another in the general case. Regarding inout types, I think the reason they are invariant is that deep down, they resemble a generic pointer type like UnsafePointer, but I can't speak to whether UnsafePointer itself is deliberately kept invariant.
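
The function-type rule can be tried out directly. This is only a sketch with invented names (Fruit, Apple, ripen); the function ripen plays the role of f2 and the constant f1 has the function type being substituted into.

```swift
// Hypothetical class hierarchy used only for illustration.
class Fruit {
    var name: String { "fruit" }
}
final class Apple: Fruit {
    override var name: String { "apple" }
}

// Plays the role of f2: accepts a supertype (Fruit) and returns a subtype (Apple).
func ripen(_ fruit: Fruit) -> Apple {
    return Apple()
}

// Plays the role of f1: the function type (Apple) -> Fruit.
// The assignment compiles because any Apple a caller passes in is also a Fruit,
// and the Apple that comes back is also a Fruit.
let f1: (Apple) -> Fruit = ripen

let result = f1(Apple())
print(result.name)  // prints "apple"
```

Going the other way, assigning an (Apple) -> Fruit function to a variable of type (Fruit) -> Apple, is rejected, since a caller could then pass a Fruit that is not an Apple.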

4 Likes

+1 to this, and hooray! Long awaited, much needed. Heterogeneous [P] collections remain a vexing wall inside the maze of Swift, and this would ease some of the pain. So pleased to see this line of work proceeding at last.

Is there a snapshot build I can use to experiment with this? (Apologies if I missed the link.)

Two questions…er, well, question clusters:


First, is there any value in exposing non-covariant Self as Never rather than making the method disappear altogether? For example, the Equatable existential type would have:

static func == (lhs: Never, rhs: Never) -> Bool

I can imagine one might want to deal generically with method references — reflection, say, or generating diagnostics — without actually needing to call the method. I admit that I don’t have a specific compelling use case for this, and the harm it would do to the already-confusing diagnostic messages would be substantial. Still, worth asking.

I can also imagine that there might be situations where the compiler might be able to determine a better lower bound than Never from generic constraints etc. Does such a situation exist? (For example, the compiler might be able to deduce a less-restrictive bound for a private protocol, where all possible implementing types are known at compile time.)


The second question really just amounts to, “Do I understand the proposal correctly? If so, great!”

Tim Ekl’s nice post on this proposal got me thinking about the limits of this proposal. He considers this example:

protocol Shape {
    func duplicate() -> Self
    func matches(_ other: Self) -> Bool
}

Even with this proposal, the following doesn’t work:

func assertDuplicateMatches(s: Shape) {
    let t = s.duplicate() // t could be any Shape, not nec same as s
    assert(t.matches(s))  // ❌
    assert(s.matches(t))  // ❌
}

…which makes sense, because we need to guarantee to the type checker that s and t actually have the same type:

func assertDuplicateMatches<SpecificShapeType: Shape>(s: SpecificShapeType) {
    let t = s.duplicate() // t also has type SpecificShapeType
    assert(t.matches(s))  // ✅
    assert(s.matches(t))  // ✅
}
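
To try the generic version against a concrete type, something like the following should do. Circle is a hypothetical conforming type, and the helper returns a Bool instead of asserting so the result can be inspected.

```swift
protocol Shape {
    func duplicate() -> Self
    func matches(_ other: Self) -> Bool
}

// Hypothetical conforming type, not part of the original example.
struct Circle: Shape {
    var radius: Double

    func duplicate() -> Circle { self }
    func matches(_ other: Circle) -> Bool { radius == other.radius }
}

// Same shape as the generic assertDuplicateMatches, but returning the result.
func duplicateMatches<SpecificShapeType: Shape>(_ s: SpecificShapeType) -> Bool {
    let t = s.duplicate()  // t has type SpecificShapeType, the same as s
    return t.matches(s) && s.matches(t)
}

let ok = duplicateMatches(Circle(radius: 2))
```

Because SpecificShapeType is fixed for the whole call, both matches calls type-check without any casting.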

(Aside: I like the spelling any Shape better every time I see it in context. It would make the problem with the first version of assertDuplicateMatches much easier to see.)

However, if I understand correctly, the following code still won’t work even if this proposal is accepted:

let testShapes: [Shape] = […]
for shape in testShapes {
    assertDuplicateMatches(s: shape)
}

…because the compiler can’t determine a non-existential SpecificShapeType for the call to assertDuplicateMatches. Do I have that right?

Given that, I take it this implies the Shape existential does not actually conform to the Shape protocol? The proposal doesn’t say this explicitly, but it must be so. (Given that, what is the error message here? Seems like one that requires extra-special care.)

Given the above, I take it that this future direction in the proposal:

Make existential types "self-conforming" by automatically opening them when passed as generic arguments to functions. Generic instantiations could have them opened as opaque types.

…would make the code above work as written?

1 Like

Not yet (we'll post a link).

Not sure there is practical value in going that far. Besides, exposing non-covariant Self as Never would definitely introduce a source compatibility impediment once we start considering path-dependent types. It is possible to deduce Self internally in the event of a single conformance, but this kind of inference may leave unintended type-erasure unnoticed until the next conformance (also, having a non-public protocol around for the sake of a single conformance is something you should most likely avoid in production code).

Correct. The error message should be the good ol' P does not conform to P (the self-conformance issue).

1 Like

Library authors can already "open" existentials, which can be helpful in the absence of Self-conformances, albeit not being very elegant:

let testShapes: [Shape] = […]
for shape in testShapes {
    // Pass the generic function as the body; the opened type cannot escape the closure.
    _openExistential(shape, do: assertDuplicateMatches)
}

You do not need to use _openExistential to open most existentials. An extension method on the protocol will suffice. _openExistential is only strictly necessary for existentials on whose constraints extensions are not allowed, such as Any and AnyObject.
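
A minimal sketch of the extension-method approach, reusing the Shape protocol from earlier in the thread. Circle, Square and the duplicateMatches helper are hypothetical, and the sketch assumes a Swift 5.7 or later toolchain, where the existential of a protocol with Self requirements is available and spelled any Shape.

```swift
protocol Shape {
    func duplicate() -> Self
    func matches(_ other: Self) -> Bool
}

extension Shape {
    // Hypothetical helper. Inside the extension, Self is the concrete
    // conforming type, so duplicate() and matches(_:) line up.
    func duplicateMatches() -> Bool {
        let copy = duplicate()
        return copy.matches(self) && matches(copy)
    }
}

// Hypothetical conforming types.
struct Circle: Shape {
    var radius: Double
    func duplicate() -> Circle { self }
    func matches(_ other: Circle) -> Bool { radius == other.radius }
}

struct Square: Shape {
    var side: Double
    func duplicate() -> Square { self }
    func matches(_ other: Square) -> Bool { side == other.side }
}

// Assumes Swift 5.7+ for the `any Shape` spelling.
let testShapes: [any Shape] = [Circle(radius: 1), Square(side: 2)]

// Calling the extension method on each existential opens it implicitly.
let results = testShapes.map { $0.duplicateMatches() }
```

The helper is callable on the existential because its own signature does not mention Self; the Self-dependent calls all happen inside the extension.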

I wasn’t aware path-dependent types were on the table, even the far-future hypothetical table! It seemed from the responses to the perennial “optionals should be like Kotlin” question that Swift was steering clear of them for the foreseeable future. Interesting.

Yeah, the more I think this through, the more I realize there’s not a concern here. The situations where you can infer a single specific type without actually having type constraints that narrow to that type just…aren’t useful. Thanks for humoring me!

Thanks, good to know. Again, thanks for humoring me.

I do worry about the usability of all these type system features in practice (while still supporting them, to be clear). Messages like P does not conform to P need to make way for more approachable diagnostics, or Swift will become too hostile an environment for people who aren’t PL geeks. Wasn’t there a lovely project a while back that demonstrated type errors with specific examples? That would help immensely here. And as always, I wish for tooling that makes far more robust use of static type info than just type errors and autocompletions, some sort of visualization / contextual annotation / something that makes these kinds of problem apparent before the error message even shows up. That and a flying carpet.

Yes! I almost included similar code in my OP, but decided it muddied my question too much. At least it is possible, if awkward.

Is there an approach as general as _openExistential that doesn’t require writing one extension method per method you want to forward to the opened existential? If so, I’d be curious to see it!

Your mention of Kotlin just made me realize that "path-dependent" might sound misleading to some. What we really mean is a notion similar to the type identity in opaque types rather than "smart casts":
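
One way to picture that notion of identity with what the language already has is a pair of opaque result types. This is only a sketch: P's Equatable constraint, IntBox, makeP1 and makeP2 are all assumptions made up for the illustration, not something from the proposal.

```swift
// Hypothetical protocol; the Equatable constraint is only here so that
// values of the associated type can be compared.
protocol P {
    associatedtype B: Equatable
    var b: B { get }
}

// Hypothetical conforming type.
struct IntBox: P {
    var b: Int
}

// Two functions with opaque result types that happen to share an underlying type.
func makeP1() -> some P { IntBox(b: 1) }
func makeP2() -> some P { IntBox(b: 1) }

func identityDemo() -> Bool {
    let p1 = makeP1()
    let p2 = makeP2()

    let b1 = p1.b  // static type: the B of makeP1's result
    let b2 = p2.b  // static type: the B of makeP2's result
    _ = b2

    // b1 == b2    // does not compile: two distinct types, although both are Int underneath
    return b1 == p1.b  // fine: both sides are the B of makeP1's result
}
```

The types of b1 and b2 are tied to where the values came from, not to any runtime check, which is the sense of "path" meant here.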

3 Likes

As @anthonylatsis noted, one approach would be to implicitly open an immutable existential binding's dynamic type when evaluating it, giving the effect of path-dependent types. That would mean that you could invoke a method on an existential let binding that produces a value of one of its associated types, and then re-apply that result to another method on the same existential binding, because we know the dynamic associated type will still match the type expected by the existential.
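
For comparison, the same "produce a value, then feed it back" pattern is already expressible today when the dynamic type is a generic parameter rather than an existential. A small sketch, where the members of P, the Named type and the String result are assumptions made up so the result can be observed:

```swift
// Hypothetical protocol with an associated type.
protocol P {
    associatedtype B
    var b: B { get }
    func takesB(_ b: B) -> String
}

// Hypothetical conforming type.
struct Named: P {
    var b: String
    func takesB(_ b: String) -> String { "got \(b)" }
}

// With a generic parameter, T.B is a single static type for the whole body,
// so a value read from x.b can be handed straight back to x.takesB.
func roundTrip<T: P>(_ x: T) -> String {
    let b = x.b  // b has type T.B
    return x.takesB(b)
}

let echoed = roundTrip(Named(b: "hi"))
```

Implicit opening would give an existential let binding the same guarantee without the caller having to introduce T.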

Another more explicit syntax might be to allow you to specify a new type variable to bind to the dynamic type in addition to the value, like:

func takesP(p: P) {
  let <T: P> x: T = p // now T refers to the dynamic type of p, and x has type T
  let b = x.b
  x.takesB(b)
}
3 Likes

Ahhh. Yes, I’d completely misunderstood! So here, there are two types, “p1’s B” and “p2’s B,” which are implicit and may not even have names one can spell out explicitly in the code, but still exist as the single, unchanging static types of b1 and b2. Thus “path” meaning “of value propagation through expressions,” not “of control flow.” Thanks for the clarification.

3 Likes

Thank you for this proposal!

Please let me address the question of explicit syntax for existentials. Your proposal states:

So far, existentials are the only built-in abstraction model (on par with generics and opaque types) that doesn't have its own idiosyncratic syntax; they are spelled as the bare protocol name or a composition thereof.

The Swift style guide says that protocols should have a name that makes it clearly a protocol—an adjective or gerundive describing the capability or behavior represented by the protocol requirement: Hashable, Comparable, etc. This is not compiler-enforced syntax, but there is seldom any confusion about what is a protocol and what isn't.

the syntax strongly suggests that the protocol as a type and the protocol as a constraint are one thing

Existing Swift gives constraints an explicit syntax; it's clear they are not the same thing as using a protocol as a type. Consider:

protocol Processable {}
protocol Usable {}
func process<T: Processable>(_ item: T) -> Usable 

Processable is being used as a constraint, and Usable is not. It's not ambiguous.

If we change this to:

func process<T: Processable>(_ item: T) -> any Usable 

That's more ambiguous, not less, because now it makes it sound like Usable is a constraint on any type that gets returned by the function, just like View is a constraint on some View that gets returned by a SwiftUI body. However, this function won't return just "any" concrete type that conforms to Usable, it will always return the same exact type: Usable (the existential).

in practice, they serve different purposes, and this manifests most confusingly in the "Protocol (the type) does not conform to Protocol (the constraint)" paradox.

I agree it's confusingly worded, but it's not a paradox. There are other ways to explain this, where it makes perfect sense.

This could be qualified as a missing feature in the language; the bottom line is that the syntax is tempting developers when it should be contributing to weighted decisions.

I don't think syntax "tempts" people. It's just syntax. We should avoid the temptation to use syntax to prescribe behaviors.

If we are seeing people avoiding using certain language features, or being confused by them, I think it means we could do a better job of documenting the language & educating Swift developers about how the compiler and runtime work. I'd note that @xwu and their team have done a really excellent job of this lately, and there are also a lot more resources available for learning the intricacies of Swift than there were a few years back.

Nonetheless, many Swift devs are still operating from assumptions we got from the "protocol-oriented programming" talk at WWDC five+ years ago, a talk which hardly anyone fully understood (at the time). Most of us walked away from it thinking "use structs and protocols more" but all the talk about static vs. dynamic dispatch sailed right over our heads.

I'm not saying the syntax can't be improved, but we should be careful about why we're doing it lest we make the problem actually worse and end up with a burdensome, overbearing language.

We should try to come up with a simpler metaphor to better express the concept of a "protocol-type object" and ultimately the dynamic vs. static dichotomy that is the elephant in the room.

Because using existential types is syntactically lightweight in comparison to using other abstractions, and similar to using a base class, the more possibilities they offer, the more users are vulnerable to unintended or inappropriate type erasure by following the path of initial least resistance.

I don't think this is why people are avoiding using generics and PATs. It has nothing to do with some code being easier to type.

The reason some people stick to using existentials is because when they tried to write code using PATs, they ran into compiler errors that were not straightforward to resolve, and then when they tried to use generics, they couldn't because you can't pass an existential into a generic parameter (unless it's an @objc existential without static requirements, a little-known exception to that rule although as of last summer it's in the official documentation).

By removing many of those errors, your proposal will unblock more people from using these features—no syntax change needed.

Now, if the decision to change the existing syntax in a breaking way is made, I hope we can do it in a way that helps clarify the language and make it easier to have conversations about it. I might suggest:

  • the new syntax should be a single word that could be used interchangeably with "existential" or "protocol-type object" in a sentence, while not sounding confusing
  • the new syntax could ideally make it easier for Swift devs to recognize when static vs. dynamic type information is relied on
  • the new syntax should make it unambiguous when reference semantics are likely incurred

Some examples I could think of:

  • var foo: Proto<Fooable>
  • var foo: Object<Fooable>
  • var foo: dynamic Fooable

I like Object<Fooable> because it makes it obvious that protocol-type objects will incur reference semantics just like any other object (yes there are some exceptions but they're basically just a compiler optimization when the object is tiny). It also makes it obvious that we're dealing with a concrete type that allows no further generic specialization, and from which no further static type information can be "opened" without dynamic casting or type erasure schemes.

The problem with "dynamic" is that it's too overloaded. Swift already has the declaration modifier dynamic, which has an explicit definition as "dynamically dispatched using the Objective-C runtime". However, @nonobjc existentials can still be dynamically dispatched, and the attributes @dynamicMemberLookup and @dynamicCallable don't seem to be directly related to the Objective-C runtime. So what does "dynamic" really mean? I find it confusing; clarifying this would be great, but I'm not sure if it's worth breaking syntax.

Hope this could be helpful to your proposal, and thanks again, really looking forward to this being in the language. Even if it does still have some limitations, I think it reduces the number of limitations we have to deal with and will encourage people to revisit their attitudes towards PATs. Thanks!

2 Likes

OK.

Many Swift users don't know the difference between func f<T: P>(parameter: T) and func f(parameter: P) — I too was one of those people at some point. The confusion stems from the lack of an indication that parameter is bound to the existential type of P; any P should hopefully change that.
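
To make the difference concrete, here is a small sketch with made-up names (Greeter, English, French). The protocol has no Self or associated type requirements, so its existential can be spelled with the bare protocol name.

```swift
// Hypothetical protocol and conforming types.
protocol Greeter {
    func greet() -> String
}

struct English: Greeter {
    func greet() -> String { "Hello" }
}

struct French: Greeter {
    func greet() -> String { "Bonjour" }
}

// Generic: T is bound to exactly one concrete type per call,
// so every element of the array has the same static type.
func greetAllGeneric<T: Greeter>(_ greeters: [T]) -> [String] {
    greeters.map { $0.greet() }
}

// Existential: each element is boxed individually,
// so the array may mix different conforming types.
func greetAllExistential(_ greeters: [Greeter]) -> [String] {
    greeters.map { $0.greet() }
}

let homogeneous = greetAllGeneric([English(), English()])
let mixed = greetAllExistential([English(), French()])
// greetAllGeneric([English(), French()])  // does not compile: no single T fits both elements
```

The two signatures look almost identical at the call site, which is exactly where the confusion tends to come from.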

Do you mean that users could perhaps conflate () -> any Usable and <^T: Usable>() -> T — the latter syntax is borrowed from here?

To me, the any-some distinction, between existentials and generics respectively, makes sense. Namely, some Hashable indicates that a value of some (a singular, specific) Hashable-conforming type will be returned. On the contrary, any Hashable indicates that any (an unspecified type, not always the same) Hashable-conforming type can be returned (and wrapped in the existential), meaning we can do this: (Bool.random() ? "" : 0) as any Hashable but can't do the same with some.
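
A sketch of that distinction as two functions. The function names are hypothetical, and since the any spelling is not part of this proposal, the sketch assumes a later toolchain (Swift 5.6 or newer) that accepts it.

```swift
// `any Hashable`: the box may wrap a different concrete type on each call.
// Assumes Swift 5.6+ for the `any` spelling.
func pickAny(_ flag: Bool) -> any Hashable {
    if flag {
        return "text"  // a String goes into the box
    } else {
        return 0       // an Int goes into the box
    }
}

// `some Hashable`: the implementation commits to one concrete type (Int here),
// which callers cannot name but which never varies between calls.
func pickSome() -> some Hashable {
    0
}

// The same two-branch body with a `some Hashable` result type does not compile,
// because the branches return different underlying types.

let first = pickAny(true)
let second = pickAny(false)
let fixed = pickSome()
```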

That was metaphorical, but unfortunate wording on our part, nonetheless.

I disagree. Syntactic friction can encourage certain styles and discourage others; for example, requiring that existentials be written out as ThisIsAnExistentialOfProtocol<Hashable> would result in many preferring <T: Hashable> T where possible, choosing the path of least resistance.

Documentation can help but it's not enough.

To be clear, we don't propose altering the existential syntax; let a: any Hashable will be invalid.

Swift tends to avoid abbreviations.

I think these would be poor name choices.

The only benefit I see to Object<Protocol> is the fact that it's a concrete type; however, that would be undermined if we enabled func f(_: Optional<some Hashable>). Nevertheless, I don't think naming should be based on implementation details (existentials don't have reference semantics: mutating one doesn't change another), not to mention that many users would think that Object is referring solely to classes. (If we used this syntax, then Array, which isn't backed by value types, should be called ArrayObject.)
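
The point about reference semantics is easy to check. In this sketch, Counter and Tally are hypothetical names; the existential wraps a struct, and copying the existential copies the value inside it.

```swift
// Hypothetical protocol with a mutable requirement.
protocol Counter {
    var count: Int { get set }
}

// Hypothetical conforming value type.
struct Tally: Counter {
    var count = 0
}

let first: Counter = Tally()
var second = first  // copies the boxed value, not a reference to it
second.count += 1   // mutates only the copy
```

After the mutation, first.count is still 0 while second.count is 1. If the wrapped type were a class instead of a struct, both existentials would refer to the same instance, so the behaviour follows the wrapped type rather than the box.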

2 Likes

I think having a particular syntax for existentials of PATs (such as any P) is a very good idea. Not being able to equate two instances of Equatable would be very confusing. Having the type name bare makes PATs look just like superclasses despite working completely differently.

Though, if we were going to introduce a new syntax for existentials, we think it'd be much less confusing if we took the potentially source-breaking path and did so uniformly, deprecating the existing syntax after a late-enough language version, than to have yet another attribute and two syntaxes where one only works some of the time. We also believe that drawing a tangible line between protocols that "do" and "do not" have limited access to their API is ill-advised due to the relative nature of this phenomenon.

I think this is a bad idea; we currently have two very different kinds of protocols, and I think most of the confusion is caused by the fact that they look the same. Instead of trying to make two different things work the same, we should just make the distinction clearer.

Although not currently proposed, I agree with you and others upthread that requiring, or at least offering, the any syntax for the newly unlocked existential types will help us transition away from the current, in my opinion, unclear syntax.

2 Likes