SE-0309: Unlock existential types for all protocols

anthonylatsis · April 26, 2021, 10:17pm

Not yet (we'll post a link).

Not sure there is practical value in going that far. Besides, exposing non-covariant Self as Never would definitely introduce a source compatibility impediment once we start considering path-dependent types. It is possible to deduce Self internally in the event of a single conformance, but this kind of inference may leave unintended type-erasure unnoticed until the next conformance (also, having a non-public protocol around for the sake of a single conformance is something you should most likely avoid in production code).

Paul_Cantrell:

However, if I understand correctly, the following code still won’t work even if this proposal is accepted:
let testShapes: [Shape] = […]
for shape in testShapes {
    assertDuplicateMatches(shape)
}
…because the compiler can’t determine a non-existential SpecificShapeType for the call to assertDuplicateMatches. Do I have that right?

Given that, I take it this implies the Shape existential does not actually conform to the Shape protocol? The proposal doesn’t say this explicitly, but it must be so. (Given that, what is the error message here? Seems like one that requires extra-special care.)

Given the above, I take it that this future direction in the proposal:

Make existential types "self-conforming" by automatically opening them when passed as generic arguments to functions. Generic instantiations could have them opened as opaque types.

…would make the code above work as written?

Correct. The error message should be the good ol' P does not conform to P (the self-conformance issue).

filip-sakel · April 26, 2021, 11:33pm

Library authors can already "open" existentials, which can be helpful in the absence of Self-conformances, albeit not being very elegant:

let testShapes: [Shape] = […]
for shape in testShapes {
    let unboxedShape = _openExistential(shape, do: { $0 })
 
    assertDuplicateMatches(unboxedShape)
}

Joe_Groff · April 27, 2021, 12:58am

You do not need to use _openExistential to open most existentials. An extension method on the protocol will suffice. _openExistential is only strictly necessary for existentials on whose constraints extensions are not allowed, such as Any and AnyObject.
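
A minimal sketch of the extension-method approach, assuming Swift 5.7 or later (so that a protocol with Self requirements can be used as an existential and spelled any Shape). The requirements of Shape, the Circle type, and the names duplicateMatches and checkDuplicateMatches are hypothetical, chosen only for illustration; the thread does not show the real definitions.

```swift
// Hypothetical protocol with Self requirements (an assumption, not from the thread).
protocol Shape {
    func duplicate() -> Self
    func matches(_ other: Self) -> Bool
}

struct Circle: Shape {
    var radius: Double
    func duplicate() -> Circle { Circle(radius: radius) }
    func matches(_ other: Circle) -> Bool { radius == other.radius }
}

// A generic function that needs a concrete conforming type S.
func duplicateMatches<S: Shape>(_ shape: S) -> Bool {
    shape.matches(shape.duplicate())
}

extension Shape {
    // Inside a protocol extension, Self is the dynamic conforming type,
    // so calling this method on an existential effectively "opens" it.
    func checkDuplicateMatches() -> Bool {
        duplicateMatches(self)
    }
}

let testShapes: [any Shape] = [Circle(radius: 1), Circle(radius: 2.5)]
let results = testShapes.map { $0.checkDuplicateMatches() }
```

The cost is the one raised later in the thread: each generic function you want to reach needs its own forwarding method in the extension.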

Paul_Cantrell · April 27, 2021, 1:22am

I wasn’t aware path-dependent types were on the table, even the far-future hypothetical table! It seemed from the responses to the perennial “optionals should be like Kotlin” question that Swift was steering clear of them for the foreseeable future. Interesting.

Yeah, the more I think this through, the more I realize there’s not a concern here. The situations where you can infer a single specific type without actually having type constraints that narrow to that type just…aren’t useful. Thanks for humoring me!

Thanks, good to know. Again, thanks for humoring me.

I do worry about the usability of all these type system features in practice (while still supporting them, to be clear). Messages like P does not conform to P need to make way for more approachable diagnostics, or Swift will become too hostile an environment for people who aren’t PL geeks. Wasn’t there a lovely project a while back that demonstrated type errors with specific examples? That would help immensely here. And as always, I wish for tooling that makes far more robust use of static type info than just type errors and autocompletions, some sort of visualization / contextual annotation / something that makes these kinds of problem apparent before the error message even shows up. That and a flying carpet.

Paul_Cantrell · April 27, 2021, 1:41am

Yes! I almost included similar code in my OP, but decided it muddied my question too much. At least it is possible, if awkward.

Is there an approach as general as _openExistential that doesn’t require writing one extension method per method you want to forward to the opened existential? If so, I’d be curious to see it!

anthonylatsis · April 27, 2021, 8:33pm

Your mention of Kotlin just made me realize that "path-dependent" might sound misleading to some. What we really mean is a notion similar to the type identity in opaque types rather than "smart casts":

Unlock Existential Types for All Protocols

protocol P {
  associatedtype B: Class
  var b: B { get }
  func takesB(_: B)
}

func takesP(p1: P, p2: P) {
  let b1 = p1.b
  let b2 = p2.b
  p1.takesB(b1) // ok
  p2.takesB(b1) // error: b1 may be of the wrong type
  p2.takesB(b2) // ok
  p1.takesB(b2) // error: b2 may be of the wrong type
}

Joe_Groff · April 27, 2021, 8:38pm

As @anthonylatsis says, one approach would be to implicitly open an immutable existential binding's dynamic type when evaluating it, giving the effect of path-dependent types. That would mean that you could invoke a method on an existential let binding that produces a value of one of its associated types, and then re-apply that result to another method on the same existential binding, because we know the dynamic associated type will still match the type expected by the existential.

Another more explicit syntax might be to allow you to specify a new type variable to bind to the dynamic type in addition to the value, like:

func takesP(p: P) {
  let <T: P> x: T = p // now T refers to the dynamic type of p, and x has type T
  let b = x.b
  x.takesB(b)
}
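
The let <T: P> spelling above is hypothetical syntax. As a rough sketch of how a similar effect can be had with an ordinary generic helper, assuming Swift 5.7 or later (where an existential argument is implicitly opened when passed to a generic parameter): the helper name roundTrip, the IntBox type, and the String return value of takesB are assumptions added so the example is self-contained and checkable.

```swift
protocol P {
    associatedtype B
    var b: B { get }
    func takesB(_ value: B) -> String
}

// Hypothetical conformance used only to exercise the example.
struct IntBox: P {
    var b: Int { 42 }
    func takesB(_ value: Int) -> String { "got \(value)" }
}

// Inside the generic function, T names the dynamic type of the argument,
// so x.b has type T.B and can be passed straight back to x.takesB.
func roundTrip<T: P>(_ x: T) -> String {
    let b = x.b
    return x.takesB(b)
}

func takesP(p: any P) -> String {
    // Assumes Swift 5.7+: p is opened implicitly and T binds to its dynamic type.
    roundTrip(p)
}

let message = takesP(p: IntBox())
```

The type variable here lives in the helper's signature rather than in a local binding, which is the main ergonomic difference from the syntax sketched above.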

Paul_Cantrell · April 27, 2021, 8:51pm

Ahhh. Yes, I’d completely misunderstood! So here, there are two types, “p1’s B” and “p2’s B,” which are implicit and may not even have names one can spell out explicitly in the code, but still exist as the single, unchanging static types of b1 and b2. Thus “path” meaning “of value propagation through expressions,” not “of control flow.” Thanks for the clarification.

1oo7 · April 28, 2021, 3:22pm

Thank you for this proposal!

Please let me address the question of explicit syntax for existentials. Your proposal states:

So far, existentials are the only built-in abstraction model (on par with generics and opaque types) that doesn't have its own idiosyncratic syntax; they are spelled as the bare protocol name or a composition thereof.

The Swift style guide says that protocols should have a name that makes it clearly a protocol—an adjective or gerundive describing the capability or behavior represented by the protocol requirement: Hashable, Comparable, etc. This is not compiler-enforced syntax, but there is seldom any confusion about what is a protocol and what isn't.

the syntax strongly suggests that the protocol as a type and the protocol as a constraint are one thing

Existing Swift gives constraints an explicit syntax; it's clear they are not the same thing as using a protocol as a type. Consider:

protocol Processable {}
protocol Usable {}
func process<T: Processable>(_ item: T) -> Usable

Processable is being used as a constraint, and Usable is not. It's not ambiguous.

If we change this to:

func process<T: Processable>(_ item: T) -> any Usable

That's more ambiguous, not less, because now it makes it sound like Usable is a constraint on any type that gets returned by the function, just like View is a constraint on some View that gets returned by a SwiftUI body. However, this function won't return just "any" concrete type that conforms to Usable; it will always return the same exact type: Usable (the existential).

in practice, they serve different purposes, and this manifests most confusingly in the "Protocol (the type) does not conform to Protocol (the constraint)" paradox.

I agree it's confusingly worded, but it's not a paradox. There are other ways to explain this, where it makes perfect sense.

This could be qualified as a missing feature in the language; the bottom line is that the syntax is tempting developers when it should be contributing to weighted decisions.

I don't think syntax "tempts" people. It's just syntax. We should avoid the temptation to use syntax to prescribe behaviors.

If we are seeing people avoiding using certain language features, or being confused by them, I think it means we could do a better job of documenting the language & educating Swift developers about how the compiler and runtime work. I'd note that @xwu and their team has done a really excellent job of this lately, and there are also a lot more resources available for learning the intricacies of Swift than there were a few years back.

Nonetheless, many Swift devs are still operating from assumptions we got from the "protocol-oriented programming" talk at WWDC five+ years ago, a talk which hardly anyone fully understood (at the time). Most of us walked away from it thinking "use structs and protocols more" but all the talk about static vs. dynamic dispatch sailed right over our heads.

I'm not saying the syntax can't be improved, but we should be careful about why we're doing it lest we make the problem actually worse and end up with a burdensome, overbearing language.

We should try to come up with a simpler metaphor to better express the concept of a "protocol-type object" and ultimately the dynamic vs. static dichotomy that is the elephant in the room.

Because using existential types is syntactically lightweight in comparison to using other abstractions, and similar to using a base class, the more possibilities they offer, the more users are vulnerable to unintended or inappropriate type erasure by following the path of initial least resistance.

I don't think this is why people are avoiding using generics and PATs. It has nothing to do with some code being easier to type.

The reason some people stick to using existentials is because when they tried to write code using PATs, they ran into compiler errors that were not straightforward to resolve, and then when they tried to use generics, they couldn't because you can't pass an existential into a generic parameter (unless it's an @objc existential without static requirements, a little-known exception to that rule although as of last summer it's in the official documentation).

By removing many of those errors, your proposal will unblock more people from using these features—no syntax change needed.

Now, if the decision to change the existing syntax in a breaking way is made, I hope we can do it in a way that helps clarify the language and make it easier to have conversations about it. I might suggest:

the new syntax should be a single word that could be used interchangeably with "existential" or "protocol-type object" in a sentence, while not sounding confusing
the new syntax could ideally make it easier for Swift devs to recognize when static vs. dynamic type information is relied on
the new syntax should make it unambiguous when reference semantics are likely incurred

Some examples I could think of:

var foo: Proto<Fooable>
var foo: Object<Fooable>
var foo: dynamic Fooable

I like Object<Fooable> because it makes it obvious that protocol-type objects will incur reference semantics just like any other object (yes there are some exceptions but they're basically just a compiler optimization when the object is tiny). It also makes it obvious that we're dealing with a concrete type that allows no further generic specialization, and from which no further static type information can be "opened" without dynamic casting or type erasure schemes.

The problem with "dynamic" is that it's too overloaded. Swift already has the ~~attribute~~ declaration modifier dynamic, which has an explicit definition as "dynamically dispatched using the Objective-C runtime". However @nonobjc existentials can still be dynamically dispatched, and the attributes @dynamicMemberLookup and @dynamicCallable don't seem to be directly related to the Obj. C runtime. So what does "dynamic" really mean? I find it confusing; clarifying this would be great but not sure if it's worth breaking syntax.

Hope this could be helpful to your proposal, and thanks again, really looking forward to this being in the language. Even if it does still have some limitations, I think it reduces the number of limitations we have to deal with and will encourage people to revisit their attitudes towards PATs. Thanks!

filip-sakel · April 28, 2021, 6:41pm

OK.

1oo7:

Existing Swift gives constraints an explicit syntax; it's clear they are not the same thing as using a protocol as a type. Consider:
protocol Processable {}
protocol Usable {}
func process<T: Processable>(_ item: T) -> Usable 
Processable is being used as a constraint, and Usable is not. It's not ambiguous.

Many Swift users don't know the difference between func f<T: P>(parameter: T) and func f(parameter: P) — I too was one of those people at some point. The confusion stems from the lack of an indication that parameter is bound to the existential type of P; any P should hopefully change that.

Do you mean that users could perhaps conflate () -> any Usable and <^T: Usable>() -> T — the latter syntax is borrowed from here?

To me, the any-some distinction, between existentials and generics respectively, makes sense. Namely, some Hashable indicates that a value of some (a singular, specific) Hashable-conforming type will be returned. In contrast, any Hashable indicates that any (an unspecified type, not always the same) Hashable-conforming type can be returned (and wrapped in the existential), meaning we can do this: (Bool.random() ? "" : 0) as any Hashable but can't do the same with some.

That was metaphorical, but unfortunate wording on our part, nonetheless.

I disagree. Syntactic friction can encourage certain styles and discourage others; for example, requiring that existentials be written out as ThisIsAnExistentialOfProtocol<Hashable> would result in many preferring <T: Hashable> T where possible, choosing the path of least resistance.

Documentation can help but it's not enough.

To be clear, we don't propose altering the existential syntax; let a: any Hashable will be invalid.

Swift tends to avoid abbreviations.

I think these would be poor name choices.

The only benefit I see to Object<Protocol> is the fact that it's a concrete type; however, that would be undermined if we enabled func f(_: Optional<some Hashable>). Nevertheless, I don't think naming should be based on implementation details (existentials don't have reference semantics: mutating one doesn't change another), not to mention that many users would think that Object is referring solely to classes. (If we use this syntax, then Array, which isn't backed by value types, should be called ArrayObject.)
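
A small sketch of the value-semantics point, assuming Swift 5.6 or later for the any spelling. Counter and Tally are hypothetical names introduced only for this illustration.

```swift
protocol Counter {
    var count: Int { get set }
}

struct Tally: Counter {
    var count = 0
}

// An existential wrapping a struct is copied on assignment,
// just like the struct itself would be.
let first: any Counter = Tally()
var second = first
second.count += 1

// first is unaffected by the mutation of second.
let firstCount = first.count
let secondCount = second.count
```

If the wrapped value were a class instance instead, both existentials would refer to the same object, so the reference behavior comes from the wrapped type rather than from the existential.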

JoeyKL · April 28, 2021, 6:45pm

I think having a particular syntax for existentials of PATs (such as any P) is a very good idea. Not being able to equate two instances of Equatable would be very confusing. Having the type name bare makes PATs look just like superclasses despite working completely differently.

Though, if we were going to introduce a new syntax for existentials, we think it'd be much less confusing if we took the potentially source-breaking path and did so uniformly, deprecating the existing syntax after a late-enough language version, than to have yet another attribute and two syntaxes where one only works some of the time. We also believe that drawing a tangible line between protocols that "do" and "do not" have limited access to their API is ill-advised due to the relative nature of this phenomenon.

I think this is a bad idea; we currently have two very different kinds of protocols, and I think most of the confusion is caused by the fact that they look the same. Instead of trying to make two different things work the same, we should just make the distinction clearer.

filip-sakel · April 28, 2021, 6:58pm

Although not currently proposed, I agree with you and others upthread that requiring, or at least offering, the any syntax for the newly unlocked existential types will help us transition away from the current, in my opinion, unclear syntax.

anthonylatsis · April 28, 2021, 7:32pm

1oo7:

Existing Swift gives constraints an explicit syntax; it's clear they are not the same thing as using a protocol as a type. Consider:
protocol Processable {}
protocol Usable {}
func process<T: Processable>(_ item: T) -> Usable 
Processable is being used as a constraint, and Usable is not. It's not ambiguous.

The primary motivation for reconsidering the syntax is that an existential is not a protocol. The problem isn't that people can't syntactically tell apart a conformance constraint and a protocol as a value type (clearly they can), but that Proto in T : Proto and Proto in let foo: Proto are different yet equally spelled things.

1oo7 · April 28, 2021, 10:09pm

The problem with "any" is that you can already return an existential using the some keyword, so it doesn't make any sense to have any keyword as opposed to the some keyword.

The following compiles and runs:

import Foundation
@objc protocol Zish {}

class Z1: Zish {}
class Z2: Zish {}

func makeZ1() -> some Zish { Z1() }
func makeZ2() -> some Zish { Z2() as Zish } // a "some" existential! 

let z1 = makeZ1()
assert(type(of: z1) == Z1.self) // passes
let z2 = makeZ2()
assert(type(of: z2) == Z2.self) // fails; type is "Zish" (the existential)

My point is, we should try to avoid baking into Swift any confusion between dynamically-known types and statically-known types. An existential is just a compiler-generated wrapper type, nothing more and nothing less. An existential is not an abstraction, it's a concretion:

some keyword means "a specific, statically known type that conforms to this protocol" but meanwhile
any keyword would also mean "a specific, statically known type that conforms to this protocol"... which at runtime happens to wrap an instance of a dynamically-known type that also conforms to that protocol.

I believe this would be incredibly confusing and self-contradictory, because how does the compiler know how many layers of this kind of wrapping there will be? I don't think we should bake into the language the illusion that something else is happening.

Please correct me if I'm wrong about how you're proposing to use this keyword, because if your proposal was also introducing the ability to statically "open" an existential and keep the type information through the process, then I might feel persuaded... but I can't see a way around the fact that an existential can be returned by a "some" function.

Unless I misread it, in your proposal the following assertion would fail and the function would behave identically to the makeZ2() that I have shown above:

func makeZ2() -> any Zish 
{ Z2() as Zish } 
let z2 = makeZ2()
assert(type(of: z2) == Z2.self) // this would still fail

If I'm wrong then I will stand corrected. And actually I would prefer to be wrong on this, so please let me know :D

If I'm not wrong, then I feel that adding this keyword makes it sound like we now have this static level of support but really we don't, which would (IMHO) irrevocably muddy the waters of dynamic/static at a moment when, what we really need, is to clarify the waters.

If we're going to change the syntax, let's do so in a way that makes it clearer to people what is actually happening from the same, static perspective by making the existential not auto-synthesized, and having Object<MyProtocol> or Proto<MyProtocol>. That way when we're in a code review session we can say, "Maybe you should use a proto here instead of a generic."

But under the proposed "any" what will you say in a code review session? "Maybe you should use an any here?" This will make communication about code ideas even more annoying than currently when we say "existential" and people think we're invoking Nietzsche.

struct and Int disagree.

Syntactic friction can encourage certain styles and discourage others; for example, requiring that existentials be written out as ThisIsAnExistentialOfProtocol<Hashable> would result in many preferring <T: Hashable> T where possible, choosing the path of least resistance.

Well, I respect that it's your opinion, but I think you would need to provide rigorous psychological studies on coding behaviors to convince us whether this would really have the behavioral effect you believe it would.

In the absence of empirical data, my opinion is that the language's syntax should not be based on a strategy of using negative emotions to motivate people, and we should avoid breaking changes to Swift code if at all possible.

1oo7 · April 28, 2021, 10:29pm

Not necessarily.

import Foundation

@objc protocol Proto {}
class MyProto: Proto {}

func process<P: Proto>(_ p: P) {
    print("process")
}

let P: Proto = MyProto()
process(P)

This compiles and runs fine. It also compiles fine if Proto is a class and we write let P: Proto = Proto().

I would argue that we already have syntax coloring to disambiguate generic parameters from declarations. We don't need more keywords.

I'm not going to disagree protocols are confusing as hell in Swift, but clarifying them to me would look more like this:

@objc protocol LimitedFlexibleFoo {
    var x: Int { get }
}

// guaranteed to never add static/init or self/associatedtype requirements
// in future updates (like a frozen enum, sorta)
@nonstatic @nongeneric protocol FlexibleFoo {
    var x: Int { get }
}

// guaranteed to never add self/associatedtype requirements
@nongeneric protocol RegularFoo {
    var x: Int { get }
    init()
}

// guaranteed to never add static/init requirements
@nonstatic protocol FlexiblePATFoo {
    associatedtype X
    var x: X { get }
}

// all bets are off
protocol InflexiblePATFoo {
    associatedtype X
    var x: X { get }
    init()
}

Maybe this could be simplified by adding a @frozen attribute that guarantees that whatever the protocol requires will never change, but I would prefer more of an explicit version. Nice thing is, this is a purely additive non-source-breaking change because if you don't opt-in, your protocols get treated with the same prejudice as always.

anthonylatsis · April 28, 2021, 10:41pm

1oo7:

Not necessarily.

import Foundation

@objc protocol Proto {}
class MyProto: Proto {}

func process<P: Proto>(_ p: P) {
    print("process")
}

let P: Proto = MyProto()
process(P)

This compiles and runs fine.

Here, Proto in P: Proto is still a protocol, and Proto in let P: Proto is still an existential (that happens to conform to the protocol). So-called "self-conformance" is orthogonal to the difference between a protocol and an existential.

David_Catmull · April 29, 2021, 3:03pm

I think there is a key difference, though. With some P, you are required to return the same type in all cases, it's just obscured from the outside. With a regular existential, the actual type of the returned object can vary. I think the words "any" and "some" convey that difference, too. "Any" is arbitrary; "some" is more specific.

sighoya · April 29, 2021, 4:43pm

That's not entirely true:

func make<T>(t:T) -> some Any { t } 


let i = make(t:1)
print(i is Int) //prints true

let j = make(t:1.0)
print(j is Float64) //prints true

Just to say: for me, a distinction between dynamic Protocol and static Protocol would make more sense at a technical level, although the compiler doesn't infer everything that is statically inferable.
But we already have some Protocol, so any Protocol would make it clearer that the two are duals.
Further, some clearly indicates that the exact type shouldn't matter or be visible to the user (with some exceptions, of course), while static Protocol would technically allow for this.
Moreover, some Protocol seems to fix the type behind the protocol, i.e. you can't exchange the type even when the new type is statically known, whereas I think a static Protocol shouldn't forbid exchanging types statically.

David_Catmull · April 29, 2021, 4:50pm

It still holds. make<Int>() always returns an Int, and make<Float>() always returns a Float.

suyashsrijan · April 29, 2021, 5:28pm

I think this is a little different since it's returning a value of the same type every time you call it (i.e. make(t: 1) -> some Any always returns an Int). If there was conditional logic, for example to dynamically return a different type, you'd see that it is not allowed. For example, if you had code like this:

protocol P {}
struct S: P {}
struct T: P {}

// error: function declares an opaque return type, but the return statements in its body do not have matching underlying types
func someP() -> some P {
  if Bool.random() {
    return S() // note: return statement has underlying type 'S'
  } else {
    return T() // note: return statement has underlying type 'T'
  }
}

If the method was returning an existential, the same logic would compile fine:

protocol P {}
struct S: P {}
struct T: P {}

func anyP() -> /*any*/ P {
  if Bool.random() {
    return S() // ok
  } else {
    return T() // ok
  }
}

let p: P = anyP() // could be S or T at runtime

TSPL talks about the differences in more detail under the Differences Between Opaque Types and Protocol Types section:

Returning an opaque type looks very similar to using a protocol type as the return type of a function, but these two kinds of return type differ in whether they preserve type identity. An opaque type refers to one specific type, although the caller of the function isn’t able to see which type; a protocol type can refer to any type that conforms to the protocol. Generally speaking, protocol types give you more flexibility about the underlying types of the values they store, and opaque types let you make stronger guarantees about those underlying types.
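
One way to see the "preserve type identity" point in practice, sketched under the assumption of Swift 5.7 or later (needed to write any Equatable at all). The function names opaqueValue and boxedValue are hypothetical.

```swift
// Opaque return type: every call yields the same (hidden) concrete type.
func opaqueValue() -> some Equatable { 42 as Int }

// Existential return type: the underlying type is erased and may vary.
func boxedValue() -> any Equatable { 42 as Int }

// Compiles: both operands are known to share one underlying type,
// so Equatable's == can be applied directly.
let sameOpaque = opaqueValue() == opaqueValue()

// boxedValue() == boxedValue() is rejected by the compiler, because the
// two boxes are not known to hold the same type. A dynamic cast recovers
// a concrete type before comparing.
let sameBoxed = (boxedValue() as? Int) == (boxedValue() as? Int)
```

The cast in the last line is only one possible workaround; it gives up the static guarantee that the opaque version keeps.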