Introduce Any<P> as a better way of writing existentials

Maybe we could use something like:

var anyAnimal: existential<Animal>

I don't think we could use "Any" because using an identifier for both a non-generic type and a generic type confuses the parser. I wouldn't object to something shorter than "existential". I want the brackets so that more complex expressions for the protocol name won't spew punctuator soup, unconstrained, into the rest of the source file. Besides protocol compositions, we could use this to allow existentials for protocols with associated types:

var literal: existential<ExpressibleByUnicodeScalarLiteral where UnicodeScalarLiteralType == Character>

(Note: all of the associated types have to be locked down, and there can't be any Self requirements.)

Could you explain how a different spelling, be it any Animal or Any<Animal> instead of Animal resolves this issue? I think the problem is fundamentally "this error is possibly stemming from a conceptual gap, and the concept is a difficult one to grasp, which makes it almost impossible to explain the problem in the limited confines of a diagnostic message". Also, this diagnostic was recently updated on master to be:

error: protocol 'Animal' as a type cannot conform to the protocol itself; only concrete types such as structs, enums and classes can conform to protocols

Not saying that this is perfect, but it does seem like an improvement.

We also have an educational note explaining this in more detail.

I think the problem is that many users don’t currently think of Animal as an existential, they think of it as a supertype. And why shouldn’t they? There’s a clear analogy between

protocol Animal { func noise() }
class Dog: Animal { func noise() { print("Woof") } }

and

class Animal { func noise() {} }
class Dog: Animal { override func noise() { print("Woof") } }

Overall I think this is a strength, not a weakness. Protocols look like supertypes and usually act like supertypes, so most of the time they should just be supertypes.

In the cases where they don’t act like supertypes, really they aren’t conceptually types at all, and a new syntax (Any<P>) would make this clearer.
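To make the analogy and its limits concrete, here is a minimal sketch. It is my own adaptation, not code from the posts above: noise() returns a String instead of printing, and Cat and loudest are hypothetical names added for illustration. It uses the bare protocol spelling that the thread is discussing.

```swift
// Adapted from the Animal example above; returning a String is an
// assumption made so the results can be inspected.
protocol Animal { func noise() -> String }
struct Dog: Animal { func noise() -> String { "Woof" } }
struct Cat: Animal { func noise() -> String { "Meow" } }

// Supertype-like use: `Animal` here is the existential type, a box
// that can hold a value of any conforming type.
let animals: [Animal] = [Dog(), Cat()]
let noises = animals.map { $0.noise() }

// Constraint use: `T` stands for one specific conforming type,
// chosen by the caller at each call site.
func loudest<T: Animal>(_ animal: T) -> String {
    animal.noise().uppercased()
}
let shout = loudest(Dog())
```

The same name, Animal, plays two different roles here, which is the ambiguity a distinct spelling is meant to surface.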

I like the proposal, as I think that there’s great value in differentiating the existential from the protocol in how it’s spelled. I prefer any though.
Also, though it’s a separate proposal imho, existentials that can conform to their protocols should do so automatically (and the error message when they don’t should specify what those restrictions are).

I'm overall in favor of segregating protocols from their existential types as proposed above. However I think some alterations would be integral. Thus, I am going to propose some more changes, which - to be clear - I do think are very hard to implement in the real world.

So what do I propose?

I think the proposed way to refer to a protocol's existential type (by parameterizing Any) is quite good. The syntax is clear and easy to use, yet not too lightweight, so people won't mindlessly resort to Existentials in cases where Generics would be better. However, this syntax also poses a lot of problems and has drawn criticism over its odd parameterization behavior:

let foo: Array<Int> 
    // Generics Syntax

let bar: Any<Error> 
    // Existentials Syntax

let baz: Any 
    // What happened here?
    // `Array<Element = Int>`
    // is currently invalid,
    // so what's this

So to tackle this problem, I propose that we add a supertype "Value" for what is currently "Any", thus improving the parameterization behavior of the proposal. This way, we would separate the hypothetical supertype Value from the existential-related Any. Furthermore, potential future syntax, as discussed in Improving the UI of Generics, would be improved:

func foo(bar: some Value) {}

// equivalent to:

func foo<Value>(bar: Value) {}

Moreover, a clearer distinction between Generics, Existentials and the super-type Value would be established:

protocol Value {}
// Every type implicitly 
// conforms to `Value`.


let foo: Value
// Since `Value` would
// essentially be a protocol
// we cannot bind `foo` to 
// `Value` itself, but rather
// to its Existential.
❌ Cannot use `Value` as a type.
Did you mean `Any<Value>`?

Also, many have expressed concerns about how parameterization would interact with non-protocol types. Personally, I see that not as a constraint, but rather as a push towards generalized existentials, which would support protocols as well as structs/classes/enums:

let foo: Any<Array>

let foo: Any<Int>
āš ļø `Any<Int>` is equivalent to `Int`.

let foo: Any<Any<Array>>
// Since `Any<Int>` is a 
// concrete type, its 
// Existential would not serve
// any purpose. It's like the 
// above example.
āš ļø `Any<Any<Array>>` is equivalent to `Any<Array>`.

Why Not Use an "any" Modifier?

In my opinion, an "any" modifier (written as such: any Error) would be too lightweight to use for Existentials. That's because, although Existentials are great for certain situations, Generics are often a better choice. I'm concerned that by adopting the "lightweight" syntax, many beginners would be unable to differentiate between the two and, as a result, instinctively choose Existentials over Generics. Not to mention that Existential Types are just that: Types. That is, despite their boxing behavior and syntactic magic, they'll still largely behave as Types in future language versions (where they'd be extensible).

I agree with this.

func takeASpecificAnimal<T: Animal>(_ animal: T) {}

To me, the most straightforward solution here is to allow Animal to be a concrete type conforming to itself when this wouldn't cause any problems, i.e. when it doesn't have static, init, Self, or associatedtype requirements.
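A minimal sketch of why static requirements get in the way of self-conformance. The species requirement and the describe function are hypothetical, added only for illustration; they are not part of the Animal protocol shown earlier in the thread.

```swift
// Hypothetical variant of Animal with a static requirement.
protocol Animal {
    static var species: String { get }
    func noise() -> String
}

struct Dog: Animal {
    static let species = "dog"
    func noise() -> String { "Woof" }
}

// Inside the generic function, `T` is one concrete conforming type,
// so `T.species` always has an implementation to call.
// If `T` could be the protocol type `Animal` itself, there would be
// no single `species` to return: the protocol only declares it.
func describe<T: Animal>(_ animal: T) -> String {
    "\(T.species) says \(animal.noise())"
}

let description = describe(Dog())
```

A protocol with only instance requirements that do not mention Self has no such gap, which is the case the suggestion above is limited to.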

I'm wondering if it would work to require the some modifier to get access to static/init/Self parts of a protocol like this:

func takeASpecificAnimal<T: some Animal>(_ animal: T) {
    // access to T.someStaticMethod()
}

func takeASpecificAnimal<T: Animal>(_ animal: T) {
    // no access to T.someStaticMethod()
}

Existentials could then be used to specialize all generic parameters that don't have the some modifier.

I would guess that requiring <T: some Animal> to access all the properties of T would be far too source-breaking to consider. My main goal in this pitch is to make it more clear to the user when they are using existentials; right now, they're mostly invisible. I'm less interested in pushing for a large-scale change to generics that's unlikely to go anywhere.

Just one more thing that came to my mind. What is the exact motivation of this proposal? If it’s only a distinction between types and better (error) logs, then I personally don’t think that the high bar for such massive disruption would be met. We have to preserve source compatibility, so var v: P should continue to work, but I also think and hope that at some point we will want to force the type to be more explicit, which would make it var v: any P.

Will the ability of extending existentials only be implied from this proposal?

I personally think that we would have to add at least that functionality to get somehow close to the high bar. In other words, what does the pitched solution enable the user to do? If the user can’t create distinct extensions, then there is nothing added other than syntax disruption:

// extends the functionality of P
extension P {}

// extends only the existential
extension any P {}

// POTENTIALLY IN THE FUTURE:
// extends only all conforming types of P
extension some P {}

And as I’m writing this, we could finally manually express existential conformance to their protocols.

extension any P: P {}

If these are only long-term goals, then I personally don’t see this proposal making it through review, as it would only harm the user: the explicit distinction wouldn’t enable any new features.

The motivation of this proposal is that I find myself frequently having to explain the difference between protocols and existentials, as well as how they interact with generics. I see it frequently in my conversations with local developers, and it also comes up here on the forums fairly often. It's difficult to even talk about the difference between a protocol and an existential when they have the exact same spelling, so it's not surprising that users don't understand it.

I'm not proposing adding existential extensions right now. Joe Groff's "Improving the UI of generics" post from last year anticipates the ability to extend the existential type itself, using the spelling extension any P. I find myself wishing that we had the any P notation now (even before we have the ability to extend existentials), simply for the increased clarity it provides. It takes a confusing part of the language and makes it less confusing. I believe that has merits on its own.

However, for reasons I've tried to enumerate in the original post, I think the any P spelling is confusing. My main objection is that the natural reading of the following suggests that it extends any conforming type:

extension any Animal {
	func speak() -> String { ... }
}

If that is truly an extension of any Animal as is the obvious reading, then it seems natural that you should be able to call speak() on any animal. And yet, I cannot call dog.speak() or cat.speak(); I can only call anyAnimal.speak(). So the anticipated syntax extension any Animal seems actively harmful to me; if people are confused about existentials now, I worry that using that proposed syntax will make them only more confused in the future. extension Any<P> suggests that it clearly extends only a single type, not all types.
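For contrast, here is a sketch of what compiles today: a plain protocol extension, whose members are available on every conforming type and on the existential alike. The name property, the Dog type, and the values are made up for illustration. As far as I know, extension any Animal is not accepted by the compiler, so it appears only in a comment.

```swift
// Hypothetical protocol for illustration only.
protocol Animal { var name: String { get } }

// A protocol extension: `speak()` becomes available on every
// conforming type, and also on values of the existential type.
extension Animal {
    func speak() -> String { "\(name) speaks" }
}

struct Dog: Animal { let name = "Rex" }

let dog = Dog()
let anyAnimal: Animal = dog

let fromConcrete = dog.speak()          // called on the concrete type
let fromExistential = anyAnimal.speak() // called on the existential

// The pitched `extension any Animal { ... }` would instead add members
// to the existential only, so `dog.speak()` would not compile.
```

The worry in the post above is that the words "extension any Animal" read like the first behavior while meaning the second.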

I'm also concerned that it's easy to get lost in the soup with a declaration like this:

// This is difficult to follow (for me)
var x: any P & Q where P.X == Q.Z = SomeStruct()

// This is less confusing to me
var x: Any<P & Q where P.X == Q.Z> = SomeStruct()

The <> delimiters help my eye recognize where the type declaration ends and the initializer expression begins. Again, though, we don't have the ability to write such expressions at all right now, so this is not an immediate concern. I just worry that if we go down the any P road, it will lead to confusion.

I've used the explanation that existentials are basically an Any<P> multiple times when answering questions from fellow developers, and have found it to be consistently useful in helping people understand what's happening. My hope is that we can get the clarity of a better spelling now, and I've tried to design it in a way that makes it compatible with future features.

So in short, my motivation is to address a source of active confusion: developers not understanding that var v: P introduces a new existential type. I also hope that I can nudge the syntax toward what I feel is a better spelling (Any<P> instead of any P), but if the consensus is that any P is the better spelling, I can accept that. Whatever spelling we choose, I hope we can get the benefit of a clearer spelling without having to wait for additional features.

Because of the need for source compatibility, I don't think we can force everyone to convert var v: P to var v: any P anytime in the near future. There's so much existing code out there using existentials that I'm not sure we'd ever be able to do it. I know I wouldn't want to go through and convert all the code in my own code base. But I'd still like the benefit of clarity for new code.

Personally, I'd rather have this:

var x: (P & Q where P.X == Q.Z) = SomeStruct()

In short: just add a where clause to the existing syntax. I'm not thrilled at the idea of adding an alternate syntax (any P or Any<P>) to express something that already has a syntax in the language.

One of the things I like about any P is that it could potentially be extended in the future to give a name to the "opened" existential:

let x: any Animal A = ...
// We can now use A to refer to the exact type opened...

In general, I like the idea of having some kind of brackets in the syntax for the existentials. Even when using any P syntax, I tend to put it in parentheses: extension (any P), (any P).Type, var x: (any P & Q where P.X == Q.Z) = SomeStruct().

But also I think we should try to make syntax visually distinct from generic types. Using any other kind of brackets or making any lowercase, would do the job:

  • Any(P) Looks like type cast
  • Any[P] Looks like static subscript
  • Any{P} Confusable with trailing closure?
  • any<P> OK
  • any(P) Looks like function call
  • any[P] Looks like subscript
  • any{P} Also trailing closure?
  • <Any P>
  • (Any P)
  • [Any P]
  • {Any P}
  • <any P>
  • (any P)
  • [any P]
  • {any P}

What do you think?

I agree that putting parentheses around the type does clarify it quite a bit. I don't think I would require the parentheses, though, as we don't require them on types anywhere else. As for your other suggestions:

  • [Any P] and [any P] look like shorthand for Array<any P>; probably unworkable.

  • {Any P} and {any P}: Swift usually reserves {} for code blocks, so I don't think this would pass review.

  • <Any P> and <any P>: interesting idea. I think there's some motion toward using a syntax like the following for opening an existential, though; if so, it would probably be incompatible:

    let specificAnimal: <T: Animal> = anyAnimal
    // 'T' is now the type of the specific animal.
    

    I also think it may be rather noisy in a function call:

    func takeTwoAnimals(animal1: <any Animal>, animal2: <any Animal>) {}
    

    but maybe this is okay? I don't love this option, but I don't entirely dislike it either.

It's interesting to me that any<P> is considered OK, but Any<P> isn't. Any is already the spelling for an unconstrained existential, so if generic-ish syntax is okay for any<P>, I don't see why Any<P> would not be strictly better.

Optional parentheses that become required when necessary to disambiguate would be the choice most consistent with prior art in Swift, since they're already used that way in type constructions. For example, if you have protocols P and Q, then all of the following are valid:

struct S: P & Q {}
struct S: (P & Q) {}

let x: P & Q
let x: (P & Q)

But if you want to reference the metatype of the intersection, you have to use parentheses to make it clear that you're accessing the whole thing:

(P & Q).self  // ok
P & Q.self    // ERROR

(IIRC the compiler represents these slightly differently because the parenthesized ones are wrapped in ParenType, but they get canonicalized before it matters to the user for the most part. You can say let x: (Int) if you want and it's the same as let x: Int, except in some weird cases like REPL rendering, as I just found out.)

So if folks want any as a space-delimited prefix for existential type expressions, I think it makes the most sense to let parentheses be used the same way there.
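As a small sketch of the existing parenthesization rules described above (S, x, y and meta are placeholder names), the following compiles with both the bare and the parenthesized composition, and uses parentheses where the whole composition must be referenced:

```swift
protocol P {}
protocol Q {}

// Conformance lists accept the composition with or without parentheses;
// the comma-separated form is used here to keep the sketch conventional.
struct S: P, Q {}

// Both spellings name the same existential type.
let x: P & Q = S()
let y: (P & Q) = S()

// Parentheses are required to refer to the composition as a whole.
let meta = (P & Q).self

// The dynamic type inside either box is still the concrete type S.
let xIsS = type(of: x) == S.self
let yIsS = type(of: y) == S.self
```

An any prefix that follows the same rule would allow any P & Q in simple positions and require (any P & Q) only where the extent of the type would otherwise be ambiguous.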


Aside: If that happens, it would be nice for opaque types to get the same consistency. (some P) is not allowed, my guess being due to the limitations on where opaque types may occur and (some P) being treated as an opaque type inside a ParenType instead of it being the direct return type of a function:

func foo() -> some P    // ok
func bar() -> (some P)  // error: 'some' types are only implemented
                        // for the declared type of properties and subscripts
                        // and the return type of functions

One of the confusing aspects of Swift currently is indeed things that look the same but are conceptually different. Echoing what @xwu said further up the thread, we shouldn't solve this in one area (protocols as constraints vs existential types) just to add confusion in another area (existential vs generic types). As mentioned directly above, optional parentheses do look like the prior art for this kind of grouping of type expressions.

However, I do agree that something like extension any Animal is confusing and counterintuitive. Maybe any is not the correct keyword here, but I don't think that is justification for abandoning the overall expression form for something that is confusing in another way.

It’s a little more verbose but the spelling existential Animal definitely has more clarity.

extension existential Animal {
	func speak() -> String { ... }
}

Awkward alliteration aside, it’s much more clear what type is receiving the speak() functionality here.

Only if you're familiar with the term existential in relation to the type system. Most users won't be. A more common word like any, while less precise, will be much more understandable.

A fair point, and I do prefer the any P spelling in every case except an extension to the existential, so it may just be a necessary concession. Does the compiler have an option to generate informational messages that aren’t warnings? Or, I suppose, is it possible for the error to be explicit when a user tries to use functionality on an arbitrary P that was added in an extension of any P, explaining why that doesn’t work?

I've tried to say this before, but hopefully this post is clearer.

I definitely support reconsidering the syntax of existential types, but I would not like to make any incremental changes to existentials without holistically considering where we'll end up. In particular, there are at least three possibilities we should consider that will affect our syntax choices:

  1. “Partial” protocol types, i.e. existentials whose protocol declares requirements not available on an instance of the existential type. It seems likely that the current “associated type or self parameter” restriction on existentials will be lifted at some point, and when that happens, we'll be faced with situations like this:

    protocol Equatable { 
      static func == (lhs: Self, rhs: Self) -> Bool
    }
    
    func test(x: Equatable, y: Equatable) -> Bool {
      x == y // ERROR!
    }
    

    where some part of the declared API of the protocol in question (in this case, the whole API!) is completely unavailable on instances of that protocol type. I don't think it's possible to overstate how weird it is going to be for ordinary users that they can declare a type like Equatable and then not be able to use the declared API at all. And in fact I think it will be more damaging when only a small fraction of the API is missing on the existential, because users who don't understand generics yet will happily code their way down the “easy” path of using existentials until they find themselves blocked by partiality, when it would have been more appropriate to use the protocol as a generic constraint.

  2. Constrained existentials, e.g. a Collection whose Elements have type Int. It seems obvious to me that we're going to get this feature someday, and that it will involve a where clause, e.g.

    func first(x: Collection where Element == Int) -> Int? { x.first }
    
  3. Existentials that conform to their corresponding protocol. Today, no protocol type (existential) is self-conforming, but it would be very useful (and avoid a tedious forwarding layer in many cases) if a protocol could be declared to be self-conforming, e.g.,

    public protocol Drawable: Self { func draw() }
    

    The explicit statement “: Self” is important, because self-conformance is a guarantee to clients (it determines whether they can use the existential as a generic parameter constrained to that protocol), and adding certain kinds of requirements (init and static members, and anything that makes the protocol “partial”; see 1. above) necessarily makes self-conformance impossible. We wouldn't want maintainers of a self-conforming protocol to inadvertently break that guarantee.
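For comparison with points 1 and 2, here is a sketch of the generic-constraint forms that already compile, using the standard library's Equatable and Collection. The function names same and firstElement are hypothetical; this is not the proposed existential syntax, only the generic alternative the post refers to.

```swift
// Point 1 as a generic constraint: both arguments share one concrete
// type `T`, so `==` is always available.
func same<T: Equatable>(_ x: T, _ y: T) -> Bool {
    x == y
}

// Point 2 as a generic constraint: any collection whose elements are
// Int, expressed with a `where` clause on the generic parameter.
func firstElement<C: Collection>(of x: C) -> Int? where C.Element == Int {
    x.first
}

let equalInts = same(1, 1)
let differentStrings = same("a", "b")
let head = firstElement(of: [3, 4, 5])
let emptyHead = firstElement(of: Set<Int>())
```

With generics the caller fixes the concrete type, so nothing in the protocol's API goes missing; the partiality described in point 1 only arises once the concrete type is hidden inside an existential.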

I have the following goals for a new syntax:

  • When naming an existential type, possible partiality should be evident. For me, “Any<Equatable>” doesn't meet that bar.
  • The syntax should accommodate constraints without being overly verbose. Adding Any<…> doesn't automatically lead us to an obvious place to add constraints, and constraints are already going to add where.

Therefore, I propose we chart a path to this future state:

  • Self-conforming existentials are explicitly declared so, per Drawable above. Since they are never partial, you can name them without a “where” clause.
  • Partial existentials are spelled with a where clause. An unconstrained partial existential uses the empty where condition, “_”. For me, “Equatable where _” clearly indicates that some part of the declared API may be missing because some constraints may be missing.

If we made it an error to use a partial existential without a where clause, we'd need syntax for explicitly declaring non-self-conforming-but-non-partial protocols, so protocol authors could avoid inadvertently breaking client source by adding an init or static requirement. Therefore, IMO using a partial existential type without a “where” clause should generate a warning and a fixit: “partial protocol type should be spelled with an empty where clause; do you want to add it?” That's a nice conclusion because it's probably the same warning we'd want as part of transitioning to the new scheme. I also like that it introduces the searchable term "partial protocol type" that we can clearly define.

Given all this, I think the near-term steps are:

  1. Add the syntax P where _ as a synonym for the existential type P.
  2. At some point appropriate to the release process, add the warning/fixit described above.

Do you have any thoughts about how this would apply to @objc protocols? Would they always be considered self-conforming? Otherwise, it’s going to warn on tons of “delegate” and “data-source” protocol declarations in existing code.