Introduce Any<P> as a better way of writing existentials

Summary

The type of a Swift existential is currently written as simply the name of the protocol. There have been a few suggestions of better ways to spell the name of an existential; I propose that we should introduce Any<MyProtocol> as the preferred spelling of protocol existentials.

Specifically, I suggest that we should start accepting Any<MyProtocol> as a preferred way to declare existentials soon, even before we add any additional features to existentials. We would still support the current spelling for source compatibility reasons.

Background

We frequently receive questions from people confused about the limitations placed on "variables of protocol type". (We tend not to use the word "existential" in compiler error messages, but that's what they are.) I believe many of these questions come from an incorrect understanding of how existentials work, and that by clarifying their spelling, we can make the language easier to learn and easier to use. This proposed spelling also opens up some opportunities for future features.

An existential is a wrapper type which can hold any value conforming to the protocol. For example, var anyAnimal: Animal below is an existential.

protocol Animal {}
struct Cat: Animal {}
struct Dog: Animal {}

// A variable of type 'Animal' is an existential; it can hold *any* Animal
var anyAnimal: Animal
anyAnimal = Cat()
anyAnimal = Dog()

The existential wrapper box type is mostly invisible to the user. For example, type(of: anyAnimal) returns the type of the value contained in the box. However, the wrapper is still a separate type. This can be observed when interacting with generics:

func takeAnything<T>(_ x: T) {
	print(T.self)
}

let dog = Dog()
let anyAnimal: Animal = Dog()
takeAnything(dog) // prints "Dog"
takeAnything(anyAnimal) // prints "Animal"
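
Both halves of that claim can be checked without relying on how generic parameters are inferred. The following is a minimal sketch: the names dynamicTypeIsDog, boxSize and dogSize are made up for illustration, and the exact sizes reported by MemoryLayout are an implementation detail, so only the comparison is meaningful.

```swift
protocol Animal {}
struct Dog: Animal {}

let anyAnimal: Animal = Dog()

// type(of:) looks through the box and reports the dynamic type of its contents.
let dynamicTypeIsDog = type(of: anyAnimal) == Dog.self

// The box has its own layout, separate from the value it holds.
// Dog has no stored properties, while the box needs room for
// inline storage plus type information.
let boxSize = MemoryLayout<Animal>.size
let dogSize = MemoryLayout<Dog>.size

print(dynamicTypeIsDog, boxSize > dogSize)
```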

This also comes up when users try to pass an existential box to a generic function constrained to that protocol:

func takeASpecificAnimal<T: Animal>(_ animal: T) {}

// error: value of protocol type 'Animal' cannot conform to 'Animal'; only struct/enum/class types can conform to protocols
takeASpecificAnimal(anyAnimal)

The problem here is that a "value of protocol type 'Animal'" (i.e. the Animal existential box type) is a separate type from Dog or Cat, and that separate type does not conform to Animal on its own. This function is asking for a specific kind of animal -- a concrete type, not an existential wrapper.
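
One way to reach a generic function from a boxed value is to route the call through a protocol extension, where Self is the concrete type stored in the box. This is only a sketch of that workaround; describeSpecificAnimal and describeSelf are hypothetical names, not part of the pitch.

```swift
protocol Animal {}
struct Cat: Animal {}
struct Dog: Animal {}

// Generic over one specific, concrete kind of animal.
func describeSpecificAnimal<T: Animal>(_ animal: T) -> String {
    return String(describing: T.self)
}

extension Animal {
    // Inside a protocol extension, `Self` is the concrete conforming type,
    // so `self` can be passed where a `T: Animal` is required.
    func describeSelf() -> String {
        return describeSpecificAnimal(self)
    }
}

var anyAnimal: Animal = Dog()
let first = anyAnimal.describeSelf()   // T is Dog
anyAnimal = Cat()
let second = anyAnimal.describeSelf()  // T is Cat

print(first, second) // prints "Dog Cat"
```

The generic function never sees the box: by the time it is called, the concrete type has already been recovered.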

The Problem(s)

There are two related problems which I think we can address in the short-term:

  1. Many users do not have a clear understanding of the difference between protocols and existentials, and the language doesn't help them.
    Because Swift uses the same spelling for both the protocol and the existential wrapper, the language does not help users build a mental model that differentiates the two concepts.

    Developer experience with other languages may not be very helpful either. In Objective-C, for example, all object variables were basically of the same type. When you used a Dog *someDog variable or parameter, it was still just passed as an object, exactly the same way id<Animal> someDog would have been. There was no "existential box" type. Swift doesn't help users understand that there's a separate box type involved.

  2. The specific error message about "values of protocol type" when used with generics is not actionable to many users.
    Since many users don't understand that var anyAnimal: Animal is a wrapper type around an Animal, they may be confused by the error message. They may believe that since anyAnimal is currently a Dog, and Dog is a struct, the error message is incorrect.

    This is arguably just a specific instance of problem #1, but it's one that comes up frequently. We've gone through a few different iterations on this particular error message, but confusion persists.

Proposed solution

We already have a spelling for an existential type when no protocols are involved: Any. Most users seem to understand Any as a wrapper type that can hold any value. We should leverage that familiarity to help users understand existentials by preferring the spelling Any<Animal> for a constrained existential. When the protocol is used as a constraint, it continues to just be the protocol name. When used as an existential, we prefer the Any<MyProtocol> spelling.

With this spelling, the above code would look like this:

protocol Animal {}
struct Cat: Animal {}
struct Dog: Animal {}

var anyAnimal: Any<Animal>
anyAnimal = Cat()
anyAnimal = Dog()

let dog = Dog()

func takeAnything<T>(_ x: T) {
	print(T.self)
}

takeAnything(dog) // prints "Dog"
takeAnything(anyAnimal) // prints "Any<Animal>"

We can also improve the error message in the generic case:

// error: value of type 'Any<Animal>' cannot conform to 'Animal'; only struct/enum/class types can conform to protocols

or even:

// error: 'Any<Animal>' does not conform to protocol 'Animal'

This makes it obvious that the existential wrapper type is a distinct type from the protocol itself, and is also distinct from Dog or Cat. Dogs and Cats can conform to a protocol, but Any<P> cannot.

Backward compatibility

We can't just break existing code; we'll need to continue to support the existing spelling. I think we should treat it as a shorthand, though; using a protocol P as a type would be an alias for Any<P>. Autocompletion should prefer Any<P> where possible.


What about any P?

The "Improving the UI of generics" thread from last year contemplates using any P instead of Any<P> as I've suggested here. There's a nice symmetry between any P and some P as introduced in Swift 5.1's opaque return types. However, I think we need to choose between analogy with Any and analogy with some P. I think the analogy with Any is the better analogy. A few points:

  • We already have Any as an unconstrained existential wrapper. I anticipate fielding questions about the difference between Any and any in Swift, and I don't know that I'd have a satisfactory answer. They both introduce a wrapper type capable of holding any conforming type.

  • An existential wrapper is a new type. The analogy with some P is weak here, because some P doesn't introduce a wrapper type, so it's not obvious that any P would do so. Any<P> looks much more like a distinct type, which helps users understand the fundamental concept that an existential is a separate type.

  • The generics manifesto also contemplates the ability to add methods to the existential type itself, using syntax like this:

    extension any Animal {
    	func rest() {}
    }
    

    To me, this syntax is quite confusing; the naive reading suggests that the extension methods defined here can be called on any Animal type, when in fact the exact opposite is true; they would not be available on Dog or Cat, but only on the existential type itself.

    By contrast, using Any<Animal> seems to make it much clearer that the extension is only on the existential type itself.

    extension Any<Animal> {
    	func rest() {}
    }
    

    Of course, we don't support extensions on existentials at all right now, so maybe this is a moot point. But if this is a direction we plan on exploring in the future, I'd rather be set up for the one that seems (to me) more natural.
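
For contrast, an extension on the protocol itself, which is valid today, adds its members to every conforming type as well as to values held in the box. A small sketch, in which rest() returns a string purely so the result can be inspected:

```swift
protocol Animal {}
struct Cat: Animal {}
struct Dog: Animal {}

// An extension on the protocol: available on every conforming type.
extension Animal {
    func rest() -> String {
        return "\(Self.self) is resting"
    }
}

let fromConcrete = Dog().rest()   // called on a concrete Dog

let boxed: Animal = Cat()
let fromBox = boxed.rest()        // called through the existential box

print(fromConcrete) // prints "Dog is resting"
print(fromBox)      // prints "Cat is resting"
```

An extension on the existential type would be the reverse: rest() would exist on the box but not on Dog or Cat.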


Anyway, this has gone on long enough. I think we could get a pretty good win by simply introducing a new preferred spelling of our existing existentials. I think it would be minimally disruptive, composes well with concepts for future expansion, and resolves a common point of confusion among users. I'd love to hear your thoughts.

27 Likes

I have thought about this same syntax before, but I was wondering if it could be used to have the compiler actually generate a type erasure wrapper so that people don’t have to keep writing their own boilerplate code all the time.

It doesn’t sound like that’s what you had in mind, but I wonder whether you have any thoughts on that.
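
For reference, the hand-written boilerplate in question usually looks something like the sketch below. AnyAnimalBox, speak() and chorus are hypothetical names chosen for illustration; the pitch's Animal protocol has no requirements, so one is assumed here to give the wrapper something to forward.

```swift
protocol Animal {
    func speak() -> String
}

struct Dog: Animal {
    func speak() -> String { return "Woof" }
}

struct Cat: Animal {
    func speak() -> String { return "Meow" }
}

// A hand-written type-erasing wrapper. Unlike the built-in existential,
// this struct is an ordinary concrete type that conforms to Animal.
struct AnyAnimalBox: Animal {
    private let _speak: () -> String

    init<A: Animal>(_ base: A) {
        // Capture the wrapped value's implementation in a closure.
        _speak = base.speak
    }

    func speak() -> String {
        return _speak()
    }
}

// A generic function that needs a single concrete element type.
func chorus<T: Animal>(_ animals: [T]) -> [String] {
    return animals.map { $0.speak() }
}

let boxes = [AnyAnimalBox(Dog()), AnyAnimalBox(Cat())]
let sounds = chorus(boxes)
print(sounds) // prints ["Woof", "Meow"]
```

Every requirement of the protocol needs its own forwarding closure and method, which is where the boilerplate comes from.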

2 Likes

I'm missing a lot of context, but this makes me wonder why an existential wrapper that can hold any value conforming to a protocol doesn't actually conform to the protocol itself. Is that an implementation limitation or is there a more principled reason?

Even given the difference between Any<Animal> and a specific type like Cat that conforms to Animal, would it not make sense you should be able to bind T to Any<Animal> here since any value contained in that box is guaranteed to conform to the Animal protocol? What's the alternative if all you have is a variable of type Any<Animal> (or Animal, in the current syntax)?

2 Likes

@hamishknight has an excellent answer on StackOverflow explaining this.

3 Likes

I personally wouldn't go this way: it makes one thing a bit clearer at the cost of making 'Any' itself less clear.
What I would prefer is to make protocol types conform to themselves if they don't have static, constructor, or associatedtype requirements. Then, when a developer sees that error message, they will understand it, and the whole concept, straight away.

1 Like

I quickly scanned the proposal, so sorry if I missed the answer to my question: wouldn't this have an impact on compiler performance, as we'd now have Any and Any<T> as separate types?

I personally would love to go in the any T direction, as it aligns nicely with some and also leaves the door open to introducing a meta keyword for metatype existentials.

You could always create typealias MyAny<T> = any T if needed.

14 Likes

Recall that Swift can infer generic parameters. For example, when implementing DoubleWidth<T>, I could refer to the type as DoubleWidth.

Adding Any<P> could be source-breaking because there are conceivably contexts where Any currently means Any but in the future would preferentially mean Any<P> for some relevant P. This would be very confusing.

It was for the same reason that we considered and rejected naming SIMD types Int<SIMD2> instead of SIMD2<Int>; in certain contexts, the former could be shortened to Int, which would be confusing for everyone involved.
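
The inference being described can be seen with any generic type. A short sketch using a hypothetical Pair type (DoubleWidth is not in the standard library, so it is not used here):

```swift
struct Pair<T> {
    var first: T
    var second: T

    // Inside the declaration, the bare name `Pair` is shorthand for `Pair<T>`.
    func swapped() -> Pair {
        return Pair(first: second, second: first)
    }
}

// At a use site, the generic argument can also be left off and inferred.
let numbers: Array = [1, 2, 3]                   // inferred as Array<Int>
let pair: Pair = Pair(first: "a", second: "b")   // inferred as Pair<String>
let flipped = pair.swapped()

print(type(of: numbers)) // prints "Array<Int>"
print(flipped.first)     // prints "b"
```

This is the rule that makes a bare Any ambiguous if Any<P> is treated like an ordinary generic type.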

Moreover, it is unclear to me how one would spell protocol compositions with this notation. We do not support operators inside angle brackets currently and I would imagine that parsing <P & Q> is significantly less straightforward than the status quo because we would have three different standard operators used together.

The plan of record is to explore the spelling any P: I see no reason to deviate from that plan. I agree with the stated motivation here, however.

11 Likes

I think it makes it more clear: Any is a type-erasing wrapper; Any<P> is a type-erasing wrapper that requires conformance to certain protocols. You could think of Any as the degenerate case Any<> where we're not requiring conformance to any protocols.

Can you clarify what you find to be less clear?

I don't think existentials should automatically conform to themselves, because that can introduce a source compatibility trap. If Any<Animal> conforms to Animal, then as soon as you add a static, constructor, or associatedtype requirement, that automatic synthesis disappears. Even if the actual implementation of the conformance is automatic, I think it should still be manually declared (just as we do with Equatable, Hashable, and Codable). And that means we need a way to spell the conformance, which under this proposal would be extension Any<Animal> : Animal {}.
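
To make the trap concrete, here is a sketch of a generic function that depends on exactly those kinds of requirements. Spawnable, speciesName and spawn are hypothetical names used only for this illustration.

```swift
protocol Spawnable {
    init()
    static var speciesName: String { get }
}

struct Dog: Spawnable {
    static let speciesName = "dog"
    init() {}
}

// This function needs a concrete T: it calls an initializer and reads a
// static property, both of which live on the type rather than on a value.
func spawn<T: Spawnable>(_ type: T.Type) -> (T, String) {
    return (T(), T.speciesName)
}

let (puppy, name) = spawn(Dog.self)
print(type(of: puppy), name) // prints "Dog dog"

// If the existential itself were allowed to stand in for T, there would be
// no single init() or speciesName to pick: the box has no concrete type
// until a value is stored in it.
```

A protocol that starts out without such requirements could self-conform, and adding one later would silently remove that conformance for every caller relying on it.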

If we don't have a separate way to spell the existential type vs the protocol, then I fear that any future error messages will continue to be just as confusing as the current ones, despite our best efforts.

I'm hardly an expert on compiler performance, but I don't imagine that there would be a performance problem unless the compiler were automatically inferring Any to mean Any<P>. I don't think we should automatically infer Any to mean Any<P>. If you want Any<P>, you would have to specifically write that out. I agree that it would be confusing otherwise. Is there a reason generic type inference would be required here?

Protocol compositions would be spelled Any<P & Q>. We already support this kind of syntax:

protocol P {}
protocol Q {}
struct Foo: P, Q {}

let array: Array<P & Q> = [Foo()]

In fact, syntax like Any<P & Q where ...> is specifically mentioned as a possible future direction in the core team decisions that led us to P & Q in the first place.

I've seen meta P discussed for metatype existentials, but I'm not entirely clear on what that would mean. Would it mean something like Any<Animal.Type> or maybe Any<Animal.Protocol>? Or maybe Any<Animal>.Type? Or is it something else I'm not understanding?

I specifically addressed the comparisons with any P in the original post. I felt like the post was getting a bit overly long, though, so I collapsed it under a disclosure indicator. Did you see the arguments I made there? Specifically, I think the distinction between Any and any would be confusing, I think the analogy with some is misleading, and I think the possible future direction extension any P implies almost exactly the opposite of what it means. I'd love to hear your thoughts on how we address those concerns with any P.

2 Likes

Alternatively, the error text in the generic situation could be improved. It could explain that the value is an existential and therefore cannot conform.

I did see your arguments and I disagree with them. I think the distinction between Any and Any<P> would be more confusing than the distinction between Any and any. Any is not a synonym for a wrapper and has many magical properties. Angle brackets are used for generics, and I would not want something that is explicitly not generic (and in fact should be distinguished from generics) to be spelled as though it is.

The existential type any P is just as distinct from the protocol P as is the opaque type some P. The spelling is apt because neither are nominal types and both are ‘wrappers’ in some sense.

16 Likes

Is Any<P> a generic type or a special syntax?

If it is the latter, I would prefer the syntax any P to avoid confusion with generics.

If Any<P> is a proper generic type from the standard library, what does its generic signature look like? Its generic argument cannot be a type, because Any<String> makes no sense. And when protocol P is used as a type, it effectively means Any<P>. So if P in Any<P> is a type, then Any<P> = Any<Any<P>> = Any<Any<Any<P>>> = ....

If P is not a type, then does that mean that we are introducing kinds in Swift?

struct Array<Element: Type> {...}
struct Any<Proto: Protocol> {...}

2 Likes

The reason is that this is how the language currently behaves. Yes, it would be confusing for your proposed spelling of existential types, but I also would strongly object to treating angle brackets after Any differently from angle brackets after any other name.

My mantra here and elsewhere is that different things should be spelled differently and similar things should be spelled similarly. If the rules surrounding generic parameters are not a good fit for existential types, the solution is not to make an ad-hoc exception but to reconsider if spelling existential types as if they were generic Any is a good idea.

2 Likes

What magical properties? Prior to Swift 3, Any was literally just a type alias for protocol<>, the existential with zero protocol requirements. Are there other magical properties that have been introduced since then?

It would be special syntax. I'm not proposing a separate generic type distinct from Any; rather, I'm proposing an extension of Any.

some P does not introduce a new wrapper type, though. It's effectively an "opened" existential, as I understand it.
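
The practical difference shows up when the value is handed to a constrained generic function. A minimal sketch, assuming a speak() requirement on Animal and a hypothetical makeAnimal() factory:

```swift
protocol Animal {
    func speak() -> String
}

struct Dog: Animal {
    func speak() -> String { return "Woof" }
}

func takeASpecificAnimal<T: Animal>(_ animal: T) -> String {
    return animal.speak()
}

// The caller cannot name the underlying type, but the compiler knows it is
// one fixed concrete type that conforms to Animal. No box is involved.
func makeAnimal() -> some Animal {
    return Dog()
}

let sound = takeASpecificAnimal(makeAnimal())
print(sound) // prints "Woof"
```

Because the result of makeAnimal() is a concrete conforming type with a hidden name, it satisfies T: Animal directly.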

There's also been discussion (in the same generics UI thread) about something like Collection<.Element == Int>. Do you also disagree with the use of angle brackets in that case?

Not really. On the LHS, T is a type, and any is used to make a type out of a non-type entity (a protocol or archetype). The keyword any cannot be applied to something which is already a type.

Unless we introduce some sort of kinds: typealias MyAny<T: Protocol> = any T

Further, my goal here is to make Swift more understandable for all users, in part by adding a clearer spelling for use in error messages. Adding MyAny<T> = any T doesn't produce clearer error messages at all.

1 Like

I disagree here, as you can theoretically wrap an instance of any type in an existential. The existential's behavior will depend on the type it wraps. There is no need to limit this to protocol-as-a-type.

class A {}
class B: A {}

let a: A = B()
//     ^ what is this container? In my eyes, this is `any A` existential.

Theoretically you can write any Any and, with the pitch author's preferred syntax, Any<Any>, but the latter could potentially confuse the compiler.

struct T {}

let t: any T = T() // theoretically okay, but useless for structs

To me, the most straightforward solution here is to allow Animal to be a concrete type conforming to itself when this wouldn't cause any problems, i.e. when it doesn't have static, init, Self, or associatedtype requirements (the latter two cases already disallowed). We could allow protocols to be valid types only when these aren't present, and require Any<P> otherwise.

I picked projects at random from the Source Compatibility Suite and looked at the protocols they defined; they mostly didn't use init or static requirements, and when they did, it was mostly in protocols that also use Self or associatedtype. It would be a shame to make a language feature more obscure for the sake of an uncommon use case.

1 Like

Absent some technical reason why this wouldn't work, I'm pretty much in agreement. We already have the one-off Error self-conformance and countless questions about this (unnecessary, IIUC) limitation, so it's not as if the utility of such a feature is in question. In my mind the biggest questions around this feature are:

  1. Does the additional diagnostic "protocols with init or static func requirements cannot self-conform" leave us in a worse place than the existing "protocols cannot conform to other protocols" diagnostic?
  2. Is such a self-conformance implicitly supplied simply by declaring the protocol?
  3. If not, how should the self-conformance be spelled in source?
  4. If we offer a way to spell self-conformance explicitly, should we extend this syntax to allow protocol types to conform to other 'simple' protocols?
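
The existing Error special case is easy to see with Result from the standard library, whose Failure parameter is constrained to Error. A small sketch; KennelError is a hypothetical error type:

```swift
struct KennelError: Error {}

// Result requires `Failure: Error`. Using the existential `Error` as the
// failure type compiles because `Error` conforms to itself.
let outcome: Result<Int, Error> = .failure(KennelError())

var sawKennelError = false
switch outcome {
case .success:
    break
case .failure(let error):
    // The box still holds the concrete error value.
    sawKennelError = error is KennelError
}

print(sawKennelError) // prints "true"
```

Substituting a user-defined protocol for Error in a similarly constrained generic type is what produces the diagnostic discussed in this thread.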

In the interest of not derailing @bjhomer's thread, I suggest that this line of discussion be diverted to an alternate thread where the pros and cons of generalizing self-conformance can be discussed in-depth.

3 Likes

Separating the spelling of existential type from the spelling of protocols is important and good.

I'm thinking of existential types as variables of protocol type, where you have a name in source explicitly declared as something that conforms to a protocol, like var x : P, as opposed to protocol definitions, which look like protocol name {}. That might still be imprecise, but it helps me to be concrete.

I'm not an expert and have been generally wrestling with this mentally for a while. So some of this might reflect my own ignorance or partially-understood things.

A couple of notes:

  1. I don't know how to measure how much of the confusion around this is because of using the same spelling for the two things though. Opaque types with some are straightforward but took me some time to really understand. Some of the issue is that the terms opaque and existential have precise meanings, but the regular words opaque and existential also mean something related but different.

  2. @Nickolas_Pohilets already said this but I also agree that Any<P> looks like P is generic. I think people (me) are already confused some of the time about what's a generic and what's another kind of type. I tend to think that angle brackets should mean generic parameters, and everything else should have different syntax. That would be roughly consistent with the idea that protocols can't have angle brackets, only associatedtype, and so on. Using any T instead of Any<T> keeps that distinction.

  3. The word any means a lot of things in English. Does it make sense to explicitly say var anyAnimal: existential Animal instead of var anyAnimal: Any<Animal> or var anyAnimal: any Animal ? Using a more precise keyword might avoid some confusion. The some keyword for opaque types is not too bad but I did have to go back and read the documentation a few times and write a bunch of test code to be sure I understood it. If it had been opaque T instead of some T it might be clunkier at first but more precise once I understood it. I actually like some because it's shorter and easier but it's worth thinking about.

Just to underline, I think changing the syntax to separate protocols from existential types is really important no matter how you do it.

I think it would be better to lump init/static protocols in with Self/associatedtype protocols than to bifurcate protocols into a yet third invisibly differentiated kind with idiosyncratic type system interactions.

This is relevant to this thread because types for init/static protocols should be replaced with something (even before we get existentials for Self/associatedtype) so we need some new syntax for that. (Any<P> seems good to me.)