[Pitch] Implicitly opening existentials

Hi all,

I've been looking into generics ergonomics, and one of the problems that keeps coming up is that going from concrete types or generics into existentials is a bit of a one-way street: you can create an any P from a value of any type that conforms to P, but once you have that existential value of type any P it's hard to get back to using generics. Here's a quick example:

protocol P { 
  associatedtype A 
  func getA() -> A 
} 

func takeP<T: P>(_ value: T) { }

func test(p: any P) { 
  takeP(p) // error: 'P' as a type cannot conform to itself 
}

Once you get here, you are somewhat trapped, because the only ways to get from existentials back to generics are to refactor to use generics throughout, manually implement a type eraser, or put your functionality into a method on the protocol. The last looks like this:

extension P {
  func doTakeP() {
    takeP(self)
  }
}

func test(p: any P) { 
  p.doTakeP() // okay, "opens" the existential so the protocol's Self binds to the underlying type of the existential value
}
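
For comparison, the "manually implement a type eraser" option from the list above might look roughly like the sketch below. This is only an illustration: IntBox, StringBox, and AnyPBox are hypothetical names that do not appear in the pitch, and takeP returns a String here (unlike the original) purely so the result can be observed.

```swift
protocol P {
  associatedtype A
  func getA() -> A
}

// Hypothetical conforming types, used only for this sketch.
struct IntBox: P {
  let value: Int
  func getA() -> Int { value }
}

struct StringBox: P {
  let text: String
  func getA() -> String { text }
}

// Same shape as takeP in the pitch, but returning a String so the
// generic argument that was bound is visible.
func takeP<T: P>(_ value: T) -> String {
  "takeP called with \(T.self)"
}

// Hypothetical hand-written type eraser: the generic initializer still
// knows the concrete T, so it captures the call to takeP in a closure
// before the type information is dropped.
struct AnyPBox {
  private let callTakeP: () -> String

  init<T: P>(_ base: T) {
    callTakeP = { takeP(base) }
  }

  func doTakeP() -> String { callTakeP() }
}

let boxes = [AnyPBox(IntBox(value: 1)), AnyPBox(StringBox(text: "a"))]
let results = boxes.map { $0.doTakeP() }
print(results)
```

The cost is that every generic operation you want to reach has to be anticipated and captured when the eraser is built, which is part of why this route feels heavy.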

I propose to extend the implicit "opening" behavior applied when accessing a member of a protocol to generic functions and when initializing a property with opaque type, allowing one to more easily get out of the existential "trap". There are a lot of details, which I've put into the full proposal. I'm also iterating on an implementation to prove out the ideas.

Thoughts?

Doug

55 Likes

Having recently fallen into this trap (and used the "add a method to the Protocol" escape hatch), I'm hugely in favor of a path to allow stepping back from the purely existential type into a generic type.

4 Likes

Will this have any performance penalty compared to "proper" generic function?

I am wondering whether this creates another case that can easily be overlooked and misused, as existential protocol types were, so that we will later have to make an adjustment like SE-0335.

1 Like

At first glance, I love it. This feels like something that should "just work". A couple questions that came to mind as I read through:

func checkFinaleReadinessOpenCoded(costumes: [any Costume]) -> Bool {
  for costume: some Costume in costumes {   // implicit generic parameter binds to underlying type of each costume
    let costumeWithBells = costume.withBells() // returned type is the same 'some Costume' as 'costume'
    if !costume.hasSameAdornments(costumeWithBells) { // okay, 'costume' and 'costumeWithBells' have the same type
      return false
    }
  }
  
  return true
}

Is there a reason we need the some Costume here? IMO it would be great if we had this behavior out of the box without requiring users to specify a type annotation in the for pattern; I suspect many Swift programmers don't even know this is possible. But I haven't thought through whether auto-opening existentials in for loops would have the same change-of-behavior issues as noted below, or whether there are any trapdoors to avoid those issues.

My primary concern with this proposal is the failure cases:

func cannotOpen1<T: P>(_ array: [T]) { ... }
func cannotOpen2<T: P>(_ a: T, _ b: T) { ... }
func cannotOpen3<T: P>(_ values: T...) { ... }

struct X<T> { }
func cannotOpen4<T: P>(_ x: X<T>) { }

func cannotOpen5<T: P>(_ x: T, _ a: T.A) { }

func cannotOpen6<T: P>(_ x: T?) { }

func testCannotOpenMultiple(array: [any P], p1: any P, p2: any P, xp: X<any P>, pOpt: (any P)?) {
  cannotOpen1(array)         // each element in the array can have a different underlying type, so we cannot open
  cannotOpen2(p1, p2)        // p1 and p2 can have different underlying types, so there is no consistent binding for 'T'
  cannotOpen3(p1, p2)        // similar to the case above, p1 and p2 have different types, so we cannot open them
  cannotOpen4(xp)            // cannot open the existential in 'X<any P>' there isn't a specific value there.
  cannotOpen5(p1, p2.getA()) // cannot open either argument because 'T' is used in both parameters
  cannotOpen6(pOpt)         // cannot open the existential in '(any P)?' because it might be nil, so there would not be an underlying type
}
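
One thing that may help when reading the first failure case: what cannot be opened is the array as a whole, since there is no single T for it. Opening each element on its own is still possible. Here is a minimal sketch of that using the existing protocol-extension trick rather than the pitched feature. Shape, Circle, Square, describe, and describeOpened are all hypothetical names, and the `any` spelling assumes a toolchain that accepts it.

```swift
// Hypothetical protocol without associated types, used only for this sketch.
protocol Shape {
  var name: String { get }
}

struct Circle: Shape { let name = "circle" }
struct Square: Shape { let name = "square" }

// A generic function that needs exactly one concrete T per call.
func describe<T: Shape>(_ shape: T) -> String {
  "\(shape.name) as \(T.self)"
}

extension Shape {
  // Accessing a member on an existential opens it, so inside this method
  // Self is the underlying type of whichever element we were called on.
  func describeOpened() -> String { describe(self) }
}

// A heterogeneous array: each element may have a different underlying type.
let shapes: [any Shape] = [Circle(), Square()]

// Open one element at a time instead of trying to open the whole array.
let descriptions = shapes.map { $0.describeOpened() }
print(descriptions)
```

Each call binds T to a different type, which is exactly why a single call taking [T] has no consistent binding.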

The failures here are, IMO, even more subtle than the blanket 'P' as a type cannot conform to itself error, so I am slightly concerned that we'd be leading the user even further down the path of using existentials only to find at an even later point that their type design might not work. I don't really have suggestions here, and my hunch here is vague, but I wanted to raise it for discussion. (And I think the call on opening optional existentials is probably correct, for the same reasons.)

I think it's probably the case, though, that the problem of more-subtle error cases is outweighed by the benefit this would bring in the common cases.

However, because the return value is permitted a conversion to erase to an existential type, optionals, tuples, and even arrays are permitted:

This list is non-exhaustive, I assume (i.e., other existing conversions here will be supported as well)?

This proposal has two effects on source compatibility. The first is that calls to generic functions that would previously have been ill-formed (e.g., they would fail because any P does not conform to P) but now become well-formed. For the most part, this makes ill-formed code well-formed, so it doesn't affect existing source code. As with any such change, it's possible that overload resolution that would have succeeded before will now pick a different function.
...
The second effect on source compatibility involves generic calls that do succeed prior to this change by binding the generic parameter to the existential box.

Changing the meaning of existing code seems undesirable to me. Is there any reason why this proposal wouldn't adopt the appropriate ranking rules to maintain existing behavior? We could prefer overloads which don't require existential opening to those that do, and prefer to bind a generic argument to the existential over the underlying type when both are possible. I know that at least on the second front the proposal says that the change in semantics is probably a win, but I'm still a bit wary of phrases like "unlikely to be affected"...

It seems like we could at least error/warn about these potential ambiguities and then offer "as P" or "as some P" to force the user to disambiguate themselves, while still getting most of the "wins" in the common, non-ambiguous cases.

Overall, I'm really excited by this!

2 Likes

This is great and I'm definitely in favor. I am a little concerned about the potential for confusion from the "'any Costume' isn't necessarily the same type as 'any Costume'" error in the checkFinaleReadinessOpenCoded example.

What if, in the same way that implicitly-unwrapped-optional types (Foo!) become simply optional types (Foo?) during type inference, any P became some P during type inference? That is, in the declaration of costume in costumes as well as in all the further examples, if a variable's type isn't explicit, it is the opened some P type rather than staying any P? In this example, that makes things even more ergonomic. Are there other situations I'm not thinking of where it would be problematic?

3 Likes

I think you're asking whether there is a performance penalty for

func f(_: any P) { }

vs

func f<T: P>(_: T) { }

and the answer is still generally going to be "yes", because with the first one you're passing around the "box" and in the second you've separated the value from its type. This is easier to see with [any P] vs. [T]: in the former case, you have an array of boxes (each could be different), and in the latter case you have an array of Ts with no extra boxing.
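
To make the "box" a bit more concrete, here is a small sketch that only compares the storage of an existential with the storage of a concrete value. Small is a hypothetical type, the P here is a stand-in without associated types, and the exact numbers printed depend on the platform, so none are claimed.

```swift
protocol P {}

// Hypothetical one-byte conforming type.
struct Small: P {
  var x: Int8 = 0
}

// Storage needed for the existential box versus the bare value.
let boxSize = MemoryLayout<any P>.size
let valueSize = MemoryLayout<Small>.size

// Per-element spacing in [any P] versus [Small].
let boxStride = MemoryLayout<any P>.stride
let valueStride = MemoryLayout<Small>.stride

print("any P: size \(boxSize), stride \(boxStride)")
print("Small: size \(valueSize), stride \(valueStride)")
```

The box has to carry the value plus enough metadata to find its type and conformance at run time, so it comes out larger than the one-byte value it wraps here.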

I think SE-0335 addressed the issue with existential types sufficiently, and this proposal doesn't introduce any additional performance cliffs that we'll want to call out. This proposal might make things a little better, because when you pass an existential value into a generic function, you open it up and the generic works directly on the underlying value, not the box.

Doug

9 Likes

It's worth noting that this proposal will hopefully reduce how often someone needs to write a function that takes an existential instead of a generic function, which is slower when you pass it a concrete value. Today people sometimes feel they have to write the existential version because they sometimes have an existential and can't pass it into a generic function.

6 Likes

This is an interesting and attractive direction, but I share the concerns of @Jumhyn that it may lead users further down a path that initially Just Works, only to be forced to reckon with the distinction between existentials and opaque types a little later when they’re in deeper.

I see two parts to this proposal: (1) expanding where existentials can be opened beyond the protocol extension method trick—super great, hard to argue against, and on its own a purely additive change; and (2) doing this implicitly, with rules that are almost like a heuristic—which I see as separable and the source of much nuance and complexity (and, as noted in the text, possible source compatibility issues due to overload resolution changes).

The proposal justifies the implicit opening approach by characterizing an explicit opening alternative as “narrower” and saying “[a] programmer who has an existential and wishes to use a generic function would need to learn about opaque result types and their differences with existentials to do so” if the explicit opening alternative were instead adopted.

…And yet, the implicit approach pitched here is the one that includes the example @Jumhyn notes above, required so that 'costume' and 'costumeWithBells' have the same type:

for costume: some Costume in costumes {
  ...
}

To my mind, this explicit type annotation is really an explicit opening approach, and it demonstrates actually how explicit opening can be regarded as more general (not narrower) from that vantage—and, in that particular circumstance, what the user actually wants (as opposed to the implicit heuristic which erases the same-type relationship between ‘costume’ and ‘costumeWithBells’).

Elsewhere, I’ve proposed enabling not just a type annotation costume: some Costume but also expanding the use of as for this purpose. Notably, such a syntax would allow for explicit opening without creating a separate binding:

func hasExistentialP(p: any P) {
  takesP(p as some P)
  // No need to write:
  //   let openedP: some P = p
  //   takesP(openedP)
}

Without the friction of having to create a separate binding, I think it makes the explicit opening approach a viable alternative. Yes, users would have to distinguish any P from some P from the get-go, but the argument above is that users will often have to do so beyond the most trivial usages. If, later on, we find that the more general explicit opening approach is too wordy in a number of widely used scenarios, we could always come back to consider implicit opening as syntactic sugar, just like we have done for some P in parameter lists.

16 Likes

This is a great improvement. It feels like a no-brainer. Just one of those generic quirks we have gotten used to finding workarounds for, out of the way! Will make it easier for newcomers to the language as well :+1:t2:

1 Like

Right, and @gregtitus had a similar idea about making some P the default produced by inference.

I was thinking of this as a future direction, to go along with the ideas around having (e.g.) parameters of protocol type be some by default rather than any, and probably should have written it up as such. My concern here is that we're likely to break more existing source because we're changing the types of declarations. This example is silly, but shows the potential for an issue:

func f(values: [any P], tail: any P) {
  for value in values {  // assume value has type "some P" rather than "any P"
    var localValues = [value]   // localValues will have type [some P], rather than [any P] 
    localValues.append(tail) // error: cannot append "any P" to array of type "some P"
  }
}

I've mentally put this in the "want to do but requires a major language version" bucket, but perhaps this kind of code is rare enough that it doesn't matter.

I absolutely think it's worth pursuing this.

Yes, I worry about this, too. Will programmers get even further down the path and get stuck either completely (have to refactor) or hit a complexity cliff (deep-dive into existentials), and be worse off than if we had stopped them earlier? I'm not sure I trust my intuition here, though... going (way) back in Swift's history, we added

error: Protocol can only be used as a generic constraint because it has 'Self' or associated type requirements

to try to prevent exactly this kind of trap, and in retrospect it was not a great decision because there is so much you should be able to do with a value of existential type. It's really SE-0309 that let folks go farther down this path... and my pitch here is really just adding more ways to get out of the potential trap.

Yes, I was trying to give examples. I'll tweak the phrasing.

I think the new semantics is better, so if we added a ranking rule it's something we'd want to phase out again later (e.g., with Swift 6). I'd rather not have to do that, but it depends a bit on whether the early indications that this isn't really breaking in practice hold up as we throw more code at this change.

Doug

7 Likes

Should you be able to open an existential in a regular variable binding too, as in:

var foo: any P = ...
var openedFoo: some P = foo

The fact that for x: some P in collectionOfP works suggests that would work too, but the proposal doesn't mention other variable bindings.

10 Likes

This is a fascinating take on the proposal. One could call the proposed

let openedP: some P = <something existential>

an explicit form (1), and then the rules around generic calls are the implicit form.

I'd missed this suggestion before (sorry), but it's really interesting: it takes away all of the heuristics by making the opening explicit, and the rule for "when can you open?" is simply "when you have a value of any type".

I think it follows fairly directly that you'd want to be able to do something like x as? some P, which dynamically tries to cast to P and then gives you back a value of type (some P)?... which is far nicer than casting to (any P)? because you already have that identity.

I suspect that folks would get introduced to as some P through error messages: if they try to do something that requires any P to conform to P (or similar), and they fall into a case where the heuristics I spelled out make opening likely to work, the compiler could suggest as some P.

One interesting thing here is whether or where we would type-erase when there's an explicitly opened existential. For example:

protocol P {
  associatedtype A: Q
}

func f<T: P>(_ value: T) -> T.A { ... }

func g(_ p: any P) {
  let a = f(p as some P)  // is 'a' of type 'any Q' or 'some Q'?
}

I guess the most useful answer is 'some Q', and one would need a type annotation to get an any Q.

Oh, that's quite an oversight on my part. It's only mentioned in passing way down in Alternatives Considered, but it's meant to be part of the proposal. I'll update the text now.

Doug

15 Likes

I do really like the as/as? some P syntax, but I'm not sure it satisfies me as a replacement for implicit opening as-proposed. My previously noted concerns notwithstanding, I would really really like it if we could have implicit opening 'just work.'

The way I see it, when we attempt to pass any P to a some P parameter, there's really only one reasonable thing to do, which is unnecessarily difficult today. I suspect that if we require as some P whenever this comes up, I would still routinely pass values of type any P in places where some P is expected, and then hit the compiler hiccup and have to perform a purely mechanical transformation in order to finish. A value of type any P really is some P.

In my head this behavior is similar to optional promotion from T to T?. If we have a value of type T and a parameter expects "a T or nothing," of course we should be able to just pass a T. Requiring us to wrap all such uses with .some( ) would needlessly disrupt programmers' flow when the compiler already 'knows' the correct thing to do.
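
For anyone who hasn't thought about optional promotion as a conversion, here is a tiny sketch of what I mean. The function name describe is made up for illustration.

```swift
// Hypothetical function taking "an Int or nothing".
func describe(_ value: Int?) -> String {
  switch value {
  case .some(let n): return "got \(n)"
  case .none: return "got nothing"
  }
}

// The compiler promotes Int to Int? implicitly at the call site...
let promoted = describe(42)

// ...so nobody has to spell out the wrapping by hand.
let explicit = describe(.some(42))

let nothing = describe(nil)

print(promoted, explicit, nothing, separator: " | ")
```

Both spellings of the call produce the same result; the first is simply the one everybody writes.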

Of course, optional promotions are likely an order of magnitude or two more common than the implicit opening of existentials would be, but I still think that in situations like this where there's no other reasonable behavior, it makes sense for the compiler to just... go ahead and do it for us.

The rule for "when can any P be converted to some P?" in the implicit case would still be "when you have a value of type any P", and I don't think that examples like cannotOpen2(p1, p2) from the proposal text are dramatically more explainable with as some P:

cannotOpen2(p1 as some P, p2 as some P) // still no good, not obvious why

That said, as I mentioned in my last post, I do really like the as some P syntax in cases where disambiguation is needed, and I think the as? extension you've brought up would be a nice little feature as well. If there are situations where we wouldn't always be able to implicitly open, I think having an inline syntax for it (instead of pulling out into let s: some P = ...) would be extremely useful.

If we treated such situations as true ambiguities that require programmer intervention, then we wouldn't have to worry about silently changing any behavior, and we could still guide the user towards the 'better' option.

Even if we accept the behavior change, though, should we provide a trapdoor for users to prevent opening in cases where they really do want to pass the existential type as a generic argument? E.g., we could still have under this proposal:

func acceptsBox<T>(_ value: T) { /* ... */ }

func passBox(p: any P) {
  acceptsBox(p) // 'T' inferred as underlying type of 'p'
  acceptsBox(p as P) // 'T' inferred as 'P'
}
5 Likes

One thing I am concerned about is self-conforming protocols.
Would there ever be a reason to use the type any SelfConformingProtocol as some SelfConformingProtocol without unboxing it first? If there is, what syntax would it use?

func f<T: Error>(_ value: T) -> T { ... }

func g(_ p: any Error) {
  let a = f(p)  // is 'a' of type 'any Error', 'some Error (hiding `any Error`)' or 'some Error (hiding type boxed by `p`)'?
  let b = f(p as some Error)  // is 'b' of type 'any Error', 'some Error (hiding `any Error`)' or 'some Error (hiding type boxed by `p`)'?
}
3 Likes

I think any implicit opening (like any other implicit conversion) has to be ranked as a worse match than doing nothing at all, such that f(p) would continue to get an argument of type any Error just as it does now.

By contrast, f(p as some Error)—if we agree with the idea to support this syntax—would open the existential (irrespective of whether the existential box itself conforms) and thus pass an argument of opaque type some Error concealing the underlying type of p rather than the existential box.

2 Likes

FWIW, I don't think this kind of code would be very common. Most of the time when inferring as some P was wrong the compiler would just implicitly wrap it back up in an any box like today.

In addition, as some P types are becoming more common in Swift (through this and several other changes lately), it makes sense to beef up error messages in this area as well. One of the only places left where any P will be useful is in heterogeneous collections like the array here, so hopefully we can get targeted fixits suggesting explicit localValues: [any P] in code like this.

Since I'd argue that significant error message / fixit support for that kind of code (combining multiple not-necessarily-the-same some P) is going to need to happen anyway, it's worth taking the potential breakage and exercising those fixits in order to avoid a bunch of programmers spelling out : some P for this feature and then removing it again in the next major language version.

2 Likes

If Never is bottom type,
when (any P)? is dynamically nil,
could we open it as Never?

Overall in favor of that, seems like a good win and a great solution :clap:

—-

Quick questions just to ensure I didn’t misunderstand anything when reading through and that those are just small typos that slipped thru… as opposed to me missing a key point :sweat_smile::

  1. In the last paragraph of “Proposed Solution”
func explicitlyOpen(p: any P) {
  // Is that really the expected syntax?
  let openedP: some P = any P
  // Or is that just a typo and it should be this instead?
  let openedP: some P = p
}
  2. In “When can we open an existential?”
// Extra ‘?’ is a typo here, right? Not some fancy syntax I missed? (I am a bit behind in following all the proposals, so…)
func openInOut<T: P>?(_ value: inout T) { }
  3. And a couple of lines after, just to confirm I got it right:
cannotOpen4(xp)            // cannot open the existential in 'X<any P>' there isn't a specific value there.

The reason we cannot open here is not because of the structural position of the existential in the type, but because T is a phantom type on struct X<T> {}, right (hence the “no specific value” in your explanation)? If we had struct X2<T> { let t: T } instead, the call would work and we could then open the X2<any P> in that case, correct?

If that’s indeed so, I think it could help to add the X2 example as well (in the “can open” section) to clarify the distinction :upside_down_face:

Thanks for the confidence check :slightly_smiling_face:

2 Likes

Another thing worth mentioning in this proposal is that it eliminates the special-case treatment of type(of:). The way type(of: concrete) gives you a non-existential Concrete.Type, but type(of: existential) gives you the existential any (Protocol.Type), would fall out from opening the existential and covariantly erasing the result of the one generic function type<T>(of: T) -> T.Type.
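
As a reminder of the current type(of:) behavior being described, here is a small sketch. Concrete is a hypothetical type, P is a stand-in protocol without associated types, and the comments on the static types reflect my understanding rather than anything stated in the proposal.

```swift
protocol P {}

// Hypothetical conforming type.
struct Concrete: P {}

let concrete = Concrete()
let existential: any P = Concrete()

// With a concrete value, the result is statically Concrete.Type.
let staticMeta = type(of: concrete)

// With an existential, the result is statically an existential metatype,
// but at run time it holds the metatype of the underlying value.
let dynamicMeta = type(of: existential)

let staticMatches = staticMeta == Concrete.self
let dynamicMatches = dynamicMeta == Concrete.self
print(staticMatches, dynamicMatches)
```

In other words, type(of:) already looks through the box to the underlying type, which is the behavior that opening would explain without a special case.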

15 Likes

If the implicit behavior was unwrapping, wouldn’t that cause a change in behavior on existing code?