Lifting the "Self or associated type" constraint on existentials

Sure, users want the limitations lifted, but most of them don't understand that some of the limitations are inherent and that generalized existentials won't eliminate the pain.

I loathe writing type-erasing wrappers as much as the next guy. But generalized existentials don't eliminate the need to write them, nor (AFAICS) do they make it much easier to do so. An existential Collection type would not remotely replace AnyCollection, because that type would not conform to Collection. In fact, most of the basic things we expect from a Collection would not be available: you can't index it, and first could at best return Any?.
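
For concreteness, here is a minimal sketch of the contrast using today's AnyCollection; the sum function below is hypothetical:

// AnyCollection conforms to Collection, so the full Collection API survives erasure.
let erased = AnyCollection([1, 2, 3])
let first: Int? = erased.first                               // element type is preserved
let second = erased[erased.index(after: erased.startIndex)]  // indexing works

// A generalized existential `Collection` would not conform to Collection,
// so generic code like this could not accept it:
func sum<C: Collection>(_ values: C) -> Int where C.Element == Int {
  return values.reduce(0, +)
}
let total = sum(erased)   // works today only because AnyCollection conforms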

Nobody has proposed a set of features really targeted at the pain points you cite; that would be a really interesting exploration (I suspect the answer has at least as much to do with solving what I call “the API forwarding problem” as it does with existentials). Even if what is being proposed makes incremental progress toward solving those problems, though, I fear that at the specific place we arrive, having applied that increment, we'll have done more harm than good, as explained in my opening message.

I totally understand we have problems with the limitations on existentials, but before we charge ahead lifting limitations IMO we have to consider how the whole language works and fits together once we've done that. The future I see when I do that mental exercise has what I consider some serious new problems—problems that aren't addressed by simply saying “the limitations are bad so we should lift them.”

11 Likes

No it is not. Please see the example P I gave in this post, along with a complete explanation of why the existential type P can never support the API required by protocol P.
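
For readers skimming the thread, here is a representative sketch (not the original example) of the kind of requirement that causes this; the Shape protocol and union(_:) requirement are hypothetical:

protocol Shape {
  // A contravariant Self requirement: each conformer combines with its own type.
  func union(_ other: Self) -> Self
}

struct Circle: Shape {
  func union(_ other: Circle) -> Circle { return other }
}

// For the existential type `Shape` to satisfy Shape, it would need
// `func union(_ other: Shape) -> Shape`, i.e. the ability to combine with
// *any* other Shape, but each witness only knows how to combine with its
// own concrete type, so the requirement can't be met.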

they will be drawn toward the syntactically lightweight use, which I think is more often inappropriate.

I would argue that points to a flaw in the language that should be solved

I agree; it should be solved if possible, but…

rather than artificially imposing pain on users because we think we know better.

…the pain is not artificial; AFAICT it's inherent, as I hope my example shows.

It doesn't go nearly far enough.

I agree again. We should take a good look at the actual use cases that create what you call a “mile-high wall of overhanging granite” and design language features (or write educational articles, if that is more appropriate to our analysis) that address those needs. Incrementally chipping away at the restrictions on existentials does not necessarily seem like it leads to a good answer, and in the meantime, as I have mentioned elsewhere, just doing that could leave us with some serious new problems.

IMHO these concepts are impossible for mortal programmers to grasp because the bar for working with them is so high.

I don't think so. The problem is simple: we've created a confusing language feature by making the creation of existential types implicit. There are three sources of confusion that I know of:

  1. Sometimes when you declare a protocol, you also get an existential type that you can use to handle instances of any conforming type. Sometimes, though, depending on details of how you declared the protocol's interface, you don't get an existential type.
  2. Even when you do get an existential type, it doesn't conform to the protocol (see the sketch after this list).
  3. Also, some parts of the protocol's API may be unavailable on the existential type (rare today, but true for extension methods with Self arguments).
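
A minimal sketch of confusions #1 and #2 as things stand today; Named, Keyed, and User are hypothetical:

protocol Named {
  var name: String { get }
}
protocol Keyed {
  associatedtype Key
  var key: Key { get }
}

struct User: Named, Keyed {
  var name: String
  var key: Int
}

// #1: declaring Named gives you a usable existential type; declaring Keyed does not.
let n: Named = User(name: "a", key: 1)
// let k: Keyed = User(name: "a", key: 1)   // error: can only be used as a generic
//                                          // constraint (Self or associated type requirements)

// #2: even the existential you do get doesn't conform to its own protocol.
func greet<T: Named>(_ value: T) { print(value.name) }
// greet(n)   // error: protocol type 'Named' cannot conform to 'Named'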

Generalizing existentials in the way proposed means that the compiler would no longer bite you right away when you try to use an existential type; hooray! But instead, the compiler will bite you when you try to use the parts of the API that today are preventing us from creating the existential type. That takes away confusion #1 but compounds confusion #3. That could be a much worse place to be than the situation we have today for reasons cited in my first post.

I'm also not sold on the idea that existentials must impose a performance cost; maybe static compilation makes some form of specialization for an existential (aka non-generic) function impossible, but is that true for all cases or just some? (If we were to admit a JIT à la JS, then it is definitely possible to provide dynamic specializations of a function, switched based on argument type.) I'm not saying Swift could, would, or should do this, just muttering questions out loud.

Just some: when the compiler can “see” all the types involved, it can use the information (e.g. knowledge that two types must be the same) to optimize code just as well as if that constraint were captured in a generic. But that can only happen under special conditions and trying to broaden the cases that get optimized usually requires optimizer heroics (a big development investment) and increased compile time (which is bad for end users). To be fair, most cases with resilient generics will not optimize well, either.

So, yes: using existentials where you could use generic constraints implies a performance cost in the general case. Applied without careful discrimination across the majority of Swift programs, that cost cannot help but be significant. (And, yes, Swift should probably have a JIT.)

I expect the addition of opaque result types to satisfy the majority of needs for which people want to use existentials today

It might solve many of the cases where protocol methods return Self, but that is only a subset (I'll admit I don't know how big that subset is in realistic codebases; maybe it is far larger than I think?).

? I don't know of any special applicability to protocol methods that return Self. Opaque result types cover all cases where an existential would be used purely to avoid exposing the actual type of a return value as API.
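
A short sketch of that use of opaque result types; the Drawable protocol and Square type are hypothetical:

protocol Drawable {
  func draw() -> String
}

struct Square: Drawable {
  func draw() -> String { return "square" }
}

// `some Drawable` hides the concrete return type from clients of the API while
// the compiler still knows it statically; no existential (and no erasure) is involved.
func makeDefaultDrawable() -> some Drawable {
  return Square()
}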

5 Likes

It's not a problem because nobody expects Any to have usable API.

people very much like the ability of Swift to statically enforce proper typing of data. The point of using protocols as types is to reduce the need for dynamic checks or strong coupling by allowing more precise forms of erasure.

Yeah, sure. But under the proposal, Swift won't let you be “precise” about which APIs get erased on the existential type. It will decide for you, based on some rules of type soundness that most people can't and/or legitimately don't want to understand. That problem exists on the margins today, and will get worse under the proposal.

As for compiler optimisations, there is no intrinsic reason an existential must be less optimisable than a generic parameter.

Yes, for all practical purposes, there is. Not in every case, but in the general case.

For example, we have a (relatively new) ExistentialSpecializer optimisation pass which transforms functions of the form func f(_: P) to func f<T: P>(_: T), at which point our other generic specialisation infrastructure can take over. https://github.com/apple/swift/blob/a4103126b309549166182744265991c9a4db2819/lib/SILOptimizer/FunctionSignatureTransforms/ExistentialSpecializer.cpp

And yet it can't produce optimal code for func f(_: P, _: P) if it would otherwise have been written as func f<T:P>(_: T, _: T).
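
To make the difference concrete (with a hypothetical Renderer protocol): the generic signature records that both arguments share one concrete type, while the existential signature cannot express that relationship at all.

protocol Renderer {
  func render() -> String
}

// Existential parameters: each argument carries its own dynamic type and witness
// table, and nothing says the two are the same concrete type.
func combineErased(_ a: Renderer, _ b: Renderer) -> String {
  return a.render() + b.render()
}

// Generic parameters: both arguments are statically known to be the same T,
// which the specializer can exploit directly.
func combineGeneric<T: Renderer>(_ a: T, _ b: T) -> String {
  return a.render() + b.render()
}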

Protocol existential self-conformance is a massive issue, though. I'll come back to this when I have some time to elaborate, but basically, I think it's a flaw in the design of the type-system. The type of an existential should be some kind of type-existential (i.e. "any type which conforms to P") rather than the protocol type P itself.

That begins to get at the root of the problem I am talking about. The point of confusion is that for many protocols P, the existential P cannot satisfy the requirements of P. Note that this is inherent as long as you allow init requirements, for reasons I have outlined in another post.

Opaque result types are a great grooming tool for API authors, but they wouldn't solve my most pressing need for beefed-up protocol existentials, which is that I sometimes need to store something which may be populated by a variety of different types (e.g. var myThing: Collection where Element == String ). Generic parameters are awkward for this - if this was inside a struct, MyStruct<X> and MyStruct<Y> would have different layouts and could not be substituted.

I can't quite visualize your example “inside the struct.” Care to elaborate? Also I don't know what you mean about generic parameters being “awkward for this;” I'd have thought they simply wouldn't work in the case described. But yeah, I agree that opaque result types wouldn't solve your problem, which requires either type erasure or maybe rethinking your approach at a higher level to make type erasure unnecessary (not claiming the latter is possible, BTW).

2 Likes

Do I understand this correctly: in a world where we have generalized existentials, some function func foo(_: P) can never instantiate a new instance, because P can be any conforming type in the program and thus an init() requirement can never be satisfied. Hence P doesn't conform to P?

A protocol with type parameters (Self; init(), which can be thought of as func init() -> Self; and associated types) is incomplete by definition. Working with these kinds of existentials has some inherent language design issues that need to be addressed.
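
A minimal sketch of that reasoning, with a hypothetical Maker protocol:

protocol Maker {
  init()
}

func fresh<T: Maker>(_ type: T.Type) -> T {
  return type.init()   // needs a concrete type to instantiate
}

// If the existential `Maker` conformed to Maker, then fresh(Maker.self) would
// have to pick *some* concrete conforming type to construct, and nothing in
// the program determines which one.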

I'm coming around to your way of thinking, re: an explicit annotation on the protocol or a different spelling like Any<P> that highlights the differences.

I'm not convinced that this is as bad an option as you see it to be. In my eyes, it moves any errors or issues closer to the places where they're relevant and therefore easily explained. For example, rather than being presented with a message that a protocol can't be used as an existential, you're instead presented with a message that a particular method can't be called since the type is unknown (or else is erased to Any where possible). That then naturally leads to imposing more specific constraints (i.e. the <T : Protocol> syntax) to get to a point where that information is available.

Fundamentally, this proposal just shifts the error closer to the actual problem (i.e. you don't know what the type is) while enabling use-cases that currently require cumbersome workarounds.

9 Likes

I would agree with @dabrahams that such an eventuality would be strictly, and significantly, worse than the status quo for the reasons below. This is why I said upthread that lifting the "Self or associated type" restriction should happen only in tandem with significantly improved diagnostics that allow users to avoid the scenario you outline above.

Certainly there are some "use cases that currently require cumbersome workarounds" that would be enabled by lifting the restriction, and I would very much like to be able to enjoy that functionality. However, @dabrahams outlines above why lifting the "Self or associated type" restriction won't actually enable or enhance a large portion of use cases that people have mentioned even in this thread, such as an existential collection type replacing AnyCollection.

What we often see in the "Using Swift" portion of these forums is that users reach for existential types when they should be using generic constraints--"should" not merely for performance reasons, but because they truly do not need or intend for any type erasure and often do intend to access APIs that require the type relationships being erased. That they run into the "Self or associated type" restriction now and would run into the "method can't be called" issue in the future is not the actual problem but only a symptom of that problem (i.e., using existential types instead of generic constraints).

Today, users are told upfront of this fact if they are dealing with a protocol with Self or associated type constraints. Without the "Self or associated type" restriction, then, more uses of existential types by the typical user would fall into the category of problems that would be best served by features other than existential types. This becomes even more so the case if/when opaque types and other enhancements are added to the language. Given the limited extent to which intentional use cases would actually be enabled by lifting this restriction, one must be careful that it's not outweighed dramatically by the extent to which unintentional use cases would be encouraged--and "unintentional" here referring not to the intentions of language designers but to the intentions of the user who actually does not want or might not even know about the type erasure that's going on.

One component of solving this problem might be to change the spelling so that Any<P> (or, for reasons that will become apparent below, I'll use an alternative strawman syntax Existential<P>) rather than P is the existential type. The goal here is to reduce as much as possible the scenario where users reach for existential types without even realizing that they are doing so. I have to admit that, even after years of working with the language, I still catch myself sometimes unintentionally using an existential type when I meant to have a generic constraint!

A spelling such as Existential<P> neatly avoids the baffling situation that "P does not conform to P," since even on visual inspection it's clear that Foo<Bar> has no reason to conform automatically to Bar.

I'd imagine it could then be possible for authors to conform Existential<P> to P by manually implementing the necessary methods in an extension (i.e., extension Existential where Protocol == P). (If the existential type were to be spelled Any<P>, then extension Any where Protocol == P would naturally prompt the question of whether one can extend Any without constraints, which is a different topic altogether best avoided here.)

5 Likes

An initializer or static method requirement doesn’t technically preclude a protocol type from self-conforming as long as there’s at least one conforming type: the protocol could just pick that type and construct it / call the method on it. But if that type isn’t unique (which is reasonable to assume a priori), picking one specific type would be an arbitrary choice, so as a policy matter it doesn’t make sense to allow it. So sure, maybe with an annotation it could be done, if there’s really a reasonable default that wouldn’t cause more confusion than it saved.

Don’t think about it in terms of a function that takes an actual value of the protocol type. Think of a generic function over T: P. What actually happens if you use a particular requirement when T is dynamically the protocol type P itself?

3 Likes

I haven't had time to properly digest and contemplate the argument @dabrahams is making yet, so nothing I say here should be considered a direct response to that. However, I do want to point out now that the statement above is simply not true.

Lifting the restriction would significantly simplify designs that store type-erased values and use various dispatching strategies to interact with the existential. The current workarounds I'm aware of rely on introducing an additional protocol which can be used as an existential and dispatching through that. Lifting the restriction would allow storage, casting, and dispatching to happen directly on the PAT protocol itself, which would streamline designs significantly. I have kept this pitch in mind since it began and have already run into several use cases where it would be extremely handy.
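
For context, here is a sketch of the kind of workaround being described; the Handler and AnyHandler protocols are hypothetical. A secondary protocol with no associated types serves as the existential, and everything is dispatched through it.

// The extra, associated-type-free protocol that can be used as an existential today.
protocol AnyHandler {
  func handleAny(_ value: Any) -> Bool
}

// The "real" protocol with an associated type (a PAT).
protocol Handler: AnyHandler {
  associatedtype Input
  func handle(_ value: Input) -> Bool
}

extension Handler {
  // Bridge from the untyped entry point to the typed requirement.
  func handleAny(_ value: Any) -> Bool {
    guard let typed = value as? Input else { return false }
    return handle(typed)
  }
}

// Storage, casting, and dispatch all have to go through AnyHandler rather
// than Handler itself.
var handlers: [AnyHandler] = []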

I don't have an opinion on this syntactic change yet, but I don't buy the argument that it will reduce accidental use of existentials. The reason users often reach for existentials is that many programmers are most familiar with Objective-C protocols or with interfaces from other object-oriented languages. They will reach for a tool that feels familiar in this way regardless of the syntax used to invoke that tool.

Allowing extensions on the existential would be a really useful way of allowing existentials to conform to protocols (including their defining protocol). On the other hand, allowing extensions on existentials could introduce significant confusion between those and protocol extensions. I think we need to study the use cases and consider alternative solutions closely before heading too far down that path.

7 Likes

Yes, indeed, I too am very excited about the fact that lifting this restriction would significantly simplify designs that store type-erased values. However, that does not change the fact that many uses discussed above do not fall into the category of things that would be simplified by lifting the "Self or associated type" restriction. Just quickly scrolling through some of the items that people mentioned here:

  • @karim mentioned Equatable conformance for existential types: lifting the restriction would not allow that
  • @dmcyk mentioned not having to create type-erased boxes: lifting the restriction would not allow that (for reasons @dabrahams outlines above)
  • @Karl mentioned replacing AnyHashable with a thin wrapper around Hashable: lifting the restriction would not allow that
  • @rbishop mentioned self-conformance for existential types: lifting the restriction would not allow that
4 Likes

That's a nice positive story to tell ourselves about the potential outcome, and I might even be inclined to believe it if we had a list of actual use-cases that would demonstrably become much nicer (hint, hint).

The problem is, if it just (as you say) “shifts the error”, then it necessarily doesn't remove the fundamental source of confusion around protocols and their existential types. If what @xwu says is true, that most people run into this wall in places where even a generalized existential would be a poor fit for the use case, then the wall—as frustrating as it might be—is actually performing a valuable service.

5 Likes

Although @John_McCall described a special case where the init() requirement can reasonably be satisfied, and ways in which it can usually be satisfied arbitrarily (but IMO unreasonably), IIUC in the general case (when no types conform to the protocol) it can't be satisfied at all.

Anyway, although init() is simple to understand, it might not be as much of a killer example as func f(_: A) -> A.

If your example would be enhanced by this proposal, it would be very instructive to see how the code could be improved were the proposal accepted. I note, however, that the technique you showed is generally useful even where there are no existentials; I use it that way to deal with heterogeneous “collections” of similar items without type erasure, and I'm pretty certain my use case would see no benefit from generalizing existentials. I point that out because I think that if it can be made much better, there's probably a more general feature that would benefit both of us.

If we are to believe all the grumbling we've heard about angle brackets, changing the syntax could easily be enough to make existentials not “feel familiar.” But doing that alone strikes me as a strictly punitive approach that I'd like to avoid. I would like to also address the fundamental confusion, increasing expressivity for protocol authors and comprehensibility and predictability for protocol users.

I'm keen to agree with @dabrahams here. The way I understand it, shifting the error can be worse than the status quo because the amount of refactoring that has to be performed in order to remove the error can be dramatically bigger.

Lifting the constraint indeed wouldn’t help to avoid creating type-erased boxes, but it could somewhat simplify them when working with members that don’t use Self/associated types.
e.g. (primarily for being able to mock things) I often write such wrappers:

protocol Foo {
  associatedtype Bar

  var x: Int { get }
  var y: Int { get }
}

// Type-erasing wrapper: stores the conforming value's properties as closures so
// the concrete type K never appears in the stored layout.
struct AnyFoo<T>: Foo {
  typealias Bar = T
  private let _getX: () -> Int
  private let _getY: () -> Int

  var x: Int { return _getX() }
  var y: Int { return _getY() }

  init<K: Foo>(_ val: K) where K.Bar == T {
    self._getX = { return val.x }
    self._getY = { return val.y }
  }
}

Being able to use such simple members directly would decrease the memory footprint of such a wrapper and simplify the code.
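
For reference, the wrapper above would be used roughly like this (IntFoo is a hypothetical conforming type):

struct IntFoo: Foo {
  typealias Bar = Int
  var x: Int
  var y: Int
}

let wrapped = AnyFoo<Int>(IntFoo(x: 1, y: 2))
print(wrapped.x, wrapped.y)   // 1 2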

Technically, is there a reason func f(_: A) -> A couldn't be mapped to func f(_: Any) -> Any with a runtime trap if the argument isn't of the expected type? By itself this isn't a very satisfactory solution, but I think it's close to what people would expect to happen. And then maybe there's a way to improve on that by making the runtime trap clearly visible in the code, like in ex.f(a as! ex.A) or something like that.
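
A sketch of what that mapping could look like if written out by hand today, with the trap spelled explicitly via as!; the Transformer protocol and AnyTransformer wrapper are hypothetical:

protocol Transformer {
  associatedtype A
  func f(_ value: A) -> A
}

// Hand-written analogue of erasing `A` to `Any`.
struct AnyTransformer {
  private let _f: (Any) -> Any

  init<T: Transformer>(_ base: T) {
    _f = { any in base.f(any as! T.A) }   // traps if the argument has the wrong type
  }

  func f(_ value: Any) -> Any {
    return _f(value)
  }
}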

Although it's true that these fundamental design challenges exist, associated types are the wrong thing to blame for them—it's contravariant requirements that pose these challenges. It seems like a major problem to me that the existing restriction misplaces the blame for the complications. I'm all for deemphasizing type erasure; I absolutely agree that there are better alternatives in most situations and that it was a mistake to spell existential types as the bare protocol name. Beyond making incremental progress toward the goal of generalized existentials, I'm more concerned that the state we're in now is actively harmful, and that it's also threatening to damage the language design in other areas, such as protocol resilience, if we choose to stay where we are.

15 Likes

IMO, the general thing to support for existentials for which self-conformance is desirable, but there isn't a natural covariant generalization for the protocol requirements, is to allow the existential type to be extended with an explicit conformance. This would be necessary anyway for resilient public protocols that want self-conforming existentials, since they would need to promise that they will remain self-conforming if they add new requirements. Then you could think of a covariant existential's conformance as an automatic derivation rule, while still allowing the automatic derivation to be overridden. For example, for Hashable:

// Straw syntax `Any<P>` for the explicit existential type
extension Any<Hashable>: Hashable {
  static func ==(l: Any<Hashable>, r: Any<Hashable>) -> Bool {
    return AnyHashable(l) == AnyHashable(r)
  }

  func hash(into hasher: inout Hasher) {
    AnyHashable(self).hash(into: &hasher)
  }
}

However, I think the more commonly useful thing would be to allow for implicit opening of existentials when passed as generic arguments; that strikes me as more likely to be what you mean when passing a single existential value into a generic function, and would be more efficient as well.

1 Like

Has there been any consideration of allowing a subset of protocol-as-existential parameter usage for those use cases where it would be possible to produce a transform to generics (along the same lines as ExistentialSpecializer)?

If existentials have all the problems and generics don't, then as long as the syntax can be transformed to generics, there's no problem, right? So, for example, allowing the use of associated types or Self when there is a transform available. So func f(_: P) would work right away, but func f(_: P, _: P) would produce an error like "Two or more parameters using protocol existentials are not supported; use generics instead", and then maybe a fixit to guide you in the right direction.

There's nothing wrong with writing f(_: P, _: P), though; that's like writing f<T: P, U: P>(_: T, _: U). There's unlikely to be much additional optimization to be had from constraining both parameters to the same type if the arguments being the same type doesn't matter to the function implementation.

Just to be clear, I meant that in the case where being the same type does matter, the version with existentials will have to pay for the dynamic check (and also compromise static type safety) while the generic one will not. That's the cost of prematurely erasing type information.
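
A small illustration with hypothetical Sized and File types: when the caller cares that the arguments (and the result) share one concrete type, the generic signature preserves that statically, while the existential one pushes it to a runtime cast.

protocol Sized {
  var size: Int { get }
}

// Generic: the result is statically known to have the caller's concrete type.
func larger<T: Sized>(_ a: T, _ b: T) -> T {
  return a.size >= b.size ? a : b
}

// Existential: the same-type relationship is erased. If a caller needs the
// concrete type back, it has to re-check dynamically with a cast that can fail.
func largerErased(_ a: Sized, _ b: Sized) -> Sized {
  return a.size >= b.size ? a : b
}

// let winner: File = larger(fileA, fileB)            // type relationship enforced statically
// let winner = largerErased(fileA, fileB) as? File   // dynamic check at the call site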

As far as the diagnostic quality issues @xwu and others brought up, I think we could at least do the following:

  • The historic code for dealing with existentials simply elided unusable requirements from name lookup on the existential type. Now that we have the availability checking infrastructure, we could instead treat these members as unavailable, so that you get a sensible diagnostic as to why the method can't be used.
  • For the common case where someone wrote an existential as an argument, we could offer a fixit to turn it into a generic parameter as part of the availability diagnostic.
11 Likes