Lifting the "Self or associated type" constraint on existentials

Well, but not if you lift the restriction on there being only one archetype in the environment, I think. Maybe it works because the nesting relationships are right.

So we wouldn't be able to replace AnyHashable with a Hashable existential, but it could basically be a thin wrapper around one, right? Something like...

struct AnyHashable: Hashable {
  let val: Hashable

  func hash(into hasher: inout Hasher) { val.hash(into: &hasher) }
  static func ==(lhs: AnyHashable, rhs: AnyHashable) -> Bool { return lhs.val._isEqual(to: rhs.val) }
}

extension Hashable {
  func _isEqual(to other: Hashable) -> Bool {
    guard let other = other as? Self else { return false }
    return self == other
  }
}

AnyHashable also has some additional logic to deal with bridging, so that bridged types equate and hash interchangeably.


The current error about the “self or associated type” restriction is a huge roadblock for new users trying to make their code work. It’s obscure, feels arbitrary, and is hard to understand even when you try to read up on it. Even if you would fail later (since generalized existentials are not there yet), better diagnostics would at least give you a better chance of understanding the limitations, since the problem would be more specific, and there could be a more exact workaround for that specific issue. So overall I think this is a step in the right direction.


Yes please! I think I might cry for joy. I gather it's not the whole story re: PAT usability issues, but it sounds good to me.

I’m a pretty seasoned dev, and I constantly hit limitations around PATs that I struggle to reason about, so my default assumption is always that it’s best to avoid them. I have a specific desire to extend protocols that have PATs with derived protocol types that also use those PATs, and to have other generic functions able to match the base protocol type but then use “is” to check for a more specific subtype. I think this will solve that?

I understand this to mean: currently we may already have to prevent some parts of a protocol's interface (the part provided by extension methods that use Self arguments) from being used on existentials:

$ swift
Welcome to Apple Swift version 4.2 (swiftlang-1000.11.113 clang-1000.11.45.5). Type :help for assistance.
  1> protocol P {}
  2> extension P { func f(_: Self) {} }
  3> func g<T:P>(_ x: T) { x.f(x) } // OK
  4> func h(_ x: P) { x.f(x) }
error: repl.swift:4:18: error: member 'f' cannot be used on value of protocol type 'P'; use a generic constraint instead
func h(_ x: P) { x.f(x) }
                 ^ ~

There are other ways to address this issue (see below).

Regardless, I have some general concerns about going this way and a special concern about doing it right now.

My general concerns are:

  • Our users may not sufficiently understand the difference between using type erasure and using type constraints. Right now the restrictions on existentials tend to prevent unintentional type erasure early on. Because using an existential is syntactically lightweight when compared to using type constraints, and similar to using a base class (which is likely familiar to more people), I fear that with fewer restrictions on existentials, users will code themselves into inappropriate type erasure by following the path of initial least resistance, and then find themselves “forced” to add complexity in the form of dynamic checks at the points where they need to recover the type information. Also, this premature type erasure will inevitably hide optimization opportunities from the compiler.
  • No matter how general we make our support for existentials, if we only lift restrictions there will always be the fundamental “weirdness” illustrated by the interactive session above: some parts of a protocol's declared interface can be unavailable on instances of the protocol type, with the inevitable consequence that an instance of an arbitrary protocol P will never conform to P. As far as I can tell, that's just really hard to explain to people in a way that makes sense. Right now the issue is something of an edge case but if we make it easier to write code that uses existentials I fear it will become a much bigger problem as people develop a significant investment in large bodies of code that now satisfy the compiler. If we push our support for generalized existentials to the limit, almost everything will seem to work: most things you could do with type constraints could also be done smoothly with type erasure, but there will be these very odd and even-harder-to-explain cases where the compiler stops you. Also, there will be a pervasive performance cost when type erasure is used.
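The dynamic-check cost of premature erasure described in the first bullet can be sketched in a few lines (toy types, not from the thread):

```swift
protocol Animal {
    var name: String { get }
}

struct Dog: Animal {
    let name: String
    func fetch() -> String { return "\(name) fetches the ball" }
}

struct Cat: Animal {
    let name: String
}

// Type-erased version: the concrete type is gone, so Dog-specific
// behavior can only be recovered with a dynamic check.
func describeErased(_ animal: Animal) -> String {
    if let dog = animal as? Dog {
        return dog.fetch()
    }
    return animal.name
}

// Constrained-generic version: the concrete type flows through
// statically, and the compiler is free to specialize the call.
func describeGeneric<T: Animal>(_ animal: T) -> String {
    return animal.name
}
```

The erased version compiles just as easily, which is exactly the "path of initial least resistance" worry: the `as? Dog` check only shows up once type information needs to be recovered.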

My concern with trying to generalize existentials now is that I expect the addition of opaque result types to satisfy the majority of needs for which people want to use existentials today, and if we do both things at once:

  • people will be confused about which feature to use.
  • they will be drawn toward the syntactically lightweight use, which I think is more often inappropriate.
  • we will not get good data on which of the new features is best able to satisfy our users' needs.

I have a couple of ideas that may help address some of these problems:

  • Spelling all existentials as Any<SomeProtocol> would go some way toward addressing the problem of “least resistance”—though I'm not at all sure it's far enough.
  • I have always thought a big part of our problem is that protocols that are meant to be used for type erasure are fundamentally different from those meant to be used as constraints, yet we declare them the same way. Therefore, the compiler simply has to let us create situations like the one shown at the top, where we extend them with Self arguments or have static requirements that prevent self-conformance. Even when we develop better tools for library evolution, the compiler will have to let us evolve a protocol by adding a defaulted associated type even if that protocol is being used as an existential in some other module—because it can't see the other module. If, in order to be used as an existential, a protocol had to be so annotated:
    existential // strawman syntax
    protocol P {}
    The compiler could prevent the use of Self arguments in extensions, the declaration of static requirements, and the addition of associated types.

Even if both these ideas were implemented, though, I'd still be concerned about doing this within 12 months of adding opaque result types, for reasons stated above.


Why is this? Why would we want to hard-code protocols into two distinct categories?
I have always thought that the inability to use PATs (Protocols with Associated Types) or Self-constrained protocols as proper existentials is merely a gap in the current implementation, and something that will be fixed in the future. I do not believe that people using protocols should have to set in stone which category a protocol belongs to; I would like protocols to be able to do as much as the implementation allows, and if possible to have PATs working in every situation (I believe some sort of automatic type erasure was even tossed around).

While I like that a programming language encourages good coding habits, I think straightforward and easy to use code is sometimes better than super performant code. There are situations where the most performance is needed, and there needs to be a way to do that, but I wouldn't want to impose complexity on users just for the sake of guaranteeing the most performant code.


FWIW, I don't know the term “PAT.”

As to why, I thought I explained it, but let me try again, in fewer words this time.

  • An existential type is fundamentally different from a type conforming to the corresponding protocol. When existentials are allowed to be created from protocols with arbitrary requirements, we create the “some of P's requirements can't be used on a P” and “P does not conform to P” paradoxes, which are very bad for usability and comprehensibility.
  • We intend to support protocols evolving to add new requirements, as long as they are defaulted. If we allow arbitrary new requirements to be added to a protocol that has been used as an existential, we either have to live with “P does not conform to P,” or adding new requirements with defaults may break existing code.
  • Declaring these protocols differently would allow the compiler to prevent us from creating these difficulties.
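The evolution hazard in the second point can be sketched with a toy protocol (hypothetical names):

```swift
// Version 1 of a hypothetical protocol shipped with one requirement.
protocol Greeter {
    func greet() -> String
    // Version 2 adds a second requirement...
    func greetLoudly() -> String
}

// ...together with a default implementation, so conformances
// written against version 1 keep compiling unchanged.
extension Greeter {
    func greetLoudly() -> String { return greet().uppercased() }
}

// A conformance written before greetLoudly existed.
struct Casual: Greeter {
    func greet() -> String { return "hi" }
}
```

Concrete conformances like `Casual` survive this evolution, but an existential `Greeter` value has no such guarantee unless "P conforms to P" is somehow preserved across the change, which is the tension described above.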

“PAT” is an acronym for Protocol with Associated Type. I coined the term in 2015 in a talk at the Swift functional programming conference that year (Alexis Gallagher - Protocols with Associated Types, on YouTube).

That talk is a sort of a mirror of the confusion of Swift devs like myself at the time, who were excited about Swift protocols and your WWDC talk about protocol-oriented programming but also quite surprised by the limitations concerning associated types. I guessed at the motivations for this aspect of the language, relating it to other languages and to your talk. This was in the days before Swift was open-sourced, when we couldn’t just ask about these things.

I suspect many people, like me, are still a bit confused about the fundamental reason for these limitations. I’d still like to understand it better! (For instance, I am embarrassed to admit, I doubt I could define “existential” to my satisfaction. I bet I’m not the only one.)

Fwiw, I am sure the essential reason for the confusion then was that protocols with associated types behaved almost exactly unlike protocols in Objective-C, which was most everyone’s reference point. I suspect this is the same thing you are saying more precisely now, when you distinguish protocols-for-defining-constraints from protocols-for-type-erasure, but I’m not sure.


The limitations are the result of missing implementation work and incomplete language design rather than any fundamental difference. The complications of the sort Dave raises, where an existential has a different set of behaviors or methods from conforming types, don’t really arise because of associated types, but because of covariance and contravariance. It’s been an unfortunate byproduct of the “self or associated type” restriction that it’s introduced this confusion.


Since no one else answered this...

A type's kind is its signature for creating new types. That is, the kind of a type constructor specifies the kind(s) of its argument(s); the kind ★ is used for individual types.

A generic function type's rank is its signature for type instantiation. The following computes the rank of a type; for simplicity we assume all functions are curried. Rank 0 functions are not generic; Rank 1 functions have an outer ∀T introducing some type variable T; Rank n+1 functions are higher-order functions which receive a generic function of Rank n as their first parameter.

So, a higher-kinded type is nothing more than a type constructor. Usually, these come up in discussions about what types can stand in for a type variable; in most languages, type variables only range over types of kind ★. Easy enough.

A higher-rank type is the type of a higher-order function satisfying two conditions: first, it is generic already; additionally, its argument, a function, is itself also generic (or likewise for its argument, if it is also a function, and so on, for arbitrary levels of nesting). This second condition specifies a nested inner generic context that is in some sense independent of the outer generic context specified by the first condition. The upshot is not just that you can pass around generic functions — such as the polymorphic identity function forall a. a -> a or the polymorphic constant function forall a b. a -> b -> a — but that you can use them as polymorphic functions in generic function bodies.
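Since this thread is about Swift: the language has no higher-rank function types, but the idea can be simulated with a protocol whose requirement is itself generic. A sketch with hypothetical names:

```swift
// Swift function values cannot themselves be generic, so a rank-2
// parameter ("give me a function that works for *every* T") is
// simulated by a protocol with a generic requirement.
protocol PolyFunction {
    func apply<T>(_ x: T) -> T
}

// The polymorphic identity function, forall a. a -> a.
struct Identity: PolyFunction {
    func apply<T>(_ x: T) -> T { return x }
}

// A rank-2-style consumer: it uses its argument polymorphically,
// at two different types, inside its own body.
func useAtTwoTypes(_ f: PolyFunction) -> (Int, String) {
    return (f.apply(1), f.apply("a"))
}
```

Calling `f.apply` at both `Int` and `String` inside one body is exactly what a plain generic closure parameter `(T) -> T` cannot do, since its `T` would be fixed by the caller.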



To add something that Alexis might be too modest to say: this term is fairly widely used among third-party developers, particularly those with even a mild interest in using protocols more pervasively. (He also didn't mention that it encompasses protocols with Self parameters—in other words, "PAT" is a user-invented term for "non-existential protocol".)

Third-party developers use the term "PAT" mainly to articulate their pain about the PAT restrictions and explain how to work around them. The main workarounds recommended are either writing type-erasing wrappers like the standard library's, or weakening the protocol's requirements enough to no longer require Self or associated types.
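A minimal version of such a hand-written wrapper, in the style of (but much simpler than) the standard library's AnySequence, might look like this (the Producer protocol is hypothetical):

```swift
// A PAT: under the restriction discussed here it cannot be used
// directly as a property or parameter type.
protocol Producer {
    associatedtype Output
    func produce() -> Output
}

// Hand-written type-erasing wrapper: the associated type becomes a
// generic parameter, and the conformance is captured in a closure.
struct AnyProducer<Output>: Producer {
    private let _produce: () -> Output
    init<P: Producer>(_ base: P) where P.Output == Output {
        self._produce = base.produce
    }
    func produce() -> Output { return _produce() }
}

struct One: Producer {
    func produce() -> Int { return 1 }
}
struct Doubler: Producer {
    let base: Int
    func produce() -> Int { return base * 2 }
}

// Different concrete producers can now share one storage type.
let mixed: [AnyProducer<Int>] = [AnyProducer(One()), AnyProducer(Doubler(base: 21))]
```

Every such wrapper is boilerplate of exactly this shape, which is why making them easier to write (or unnecessary) would be well received.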

Needless to say, "PAT" is a term associated with much grumbling. Users don't want these limitations codified; they want them lifted. If we can lift them, we should, and even partially lifting them will be well-received if it makes type-erasing wrappers or other workarounds easier to implement.


Even simple concepts like equality and hashing can be abused in fun ways and lots of framework authors rely on dictionaries and sets internally. Can we ever really avoid breaking existing code when adding new requirements? My money's on "no" but I am willing to be convinced otherwise.

Why would using a protocol as an existential require that "P does not conform to P"? Isn't that a limitation of the current implementation? It doesn't seem like a fundamentally intractable problem but :man_shrugging:

I would argue that points to a flaw in the language that should be solved, rather than artificially imposing pain on users because we think we know better. It certainly isn't a "pit of success" as it stands.

So many problems cry out to be solved with a protocol and associated types; precisely at that moment the gentle slope of learning becomes a mile-high wall of overhanging granite set before those foolish enough to attempt the climb. This pitch opens a small crack; the tiniest of footholds. It doesn't go nearly far enough.

Could you have reversed cause and effect? IMHO these concepts are impossible for mortal programmers to grasp because the bar for working with them is so high. This leaves the user with very little concrete to grasp and it is my theory that the brain is better suited to extracting abstract principles from concrete examples rather than the other way around.

Or more simply: because protocols with associated types present a usability brick wall, users avoid them and don't really understand how they work, or they use generics because the compiler said so but rarely develop a deeper understanding. They also tend to design around these limitations, and these fundamental design errors are far more difficult to fix than some accidental dynamic dispatch.

I'm also not sold on the idea that existentials must impose a performance cost; maybe static compilation makes some form of specialization for an existential (aka non-generic) function impossible, but is that true for all cases or just some? (If we were to admit a JIT à la JS, then it is definitely possible to provide dynamic specializations of a function, switched based on argument type.) I'm not saying Swift could, would, or should do this, just muttering questions out loud.

It might solve many of the cases where protocol methods return Self but that is only a subset (I'll admit I don't know how big that subset is in realistic codebases; maybe it is far larger than I think?).


Huge +1 from me.

I'm struggling to grasp all the language theory involved in this, so sorry for any misunderstandings, I'll have a try.

First, Joe brought up covariance and contravariance (feel free to point out a more suitable resource). An attempt at a layman's example:

// Covariance
class Base { func foo() {} }
class Derived: Base { func bar() {} }

let base: Base = Derived() // ok, since Derived is a subclass of Base, i.e. the assignment is covariant

// Contravariance
protocol Shape { var area: Double { get } }
struct Circle: Shape {
    var radius: Double
    var area: Double { return radius * radius * .pi }
}

// printer only knows about Shapes, not Circles
func printShape(shape: Shape) { print(shape.area) }

func looper(circles: [Circle], operation: (Circle) -> Void) {
  for circle in circles { operation(circle) }
}

let circles = [Circle(radius: 1), Circle(radius: 2)]
// can pass a function that only knows about Shapes to
// parameter *operation* that expects Circle type, 
// because of contravariance
looper(circles: circles, operation: printShape)  

So with that out of the way, the issue seems to be that

  • Regular (constraint only) protocols are both covariant and contravariant when used as existentials (i.e. instances of protocol type), whereas
  • some instantiations of self or associated type protocols (existentials) cannot provide those same guarantees.


I believe Dave when he says that “P does not conform to P” (i.e. issues with co- and contravariance) can never be fully fixed/solved for self or associated type protocols when used as existentials. However, even if we cannot get 100% there, I think there's huge opportunity for developers and huge value in using them even if they only work e.g. 50% of the time or e.g. 80% of the use cases. Cutting these possibilities out completely, just because 100% is not possible, feels like handcuffing the developers.


So by tweaking the language and how existentials work, the “P does not conform to P” issue could even go away?

In the above example, adding an associated type Area to protocol Shape, and then typealiasing Area to Double in the Circle struct, would be an example of the kind of evolution that a developer does to code, but it crashes into the PAT wall. And I don't see why that specific case could not be supported in the compiler in the future. Yes, it would be possible (in this particular case) to work around it with func printShape<T: Shape>(shape: T). But why should the user have to work through finding the correct incantations to please the compiler, when the compiler could just figure it out on its own? (EDIT: this exact case is ExistentialSpecializer, as pointed out by @Karl)


We already have type erasure in the language today, via Any (i.e. you could have a function parameter or stored property typed as Any, and dynamically check if it's a String or Int), and it doesn't seem to be a problem -- people very much like the ability of Swift to statically enforce proper typing of data. The point of using protocols as types is to reduce the need for dynamic checks or strong coupling by allowing more precise forms of erasure.
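For illustration, the Any-based erasure and dynamic checks mentioned here fit in a couple of lines:

```swift
// Any erases all static type information; recovering it requires a
// dynamic downcast at the point of use.
let items: [Any] = ["hello", 42, 3.14]
let strings = items.compactMap { $0 as? String }
let ints = items.compactMap { $0 as? Int }
```

A protocol existential is the more precise form of this: it erases the concrete type while still statically guaranteeing the protocol's interface.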

In other words: if you can better express and work with your types in a generic/existential context, you'll be able to do more at that level of abstraction and shouldn't need to downcast to concrete types as often.

As for compiler optimisations, there is no intrinsic reason an existential must be less optimisable than a generic parameter. For example, we have a (relatively new) ExistentialSpecializer optimisation pass which transforms functions of the form func f(_: P) to func f<T:P>(_: T), at which point our other generic specialisation infrastructure can take over. swift/ExistentialSpecializer.cpp at a4103126b309549166182744265991c9a4db2819 · apple/swift · GitHub
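A hand-written illustration of the rewrite that pass performs (hypothetical protocol and function names):

```swift
protocol HasArea {
    var area: Double { get }
}

struct Square: HasArea {
    let side: Double
    var area: Double { return side * side }
}

// Before: the parameter arrives as an existential box.
func areaErased(_ shape: HasArea) -> Double {
    return shape.area
}

// After: the same function in generic form, which the regular
// generic specializer can then optimize per concrete type.
func areaGeneric<T: HasArea>(_ shape: T) -> Double {
    return shape.area
}
```

Both forms accept exactly the same call sites, which is what makes the automatic transformation safe.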

Protocol existential self-conformance is a massive issue, though. I'll come back to this when I have some time to elaborate, but basically, I think it's a flaw in the design of the type-system. The type of an existential should be some kind of type-existential (i.e. "any type which conforms to P") rather than the protocol type P itself.

Opaque result types are a great grooming tool for API authors, but they wouldn't solve my most pressing need for beefed-up protocol existentials, which is that I sometimes need to store something which may be populated by a variety of different types (e.g. var myThing: Collection where Element == String). Generic parameters are awkward for this - if this was inside a struct, MyStruct<X> and MyStruct<Y> would have different layouts and could not be substituted.
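For the record, the closest thing available today for that pressing need is the standard library's type-erased AnyCollection, which gives the stored property a single fixed layout (Model is a hypothetical name):

```swift
// Today's workaround for the wished-for
// "var myThing: Collection where Element == String".
struct Model {
    var names: AnyCollection<String>
}

var model = Model(names: AnyCollection(["a", "b"]))
// The same stored property can later hold a collection of a
// different underlying concrete type.
model.names = AnyCollection(Set(["c"]))
```

The cost is that the wrapper must be written (or shipped) per protocol, which is exactly the boilerplate generalized existentials would remove.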


Yeah, passing an existential as an argument is isomorphic to passing a generic argument with its own type variable—anything that's possible with one representation is possible with the other. We've even discussed giving them the same underlying ABI.

In cases where you have a single existential value like this, it should be possible to automatically open the existential and pass along the value inside as an argument that's required to conform to the protocol, rather than requiring the existential itself to conform. Would that address your use case?

The trick is that protocols can only resiliently add new requirements, and if they do so, they need to provide defaults for existing conformances. An example when it comes to hashing would be the transition from hashValue to hash(into:)—we added the latter to support incremental hashing, but still provide a default implementation for types that only provided hashValue.
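That pattern, a new requirement added with a default written in terms of the old one, can be sketched with a hypothetical protocol (mirroring the hashValue-to-hash(into:) transition in shape only):

```swift
protocol Fingerprintable {
    var fingerprint: Int { get }                  // original requirement
    func combineFingerprint(into acc: inout Int)  // added later
}

extension Fingerprintable {
    // Default for conformances that predate the new requirement,
    // implemented in terms of the old one.
    func combineFingerprint(into acc: inout Int) {
        acc = acc &* 31 &+ fingerprint
    }
}

// Written against the original protocol; still compiles and
// automatically participates in the new incremental API.
struct Legacy: Fingerprintable {
    var fingerprint: Int { return 7 }
}
```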

Using an existential does not require anything about the existential type conforming to the protocol. It's true that, since protocols can resiliently add new members, an existential based on a public resilient protocol would not "self-conform" unless explicitly promised to do so, when we add that feature. However, that's not because of associated types specifically but because of contravariant self/associated type requirements. If a protocol requirement takes a value of Self type or of an associated type as an argument, then there isn't any automatic way to generalize from the implementations for individual types to an implementation that works dynamically with heterogeneous types.


I suppose it's too late for big ABI changes like that, but it would be cool if they were literally identical. The generics infrastructure seems far more fleshed-out (e.g. @_specialize would be cool for existentials, too).

The big win that we'd get from relaxing the discussed restrictions would be for heterogeneous storage. We need the existential boxing to give a uniform layout regardless of the contents, but if that heterogeneity involves PATs, currently we lose that information.

Opaque result types don't help with that problem - especially since the proposal involved them always resolving to a single type (e.g. func foo() -> opaque Sequence where Element == Int cannot return an Array<Int> on one path and a Set<Int> on another - just like you can't today).

Well, ABI only constrains the interface between separately-compiled, resilient components. We'll still have freedom to add optimizations within a binary. One nice optimization for returned existentials would be to be able to return them on-stack, avoiding heap allocation.