Lifting the "Self or associated type" constraint on existentials

Is this the same thing that causes the error "Protocol requirement cannot be satisfied by a non-final class because it uses 'Self' in a non-parameter, non-result type position"? I ask because I would dearly like to see that fixed! It's very inconvenient to require consumers of these kinds of protocols to mark all their classes as final.
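
For reference, here is a minimal sketch of the kind of requirement that produces this diagnostic. The names Wrapper, Wrappable and Counter are made up for illustration. 'Self' appears nested inside a generic type, which is neither a bare parameter nor the bare result, so (as far as I can tell) only a final class can satisfy the requirement:

```swift
// Hypothetical generic wrapper, used only to put 'Self' in a nested position.
struct Wrapper<Value> {
  let value: Value
}

// Hypothetical protocol: 'Self' occurs inside 'Wrapper<...>'.
protocol Wrappable {
  func wrapped() -> Wrapper<Self>
}

// Because the class is 'final', 'Self' can only ever be 'Counter',
// so the witness may spell the type out. Dropping 'final' should
// trigger the diagnostic quoted above.
final class Counter: Wrappable {
  let count: Int
  init(count: Int) { self.count = count }

  func wrapped() -> Wrapper<Counter> {
    return Wrapper(value: self)
  }
}

print(Counter(count: 3).wrapped().value.count)
```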

No, I think that's an inherent limitation required for the soundness of the type system.

IMO we should not “simply lift” this constraint without also addressing the concerns I've raised about “partial” protocol types:

I think the remedies I propose in that post would address all those concerns well enough to allow me to support going forward with this, but I'd want to have those remedies approved first.

I'm not sure we should make PAT-protocols usable as existentials at all unless all the associated types are specified; having only half an interface isn't that useful. (Note that you couldn't apply this to protocols with Self requirements, only associated types.)

I don't think that's a supportable argument. Collection has loads of useful methods even if you don't know its associated Index type.
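
As a rough illustration of that point, the sketch below uses only Collection API that never mentions Index. The helper name summarize is hypothetical, and it is written generically so that it compiles regardless of how existentials end up being treated:

```swift
// Hypothetical helper: 'count' and 'first' are usable without ever
// naming the collection's Index type.
func summarize<C: Collection>(_ items: C) -> String {
  guard let first = items.first else { return "empty" }
  return "\(items.count) item(s), starting with \(first)"
}

print(summarize([10, 20, 30]))
print(summarize("abc"))
print(summarize([String]()))
```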

This topic seems to have gained some traction across a couple of threads, so I felt like writing out my opinion on where I think existentials should go in Swift in one place.

I'd really like us to remove this restriction, but more broadly, I think we need to drastically cut ("focus") what you can do with existentials, deprecate certain uses, and revise the syntax.

Summary: Existentials have only 2 real use-cases: storage of an erased value, and inout function parameters which may assign the object to a different type. There is a reason both of these are related to type-flexible storage: boxing is a storage concept. Existentials are not generics, and they have no place offering pseudo-generic interfaces. As such, existentials as function input parameters should be deprecated, and replaced behind-the-scenes with generic type parameters. This naturally allows the use of protocol requirements with Self or associated types. The stripped-down existentials should be given a new name to more clearly say what they do.

EDIT: And one more use-case: intentional erasure to hide implementation details (i.e. function/property return types). Again, this only exists so clients know how to handle the value whose type and layout are unknown (e.g. do I need to retain it?); the value should be unboxed when you actually access its functionality.


What is an existential?

If I write a function which takes an existential argument:

func takesExistential(arg: MyProto)

Many developers might think this function is generic, but it isn't - its argument has a single, concrete type: namely, a box containing some other value that conforms to MyProto. That box knows how to forward some methods, but it's just a box - it doesn't (shouldn't) conform to MyProto. Conformance creates an "is a" relationship between two types: an Array "is a" Collection, but it would be conceptually inaccurate to say that the box "is a" MyProto, or the box "is a" Collection. It holds a thing that really does have an "is a" relation, but the box itself doesn't. A dog house isn't a dog.
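
One way to make the box visible is to compare memory layouts. This is a standalone sketch (it redefines MyProto and conforms Int to it purely for illustration); the exact sizes are platform-dependent, but the container should be larger than the Int it holds:

```swift
protocol MyProto {
  func doSomething()
}
// Illustrative conformance, not part of the example above.
extension Int: MyProto {
  func doSomething() {}
}

let plain: Int = 7
let boxed: MyProto = plain   // Swift wraps the Int in an existential container.

print(MemoryLayout<Int>.size)      // size of the bare value
print(MemoryLayout<MyProto>.size)  // size of the container (the "dog house")
print(type(of: boxed))             // dynamic type of what is inside the box
```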

The reason that developers might confuse this with a generic function is that Swift automatically boxes values for you. You can call this function and pass any conforming type as the argument, and there is no obvious difference between it and a bona fide generic function:

protocol MyProto {
  func doSomething()
}
extension Array: MyProto { func doSomething() { ... } }
extension String: MyProto { func doSomething() { ... } }

func takesExistential(arg: MyProto) {
  arg.doSomething() // Looks like generic code.
}

takesExistential(arg: [1, 2, 3])
takesExistential(arg: "smells like generic code")

To really see the difference, it helps to examine what happens if you declare the parameter as inout: suddenly, the compiler seems unable or unwilling to do the boxing for you:

func takesMutableExistential(arg: inout MyProto) {
  arg = "weren't expecting me, were you?"
}

var array = [1, 2, 3]
takesMutableExistential(arg: &array) // Error (no automatic boxing).

var boxedArray: MyProto = array // Manual boxing.
takesMutableExistential(arg: &boxedArray) // Works.

By making the parameter inout, I don't just allow calling mutating methods (like replaceSubrange) on the contained value - I actually make it possible to swap that underlying value with something of an entirely different type. I'm passing a mutable box, not just a mutable value.
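
The swap can be observed directly with type(of:). A small self-contained sketch, with Shape, Circle and Square as made-up names:

```swift
protocol Shape {
  var name: String { get }
}
struct Circle: Shape { let name = "circle" }
struct Square: Shape { let name = "square" }

// The parameter is a mutable box, so the callee may replace its contents
// with a value of a completely different concrete type.
func replaceContents(_ shape: inout Shape) {
  shape = Square()
}

var box: Shape = Circle()
print(type(of: box), box.name)   // the box currently holds a Circle
replaceContents(&box)
print(type(of: box), box.name)   // the same variable now holds a Square
```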

This illustrates a really important thing about existentials: when used as parameters, they only provide forwarding, not generics. Every time you interact with an existential, you are interacting with a box, not the contained object. This is the cause of all our current limitations on existentials.

The only time an existential is appropriate is when the type of its contained value is flexible. Whenever the existential is immutable, even if for a limited scope (such as when it is passed as a read-only argument to a function), its type is fixed, and we can unbox its underlying value as some local generic type.

Step 1: Automatically unbox existentials when calling generic functions

One of the major issues with existentials is "self-conformance": because the box doesn't conform to the protocol (it only looks that way because it can forward function calls), it can't satisfy generic constraints of the form <T: MyProto>.

func takesExistential(arg: MyProto) {
  takesGeneric(arg: arg) // Error: existential 'MyProto' does not conform to 'MyProto'
}

func takesGeneric<T: MyProto>(arg: T) {
}

IMO, the solution for this is not to introduce additional magic conformances for the box. Adding self-conformances means the existential box could be passed into generic functions directly (i.e. T.self == MyProto.self or something). This could mess up a lot of code which depends on a fixed set of conformances, as well as any code with fast-paths for known types (if T.self == Float.self, etc).

Instead, I believe we need to automatically unbox existentials to retrieve their contents, which do satisfy those constraints, and pass the type of whatever is inside the box as our T.

func takesExistential(arg: MyProto) {
  let unboxed: <T: MyProto> = unbox(arg)
  takesGeneric(arg: unboxed)
}

func takesGeneric<T: MyProto>(arg: T) {
}

My understanding is that this is doable: that at an ABI-level, unspecialised generic code really accepts a value pointer and witness table as parameters, and an existential is basically a value (or pointer to a value) plus a witness table.
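
The unbox(...) syntax above is hypothetical, but a manual version of the same idea can be sketched with a protocol extension: inside an extension method, Self is the concrete type stored in the box, so the call into generic code type-checks. The names Greeter, English and forwardToGeneric are invented for this example:

```swift
protocol Greeter {
  func greet() -> String
}
struct English: Greeter {
  func greet() -> String { return "Hello" }
}

func takesGeneric<T: Greeter>(arg: T) -> String {
  // T is the concrete type, not the box.
  return "\(T.self) says \(arg.greet())"
}

extension Greeter {
  // Calling this on an existential "opens" it: here 'Self' is whatever
  // concrete type the box contains, which does conform to Greeter.
  func forwardToGeneric() -> String {
    return takesGeneric(arg: self)
  }
}

func takesExistential(arg: Greeter) -> String {
  return arg.forwardToGeneric()
}

print(takesExistential(arg: English()))
```

This is only a workaround sketch; it has to be written once per generic function, which is exactly the boilerplate that automatic unboxing would remove.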

The real difficulty is how to propagate type information outside of takesExistential. The unboxed generic parameter <T> only exists within that function, and any generic types we create using that type, like an Array<T>, don't make sense to code on the outside:

func takesExistential(arg: MyProto) -> ??? {
  let unboxed: <T: MyProto> = unbox(arg)
  return takesGeneric(arg: unboxed) // What is this type? An Array of... some type.
}

func takesGeneric<T: MyProto>(arg: T) -> Array<T> {
  return [arg]
}

I believe that this kind of situation, where one wants to propagate a generic type out of a function, is roughly what opaque types are supposed to do. So takesExistential could return an Array<some MyProto> - that is to say, an Array whose Elements all have the same, unboxed type, but we don't know what that type is beyond that it conforms to MyProto. I'm not sure if opaque types as they are implemented today can capture that, but we need some way to express the type I just described (an unknown, unboxed type which satisfies some bunch of constraints).

That is infinitely better than our current solution of Array<MyProto>, whose Elements are actually individual existential boxes and thus may be different underlying types (in our example, where Array and String conform to MyProto, let mixed: Array<MyProto> = [[1], "two", [3], "four"] is valid). Constraining these would allow things like conditional conformances, including to Equatable, Hashable and Codable, etc. They will need their own kind of box so they can be passed around by the parent code, but it will be the entire Array<some MyProto> in the box, rather than the elements being boxes.
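
To make the contrast concrete, here is a standalone sketch with a made-up protocol Tagged. The array of boxes can mix types, while the generic version has one element type for the whole array and so picks up Array's conditional Equatable conformance:

```swift
protocol Tagged {
  var tag: String { get }
}
extension Int: Tagged { var tag: String { return "int" } }
extension String: Tagged { var tag: String { return "string" } }

// Every element is its own box, so the underlying types may differ.
let mixed: [Tagged] = [1, "two", 3, "four"]
print(mixed.map { $0.tag })

// One unboxed element type T for the whole array, so '==' is available.
func allEqual<T: Tagged & Equatable>(_ lhs: [T], _ rhs: [T]) -> Bool {
  return lhs == rhs
}

print(allEqual([1, 2, 3], [1, 2, 3]))
print(allEqual(["a"], ["b"]))
// allEqual(mixed, mixed) does not compile: the box type 'Tagged'
// is not itself Equatable.
```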

Ultimately what this does is remove existential function arguments as a substitute for generic code. They're no good at it; that's why real generics exist. If we had some way to unbox them, or unboxed them automatically, we would always have a way to access the full API of an erased value - even if we could only reason about type relationships within a limited scope.

Step 2: Deprecate existential function arguments, replace them with generic functions under the hood

Now that we have access to a value's full API through generics, we really don't have any need of read-only existential function parameters any more. Legacy code still has to be supported, but any future code like:

func takesExistential(arg: MyProto) {
}

should trigger a warning and be automatically duplicated as a generic function, which is what newer compilers would generate calls to. Since existentials automatically get unboxed, this should be a source-compatible change.

Step 3: New spelling for existentials

Existentials now only serve 3 remaining purposes:

  • Function parameters where the function may reassign the value to an instance of a different type.
  • Storage (e.g. in instance variables) of unknown types.
  • Intentional erasure to hide implementation details.

All of these are related to having type/layout-flexible storage, not to abstracting functionality, which I think is a good sign. We can introduce a new spelling to emphasise this - perhaps Any<MyProto>. These types would have no API; you can only create them and unbox their values. We might even consider making them invalid as read-only function parameters, since you'd have to unbox into a generic scope to use the value anyway.

Step 4: New spelling for generics

Our UI for generics could certainly use some improvement; the arguments are summarised well in "Improving the UI of generics". With existentials no longer providing their pseudo-generic interfaces, it becomes more important to improve our generics syntax.

I don’t know whether I agree with all that, but I’m glad you wrote it. That certainly sheds a new light on some design options and gives a lot to think about.

There are valid use cases for collections of existential boxes with different underlying types. Are you suggesting to stop supporting them, or are you speaking about Array<some MyProto> as an additive change?

Having a category of types whose instances exist, but are never supposed to be passed to functions, would make programming with those types almost impossible. I would find it very difficult to accept that level of non-uniformity in a language design.

I do agree that we should allow implicit opening of existentials when they're passed as unique generic arguments to a generic function; I agree that will likely cover the majority of places you might want "self-conformance". I don't think banning existentials as function arguments is necessary, or really possible to begin with, since existential types still can be bound to generic type arguments. To work with an Array<Protocol>, Array's generic methods need to be able to work in terms of the Protocol type without boxing and unboxing in the unspecialized case. If existentials have to be syntactically decorated, and we had some Protocol syntax for generic arguments in addition to opaque generic returns, I don't think you need to go out of your way to ban existential arguments, since someone writing foo(x: Protocol) would at that point need to decide whether they mean foo(x: some Protocol) or foo(x: any Protocol).
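
For readers on a recent toolchain, the distinction can be sketched as follows. This assumes Swift 5.7 or later, where (to my knowledge) both the 'some' and 'any' parameter spellings and implicit opening are available; Shape, Square and the function names are made up:

```swift
protocol Shape {
  func area() -> Double
}
struct Square: Shape {
  let side: Double
  func area() -> Double { return side * side }
}

// 'some': shorthand for an unnamed generic parameter; no box required.
func areaOpaque(_ shape: some Shape) -> Double {
  return shape.area()
}

// 'any': an explicit existential; the argument arrives in a box.
func areaBoxed(_ shape: any Shape) -> Double {
  return shape.area()
}

let boxed: any Shape = Square(side: 3)
print(areaOpaque(Square(side: 2)))
print(areaBoxed(boxed))
// Assumes implicit opening: the box is unwrapped to bind the generic parameter.
print(areaOpaque(boxed))
```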

I very much enjoyed reading what @Karl has to say here, but there's a larger issue with this sort of discussion I'd like to raise.

In this forum, when we talk about things like "box", "witness" and even "existential", we're actually discussing implementation details of the Swift compiler. These are not actual Swift language concepts.

If we can't explain parts of the Swift language (e.g. how protocols work) without appealing to implementation details, then there's something very wrong with our design — or at least our common ability to talk about our design. It leads us down the wrong path of coming up with explanations that we-who-dwell-in-the-forum accept, but which leave less-clued-in or newer users of Swift scratching their heads trying to understand how to get a grip on the language.

We either need to promote some of the implementation concepts to design concepts (which probably means giving them some recognizable syntactical form in the language itself), or adjust the design or documentation to be understandable without implementation details.

For example, "existential" might be something that needs to be promoted to a first class language concept. Other things like "box" and "witness" should probably stay as implementation details, and we should stop trying to explain things to people using them.

Yet other things, like the fact that Self-or-associated-type protocols are effectively different from unconstrained protocols, are probably controversial about whether they're design or implementation, but sorting this out would be a really good thing to do, IMO.

Existential types are described exactly as such in The Swift Programming Language, and are not an implementation detail:

Protocols as Types

Protocols don’t actually implement any functionality themselves. Nonetheless, you can use protocols as a fully fledged types in your code. Using a protocol as a type is sometimes called an existential type, which comes from the phrase “there exists a type T such that T conforms to the protocol”.

The term is also defined in Swift standard library documentation.

FWIW, this is (apparently) the entire discussion of "existential", and its implications are not entirely clear.

The first sentence (of the 3 you quoted) is false: protocols with default implementations do implement functionality.

The second sentence is inadequate: protocols (especially Self/associated-type) are hardly "fully fledged" compared to other types.

The third sentence introduces the term "existential" informally, then does nothing with it.

But the thrust of your point I agree with: existentials are probably something that we regard as part of the language, but I think we could be clearer about what that means. (If we have thread after thread here about existentials, one vaguely worded sentence in the documentation might not be enough for the real world.)

No protocol has default implementations in its definition. Some have default implementations extended to them. (Technically correct is the best kind of correct?)

This may be a distinction without a difference. 🙂

Default implementations are defined in extensions for (basically) historical reasons. There have been multiple threads about allowing default implementations in the original protocol declaration, but no one's come up with a good syntax for that yet.
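
For anyone following along, this is what the split looks like in practice. A minimal sketch with invented names (Named, User, Robot): the requirement lives in the protocol declaration, the default lives in an extension, and a conforming type may still override it:

```swift
protocol Named {
  var name: String { get }
  func greeting() -> String   // requirement only; no body allowed here
}

extension Named {
  // Default implementation, supplied separately from the declaration.
  func greeting() -> String {
    return "Hello, \(name)!"
  }
}

struct User: Named {
  let name: String            // picks up the default greeting()
}

struct Robot: Named {
  let name: String
  func greeting() -> String { // provides its own implementation
    return "BEEP \(name)"
  }
}

print(User(name: "Ada").greeting())
print(Robot(name: "R2").greeting())
```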

Conversely, if we extend a struct with new methods or properties, we don't say that the extensions are not really part of the struct.

Anyway, my point was a bit more hand-wavy: this aspect of the documentation is pretty light.

The first sentence is very much true. Protocols do not implement any functionality themselves. This is easy to see and a crucial point which distinguishes them from base classes: if a protocol has no conforming types, then whatever code you write to implement its requirements can never be invoked. On their own, protocols cannot offer any functionality.

The second sentence is very much adequate. Existential types are fully-fledged types just as much as any other non-nominal type (such as tuples). Like any other non-nominal type, they cannot be created using init() syntax or extended; none of this is unique to existential types.

The third sentence is not the end of the section; the remainder of the section demonstrates how to use an existential type in your code.

One could very reasonably argue that these limitations are what make existentials, tuples, and functions not “fully-fledged”, and that the documentation is wrong in this regard.

You won't catch me kicking my baby bird out of the nest before its (left: Wing, right: Wing) can conform to the Flyable protocol.

Protocol types are idiosyncratic in that the body of the declaration of the type does not describe the operations that can be performed on variables of that static type. This is pretty weird, and I'm not sure it's what most people expect when they read that protocols are fully fledged types, since no other nominal types work this way.

This idiosyncrasy is why I think it would be wise to allow a protocol to be used bare as a type only if its body contains nothing but definitions that can be used with all variables of that protocol type, i.e. protocols without associatedtype, contravariant Self, static or init requirements.

What I'm trying to achieve with that idea is to stop existentials being used to write pseudo-generic code, not to hinder real generic code which just happens to be passed a box. I'd be okay with softening that position and simply stripping such functions of their distinctive meaning. 😇 In other words, we would lower read-only existential arguments into unique generic parameters. inout parameters and return types would remain unchanged and continue to be passed around in boxes.

So the function:

func myFunction(_ x: MyProto, _ y: MyProto)

Would be exactly the same as writing:

func myFunction<T0, T1>(_ x: T0, _ y: T1) where T0: MyProto, T1: MyProto

For these functions, the use of any Self or associated-type requirements becomes a non-issue. The user has chosen not to define the names T0 or T1 in the first style, but they "exist", the compiler sees them, and they could be surfaced in the language as opaque types in lieu of a user-specified name.

This doesn't "ban" existentials being used as function inputs, but it entirely drops their old meaning and replaces it with something else. Instead of describing a function which accepts a box, it becomes a shorthand for a certain signature pattern of a real generic function. This doesn't remove our broader need to revisit the generics syntax; I see it as patching the "expected" meaning into the existing syntax.


inout parameters are kind of interesting. We can't promote function arguments of type inout MyProto because that would change their meaning in a way that breaks code -- they must stay as mutable boxes.

But they could be unboxed within the function's scope, meaning they could still access Self and associated types:

// What I write:

func myFunction(_ arg: inout RangeReplaceableCollection where Element == Int) {

  if arg.startIndex != arg.endIndex {
    arg.append(42)
  }
  if Bool.random() {
    arg = Array(0..<10)
  }
}

// What it gets lowered as:

// Note: not a generic type. This is still a box.
func myFunction(_ arg: inout RangeReplaceableCollection where Element == Int) {
  // Type of 'arg' fixed for any code here, allowing 'Self', assoc types.
  unboxMutable(&arg) { <T: RangeReplaceableCollection...>(unboxedArg: inout T) in
    if unboxedArg.startIndex != unboxedArg.endIndex {
      unboxedArg.append(42)
    }
  }
  if Bool.random() {
    arg = Array(0..<10) // Type may get reassigned later, though.
  }
  // Type of 'arg' similarly fixed for any code here, allowing 'Self', assoc types.
  // 'T' from the first scope is not meaningful here.
}

This still leaves us in a somewhat awkward state. Using the "existential spelling" to write your generic functions is sometimes a convenient shorthand, but it has a performance-degrading edge case for inout arguments, which must stay boxed (the inout existential is a little more capable than a generic parameter, since its contents can be reassigned to a different type, and it is no less capable). It's easy to get out of it by introducing an angle-bracketed generic type, but it's yet another thing developers need to know about.
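
The "angle-bracketed" escape hatch looks roughly like this in today's syntax. A sketch with a made-up function name; note that the generic version can mutate the collection in place but cannot swap it for a different type:

```swift
// Generic inout parameter: the caller fixes C, so no box is involved.
func appendIfNonEmpty<C: RangeReplaceableCollection>(_ items: inout C)
  where C.Element == Int {
  if items.startIndex != items.endIndex {
    items.append(42)
  }
  // items = Array(0..<10)  // would not compile: C is chosen by the caller
}

var numbers = [1, 2, 3]
appendIfNonEmpty(&numbers)
print(numbers)

var empty: [Int] = []
appendIfNonEmpty(&empty)
print(empty)
```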

That's why I say we should:

  • Introduce a new spelling for the times you really want boxing (I'd even go for var x: boxed Collection where...).
  • Migrate both old generics and existential-spelling generics to a new, more concise syntax.
  • Deprecate the old existential syntax.

I don't think this should introduce any major problems (beyond annoying people that the syntax has changed).

It would not be a very useful definition of “fully fledged” in this context to make it mean that it has every feature we could reasonably expect.

Structs and enums can’t have custom subtyping relationships like classes can, even though they’re essential to the usability of types like Optional. Perhaps structs and enums aren’t fully fledged types either?

Classes can’t inherit from multiple superclasses or declare abstract requirements, unlike what’s possible when working with protocol-oriented programming. They can’t even override singly inherited methods from superclasses that rely on default implementations from a protocol. Perhaps classes aren’t fully fledged types either?

Yikes, all of a sudden, Swift has no fully fledged types. This is not a useful description for the didactic purposes of TSPL.

The meaning of the statement can only be this in context: Take all the types you’ve learned about up to this point in the textbook. Existential types can be used in all scenarios where you could use any of those other types.
