Lifting the "Self or associated type" constraint on existentials

Torust · November 25, 2018, 11:35pm

I'm not convinced that this is as bad an option as you see it to be. In my eyes, it moves any errors or issues closer to the places where they're relevant and therefore easily explained. For example, rather than being presented with a message that a protocol can't be used as an existential, you're instead presented with a message that a particular method can't be called since the type is unknown (or else is erased to Any where possible). That then naturally leads to imposing more specific constraints (i.e. the <T : Protocol> syntax) to get to a point where that information is available.

Fundamentally, this proposal just shifts the error closer to the actual problem (i.e. you don't know what the type is) while enabling use-cases that currently require cumbersome workarounds.

xwu · November 26, 2018, 2:09am

I would agree with @dabrahams that such an eventuality would be strictly, and significantly, worse than the status quo for the reasons below. This is why I said upthread that lifting the "Self or associated type" restriction should happen only in tandem with significantly improved diagnostics that allow users to avoid the scenario you outline above.

Certainly there are some "use cases that currently require cumbersome workarounds" that would be enabled by lifting the restriction, and I would very much like to be able to enjoy that functionality. However, @dabrahams outlines above why lifting the "Self or associated type" restriction won't actually enable or enhance a large portion of use cases that people have mentioned even in this thread, such as an existential collection type replacing AnyCollection.

What we often see in the "Using Swift" portion of these forums is that users reach for existential types when they should be using generic constraints--"should" not merely for performance reasons, but because they truly do not need or intend for any type erasure and often do intend to access APIs that require the type relationships being erased. That they run into the "Self or associated type" restriction now and would run into the "method can't be called" issue in the future is not the actual problem but only a symptom of that problem (i.e., using existential types instead of generic constraints).
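A minimal sketch of the distinction, assuming a hypothetical `Shape` protocol and two conforming types (none of these names come from the thread). The existential version erases the concrete type of its arguments and result, while the generic version keeps it:

```swift
// Hypothetical protocol and types, for illustration only.
protocol Shape {
    var area: Double { get }
}

struct Square: Shape {
    var side: Double
    var area: Double { return side * side }
}

struct Circle: Shape {
    var radius: Double
    var area: Double { return Double.pi * radius * radius }
}

// Existential parameters: each argument may be a different conforming type,
// and the result is only known to be "some value conforming to Shape".
func largerErased(_ a: Shape, _ b: Shape) -> Shape {
    return a.area >= b.area ? a : b
}

// Generic constraint: both arguments and the result share one concrete type T.
func larger<T: Shape>(_ a: T, _ b: T) -> T {
    return a.area >= b.area ? a : b
}

let square = larger(Square(side: 2), Square(side: 3))
let sideOfLarger = square.side   // Square-specific API is still available

let erased = largerErased(Square(side: 3), Circle(radius: 1))
let areaOfErased = erased.area   // only Shape's own API is available here
```

Asking for `erased.side` would not compile, because the static type of `erased` is `Shape`, not `Square`.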

Today, users are told upfront of this fact if they are dealing with a protocol with Self or associated type constraints. Without the "Self or associated type" restriction, then, more uses of existential types by the typical user would fall into the category of problems that would be best served by features other than existential types. This becomes even more so the case if/when opaque types and other enhancements are added to the language. Given the limited extent to which intentional use cases would actually be enabled by lifting this restriction, one must be careful that it's not outweighed dramatically by the extent to which unintentional use cases would be encouraged--and "unintentional" here referring not to the intentions of language designers but to the intentions of the user who actually does not want or might not even know about the type erasure that's going on.

One component of solving this problem might be to change the spelling so that Any (or, for reasons that will become apparent below, I'll use an alternative strawman syntax Existential) rather than P is the existential type. The goal here is to reduce as much as possible the scenario where users reach for existential types without even realizing that they are doing so. I have to admit that, even after years of working with the language, I still catch myself sometimes unintentionally using an existential type when I meant to have a generic constraint!

A spelling such as Existential neatly avoids the baffling situation that "P does not conform to P," since even on visual inspection it's clear that Foo<Bar> has no reason to conform automatically to Bar.

I'd imagine it could then be possible for authors to conform Existential to P by manually implementing the necessary methods in an extension (i.e., extension Existential where Protocol == P). (If the existential type were to be spelled Any, then extension Any where Protocol == P would naturally prompt the question of whether one can extend Any without constraints, which is a different topic altogether best avoided here.)

John_McCall · November 26, 2018, 2:40am

An initializer or static method requirement doesn’t technically preclude a protocol type from self-conforming as long as there’s at least one conforming type: the protocol could just pick that type and construct it / call the method on it. But if that type isn’t unique (which is reasonable to assume a priori), picking one type in specific would be an arbitrary choice, so as a policy matter it doesn’t make sense to allow it. So sure, maybe with an annotation it could be done if there’s really a reasonable default that wouldn’t cause more confusion than it saved.

Don’t think about it in terms of a function that takes an actual value of the protocol type. Think of a generic function over T: P. What actually happens if you use a particular requirement when T is dynamically the protocol type P itself?
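To make the question concrete, here is a small sketch assuming a hypothetical `Defaultable` protocol with an `init()` requirement. The generic function is only meaningful when `T` is a concrete type that can actually be constructed:

```swift
// Hypothetical protocol with an initializer requirement.
protocol Defaultable {
    init()
}

struct Counter: Defaultable {
    var value = 0
    init() {}
}

// Generic over T: Defaultable; the body relies on being able to construct a T.
func makeDefault<T: Defaultable>(_ type: T.Type) -> T {
    return T()
}

let counter = makeDefault(Counter.self)   // T is Counter, so T() is Counter()

// makeDefault(Defaultable.self)
// The compiler rejects this call. If T were the protocol type itself,
// T() would have no concrete type to construct.
```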

anandabits · November 26, 2018, 2:40am

I haven't had time to properly digest and contemplate the argument @dabrahams is making yet so nothing I say here should be considered as a direct response to that. However I do want to point out now that the statement above is simply not true.

Lifting the restriction would significantly simplify designs that store type-erased values and use various dispatching strategies to interact with the existential. The current workarounds I'm aware of rely on introducing an additional protocol which can be used as an existential and dispatching through that. Lifting the restriction would allow storage, casting and dispatching to happen directly on the PAT protocol itself which would streamline designs significantly. I have kept this pitch in mind since it began and have already run into several use cases where it would be extremely handy.
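One possible shape of the "additional protocol" workaround, as a sketch only. The names `Handler`, `ErasedHandler`, and `handleErased` are hypothetical and are not taken from any post in this thread:

```swift
// A protocol with an associated type (a "PAT"); at the time of this thread
// it cannot be used as the element type of a heterogeneous array.
protocol Handler {
    associatedtype Input
    func handle(_ input: Input) -> String
}

// Hypothetical helper protocol with no Self or associated type requirements,
// so it can be stored and passed around as an existential.
protocol ErasedHandler {
    func handleErased(_ input: Any) -> String?
}

extension Handler {
    // Dispatch through a dynamic cast: returns nil when the input has the wrong type.
    func handleErased(_ input: Any) -> String? {
        guard let typed = input as? Input else { return nil }
        return handle(typed)
    }
}

struct IntHandler: Handler, ErasedHandler {
    func handle(_ input: Int) -> String { return "int \(input)" }
}

struct StringHandler: Handler, ErasedHandler {
    func handle(_ input: String) -> String { return "string \(input)" }
}

// Storage goes through the helper protocol rather than through Handler itself.
let handlers: [ErasedHandler] = [IntHandler(), StringHandler()]
let results = handlers.compactMap { $0.handleErased(42) }
```

Only `IntHandler` accepts the integer input, so `results` contains a single string.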

I don't have an opinion on this syntactic change yet but I don't buy the argument that it will reduce accidental use of existentials. The reason users often reach for existentials is because many programmers are most familiar with Objective-C protocols or interfaces from other object-oriented languages. They will reach for a tool that feels familiar in this way regardless of the syntax used to invoke that tool.

Allowing extensions on the existential would be a really useful way of allowing existentials to conform to protocols (including their defining protocol). On the other hand, allowing extensions on existentials could introduce significant confusion between those and protocol extensions. I think we need to study the use cases and consider alternative solutions closely before heading too far down that path.

xwu · November 26, 2018, 3:47am

Yes, indeed, I too am very excited about the fact that lifting this restriction would significantly simplify designs that store type-erased values. However, that does not change the fact that many uses discussed above do not fall into the category of things that would be simplified by lifting the "Self or associated type" restriction. Just quickly scrolling through some of the items that people mentioned here:

@karim mentioned Equatable conformance for existential types: lifting the restriction would not allow that
@dmcyk mentioned not having to create type-erased boxes: lifting the restriction would not allow that (for reasons @dabrahams outlines above)
@Karl mentioned replacing AnyHashable with a thin wrapper around Hashable: lifting the restriction would not allow that
@rbishop mentioned self-conformance for existential types: lifting the restriction would not allow that

dabrahams · November 26, 2018, 4:40am

That's a nice positive story to tell ourselves about the potential outcome, and I might even be inclined to believe it if we had a list of actual use-cases that would demonstrably become much nicer (hint, hint).

The problem is, if it just (as you say) “shifts the error”, then it necessarily doesn't remove the fundamental source of confusion around protocols and their existential types. If what @xwu says is true, that most people run into this wall in places where even a generalized existential would be a poor fit for the use case, then the wall—as frustrating as it might be—is actually performing a valuable service.

dabrahams · November 26, 2018, 5:18am

Although @John_McCall described a special case where the init() can reasonably be satisfied, and ways in which it can usually be satisfied arbitrarily (but IMO unreasonably), in the general case, IIUC when no types conform to the protocol, it can't be satisfied.

Anyway, although init() is simple to understand it might not be as much of a killer example as func f(_: A) -> A.

If your example would be enhanced by this proposal it would be very instructive to see how the code could be improved were the proposal accepted. I note, however, that the technique you showed is generally useful even where there are no existentials; I use it that way to deal with heterogeneous “collections” of similar items without type erasure, and I'm pretty certain my use case would see no benefit from generalizing existentials. I point that out because I think if it can be much better there's probably a more general feature that would benefit both of us.

If we are to believe all the grumbling we've heard about angle brackets, changing the syntax could easily be enough to make existentials not “feel familiar.” But doing that alone strikes me as a strictly punitive approach that I'd like to avoid. I would like to also address the fundamental confusion, increasing expressivity for protocol authors and comprehensibility and predictability for protocol users.

gwendal.roue · November 26, 2018, 6:32am

I'm keen to agree with @dabrahams here. The way I understand it, shifting the error can be worse than the status quo because the amount of refactoring that has to be performed in order to remove the error can be dramatically bigger.

dmcyk · November 26, 2018, 7:15am

Lifting the constraint indeed wouldn't help to avoid creating type-erased boxes, but it could somewhat simplify them when working with members that don't use Self or associated types.
For example (primarily so that I can mock things), I often write wrappers like this:

protocol Foo {
  associatedtype Bar

  var x: Int { get }
  var y: Int { get }
}

struct AnyFoo<T>: Foo {
  typealias Bar = T
  private let _getX: () -> Int
  private let _getY: () -> Int

  var x: Int { return _getX() }
  var y: Int { return _getY() }

  init<K: Foo>(_ val: K) where K.Bar == T {
    self._getX = { return val.x }
    self._getY = { return val.y }
  }
}

Being able to use simple type members would decrease the memory footprint of such a wrapper and simplify the code.

michelf · November 26, 2018, 11:58am

Technically, is there a reason func f(_: A) -> A couldn't be mapped to func f(_: Any) -> Any with a runtime trap if the argument isn't of the expected type? By itself this isn't a very satisfactory solution, but I think it's close to what people would expect to happen. And then maybe there's a way to improve on that by making the runtime trap clearly visible in the code, as in ex.f(a as! ex.A) or something like that.
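The mapping described here can be approximated by hand today. The sketch below is an assumption about what such a mapping might look like, not a description of any compiler feature; `Transformer`, `Doubler`, and `ErasedTransformer` are hypothetical names:

```swift
// A protocol whose requirement uses the associated type A in both
// parameter and result position, like f(_: A) -> A in the discussion.
protocol Transformer {
    associatedtype A
    func f(_ value: A) -> A
}

struct Doubler: Transformer {
    func f(_ value: Int) -> Int { return value * 2 }
}

// Hand-written erasure: f(_: A) -> A becomes f(_: Any) -> Any.
struct ErasedTransformer {
    private let _f: (Any) -> Any

    init<T: Transformer>(_ base: T) {
        _f = { value in
            // The forced cast traps at run time if value is not a T.A.
            return base.f(value as! T.A)
        }
    }

    func f(_ value: Any) -> Any { return _f(value) }
}

let ex = ErasedTransformer(Doubler())
let result = ex.f(21) as! Int   // the caller must cast the result back as well
```

Calling `ex.f("twenty-one")` would compile and then trap inside the closure, which is the loss of static safety being weighed in this exchange.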

Joe_Groff · November 26, 2018, 4:12pm

dabrahams:

I loathe writing type-erasing wrappers as much as the next guy. But generalized existentials don't eliminate the need to write them, nor (AFAICS) do they make it much easier to do so. An existential Collection type would not remotely replace AnyCollection, because that type would not conform to Collection. In fact, most basic things we expect from a Collection would not be available: you can't index it, and first() could at best return Any?.

Nobody has proposed a set of features really targeted at the pain points you cite; that would be a really interesting exploration (I suspect the answer has at least as much to do with solving what I call “the API forwarding problem” as it does with existentials). Even if what is being proposed makes incremental progress toward solving those problems, though, I fear that in the specific place we arrive, having applied that increment, we'll have done more harm than good as explained in my opening message.
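For comparison with the quoted point, a short sketch of what the existing `AnyCollection` wrapper provides. Because it is generic over its element type and itself conforms to `Collection`, indexing and `first` keep their static types; the helper `total` is a hypothetical name:

```swift
let numbers = [10, 20, 30]
let erased = AnyCollection(numbers)   // AnyCollection<Int>

// first is typed as Int?, not Any?.
let first: Int? = erased.first

// Indexing works through the wrapper's own index type (AnyIndex).
let second = erased[erased.index(after: erased.startIndex)]

// The wrapper conforms to Collection, so it can be passed to generic code.
func total<C: Collection>(_ values: C) -> Int where C.Element == Int {
    return values.reduce(0, +)
}
let sum = total(erased)
```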

Although it's true that these fundamental design challenges exist, associated types are the wrong thing to blame for them; it's contravariant requirements that pose these challenges. It seems like a major problem to me that the existing restriction misplaces the blame for the complications. I'm all for deemphasizing type erasure, I absolutely agree that there are better alternatives in most situations and it was a mistake to spell existential types as the bare protocol name. Beyond making incremental progress toward the goal of generalized existentials, I'm more concerned that the state we're in now is actively harmful, and it's also threatening to damage the language design in other areas, such as protocol resilience, if we choose to stay where we are.

Joe_Groff · November 26, 2018, 4:19pm

IMO, the general thing to support for existentials for which self-conformance is desirable, but there isn't a natural covariant generalization for the protocol requirements, is to allow the existential type to be extended with an explicit conformance. This would be necessary anyway for resilient public protocols that want self-conforming existentials since they would need to promise that they will remain self-conforming if they add new requirements. Then you could think of a covariant existential's conforming as being an automatic derivation rule, while still allowing the automatic derivation to be overridden. For example, for Hashable:

// Straw syntax `Any<P>` for the explicit existential type
extension Any<Hashable>: Hashable {
  static func ==(l: Any<Hashable>, r: Any<Hashable>) -> Bool {
    return AnyHashable(l) == AnyHashable(r)
  }

  func hash(into: inout Hasher) {
    AnyHashable(self).hash(into: &into)
  }
}
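The straw-syntax conformance above forwards to `AnyHashable`, the standard library's existing type-erased wrapper. As a point of reference, this sketch shows how that wrapper already compares and hashes values of different underlying types:

```swift
let one = AnyHashable(1)
let alsoOne = AnyHashable(1)
let text = AnyHashable("one")

// Equal when the wrapped values have the same type and compare equal.
let sameValue = (one == alsoOne)

// Wrapped values of unrelated types compare unequal instead of failing to compile.
let differentTypes = (one == text)

// Because AnyHashable is itself Hashable, heterogeneous values can share a Set.
let unique = Set([one, alsoOne, text])
let uniqueCount = unique.count
```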

However, I think the more commonly useful thing would be to allow for implicit opening of existentials when passed as generic arguments; that strikes me as more likely to be what you mean when passing a single existential value into a generic function, and would be more efficient as well.
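Implicit opening is not something the compilers available at the time of this thread do, but a similar effect can be had by hand through a protocol extension. The sketch below assumes hypothetical names (`Greeter`, `English`, `staticType(of:)`, `openedStaticType()`):

```swift
protocol Greeter {
    var greeting: String { get }
}

struct English: Greeter {
    var greeting: String { return "Hello" }
}

// A generic function: T is bound to a concrete conforming type.
func staticType<T: Greeter>(of value: T) -> Any.Type {
    return T.self
}

extension Greeter {
    // Inside a protocol extension, Self is the dynamic type of the value,
    // so this call hands the underlying concrete value to the generic function.
    func openedStaticType() -> Any.Type {
        return staticType(of: self)
    }
}

let greeter: Greeter = English()

// At the time of this thread, staticType(of: greeter) is rejected because
// the existential type Greeter does not conform to Greeter.
let opened = greeter.openedStaticType()
```

Here `opened` is `English.self`: the generic function sees the concrete type rather than the existential.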

Moximillian · November 26, 2018, 5:00pm

Has there been any consideration of allowing a subset of protocol-as-existential parameter usage for those use cases where it would be possible to produce a transform to generics (along the same lines as ExistentialSpecializer)?

If existentials have all the problems and generics don't, then as long as the syntax can be transformed to generics, there's no problem, right? So, for example, associated types or Self could be used whenever a transform is available: func f(_: P) would work right away, but func f(_: P, _: P) would produce an error like "Two or more parameters using protocol existentials is not supported, use generics instead" and then maybe a fix-it to point in the right direction.

Joe_Groff · November 26, 2018, 5:03pm

There's nothing wrong with writing f(_: P, _: P) though; that's like writing f<T: P, U: P>(_: T, _: U). There's unlikely to be much additional optimization to be had from constraining both parameters to the same type if the arguments being the same type doesn't matter to the function implementation.

dabrahams · November 26, 2018, 5:32pm

Just to be clear, I meant in the case where being the same type does matter, the version with existentials will have to pay for the dynamic check (and also compromise static type safety) while the generic one will not. That's the cost of prematurely erasing type information.
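A sketch of that cost, assuming a hypothetical `Vehicle` protocol. When the operation depends on both values having the same type, the existential version needs a dynamic cast, while the generic version states the requirement in its signature:

```swift
protocol Vehicle {
    var wheels: Int { get }
    func isEqual(to other: Vehicle) -> Bool
}

extension Vehicle where Self: Equatable {
    func isEqual(to other: Vehicle) -> Bool {
        // Dynamic check: the types are only known to match at run time.
        guard let other = other as? Self else { return false }
        return self == other
    }
}

struct Car: Vehicle, Equatable { var wheels = 4 }
struct Bike: Vehicle, Equatable { var wheels = 2 }

// Existential version: any two vehicles are accepted, mismatches surface at run time.
func matchErased(_ a: Vehicle, _ b: Vehicle) -> Bool {
    return a.isEqual(to: b)
}

// Generic version: both arguments must have the same type T, checked at compile time.
func match<T: Vehicle & Equatable>(_ a: T, _ b: T) -> Bool {
    return a == b
}

let sameErased = matchErased(Car(), Car())
let mixedErased = matchErased(Car(), Bike())   // compiles, answers false at run time
let sameGeneric = match(Car(), Car())
// match(Car(), Bike()) does not compile: the two arguments have different types.
```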

Joe_Groff · November 26, 2018, 5:34pm

As far as the diagnostic quality issues @xwu and others brought up, I think we could at least do the following:

The historic code for dealing with existentials simply elided unusable requirements from name lookup on the existential type. Now that we have the availability checking infrastructure, we could instead treat these members as unavailable, so that you get a sensible diagnostic as to why the method can't be used.
For the common case where someone wrote an existential as an argument, we could offer a fixit to turn it into a generic parameter as part of the availability diagnostic.

Joe_Groff · November 26, 2018, 5:44pm

It should be obvious that that's the case, though, since you won't be able to just use any operation that relies on the types being the same without some sort of check. The lowest-energy path ought to be making the types the same in the signature if that's what's desired.

dabrahams · November 26, 2018, 5:56pm

I 100% agree that associated types are not “to blame”—that's why I've been saying we shouldn't merely lift the restriction without imposing some others. One of the major problems with existentials is that you can easily fall off a cliff where the existential type no longer exists/has an API matching the declared protocol. Although you've proposed to remove the first problem, you'd make the second problem worse, and I think it's easily as bad as the first problem. If one had to explicitly declare:

1. whether the existential should conform to the protocol
2. whether the existential's API must match that of the protocol
3. in the case the existential's API needn't match that of the protocol, which parts of the protocol's API should be available on the existential

and the compiler would forbid requirements and extensions that violate those declarations, then we'd have a sane world where “blame” was always correctly attributed.

In at least the second and third cases, I think to avoid confusion the existential should not have the same spelling as the unadorned protocol name.

Joe_Groff · November 26, 2018, 6:09pm

I don't think this makes anything worse regarding any of your three points. Right now, no existential conforms to its protocol (discounting AnyObject or some @objc protocols), and that doesn't change. To be resilient, it will have to be opt-in when we do support it, addressing #1. #2 seems like the same point as #1 to me; maybe you can clarify the distinction to me. As for #3, even with the existing restriction, thanks to protocol extensions, it's already the case that an existential's API can diverge from the conforming type's. In the fullness of time, when we support opening existentials, I also think that distinction will disappear—fundamentally, the only operation an existential supports is opening it and manipulating the underlying dynamic value; once the value is opened, the entire protocol API is available relative to the dynamic type of that value. Like I said, I'm all for making existentials more explicit.
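The divergence caused by protocol extensions can be seen in a small sketch, assuming a hypothetical `Named` protocol. A method that exists only in a protocol extension is not a requirement, so a call through the existential does not reach the conforming type's own method of the same name:

```swift
protocol Named {
    var name: String { get }
}

extension Named {
    // Not a protocol requirement: calls on a value of type Named are
    // statically dispatched to this implementation.
    func describe() -> String { return "Named(\(name))" }
}

struct User: Named {
    let name: String
    // The concrete type's own method; it shadows, but does not override, the extension method.
    func describe() -> String { return "User(\(name))" }
}

let user = User(name: "Ann")
let existential: Named = user

let direct = user.describe()                  // "User(Ann)"
let viaExistential = existential.describe()   // "Named(Ann)"
```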

anandabits · November 26, 2018, 6:27pm

I appreciate the argument you’re making and am still thinking about it, but...

In a language with fully generalized existentials and protocol extensions in any module it seems like declaring this fully would get complex and verbose pretty quickly. Remember that there is not just one existential, but for many protocols an arbitrary number of them with different constraints.

This approach also seems likely to lead to unnecessarily limited existentials - APIs that are perfectly valid to expose given the constraints on the existential that are not exposed because the declaration making them available is not present (and depending on details, maybe cannot be added retroactively by a 3rd party).

Overall, my instinct is that while this model might be more explicit about what is going on I think it is unlikely to increase clarity. It introduces distinctions that add complexity to the semantics as well as the surface of the language. I’m not sure this would be a net gain for anyone and could be a net loss, especially if it makes the language feel verbose or finicky.

It seems to me that there is an analogy with type inference here. It can be confusing at times but overall it is a huge net win to let the compiler do some work for us. Clarity is usually improved by relying on inference. I think the same is likely to be true in the case of inferring the API that is available on an existential, at least in most cases.

However, it is worth noticing that in the case of type inference we always have the option to be explicit by adding type annotations. I wonder if a similar model of optional annotations might be useful to add clarity of intent as well as get more localized and precise feedback from the compiler when that intent doesn’t type check.