Generalized opaque and existential type constraints

Okay, I was just thinking in terms of function declarations, but this helped me understand that there's a larger picture here. If I understand correctly, there's a desire to build a syntax for declaring constraints on opaque and existential types throughout the language. To check that understanding, I'd like to sketch out a few scenarios.

(I'll use the some Collection C syntax as a strawman for now, acknowledging that arguments have been made against it, but I need to use some syntax to illustrate the ideas.)

// Given this:
protocol Strawman {
  associatedtype Input
}

We need a syntax that can be used in all of the following scenarios:

  • Function parameters

    func example1(_ x: some Strawman A)
      where A.Input == Int
    
    func example2(_ x: any Strawman A)
      where A.Input == Int
    
  • Return types

    func example3() -> some Strawman A  
      where A.Input == Int
    
    func example4() -> any Strawman A  
      where A.Input == Int
    
  • Stored properties

    struct Example5 {
      var x: any Strawman A where A.Input == Int
    }
    
    struct Example6 {
      var x: some Strawman A where A.Input == Int
    }
    

    (There was some question whether supporting Example6 is a desirable feature; see Holly's post here. I'm just showing how it might be spelled if we did want to support it.)

  • Structural return types

    func example7() -> (some Strawman A, some Strawman B)
      where A.Input == B.Input 
    
    func example8() -> (any Strawman A, any Strawman B)
      where A.Input == B.Input
    
  • Are there other cases I missed?

As mentioned before, the goal of any syntax proposed here is not to replace the existing <T> syntax; rather, we're just trying to find a syntax that allows us to put constraints on some and any types. Some of the above examples could be written today using regular generics (specifically, examples 1 and maybe 6), but most of the above are currently inexpressible. Likewise, it is not a requirement that everything you can do with generics be expressible via constraints on opaque types.
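
To make the "regular generics" point concrete, here is a minimal sketch of how example1 (and an approximation of Example6) can be spelled today with a named generic parameter. The `input` requirement, the `IntBox` type, and the `Int` return value are hypothetical additions for illustration, not part of the strawman above.

```swift
// Same shape as the thread's Strawman protocol. The `input` requirement is a
// hypothetical addition so the code below has something to read.
protocol Strawman {
  associatedtype Input
  var input: Input { get }
}

// Hypothetical conforming type, for illustration only.
struct IntBox: Strawman {
  let input: Int
}

// example1 with a named generic parameter instead of `some Strawman A`.
// The return value is an assumption so the result can be observed.
func example1<A: Strawman>(_ x: A) -> Int where A.Input == Int {
  return x.input
}

// Example6 as a generic struct. The user of the struct chooses A here,
// so this only approximates an opaque stored property.
struct Example6<A: Strawman> where A.Input == Int {
  var x: A
}

let box = IntBox(input: 42)
let viaFunction = example1(box)
let viaStruct = Example6(x: box).x.input
```

The existential variants (examples 2, 4, 5, and 8) have no equivalent spelling in this style, which is the gap the strawman syntax is meant to illustrate.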

I'll also note that the "primary associated types" proposal could simplify many of the examples here, but does not replace the need for a general syntax, because a protocol can have many associated types that are not primary associated types.


With all of that written out, I'd like some clarification. Many of the suggestions earlier in this thread talk only about opaque result types, but those don't seem like they'd work with existential types. For example, using the "angle brackets after the arrow" syntax, how would you write something that returns a constrained existential type, like example4 above?

func whatDoWeDoHere() -> <T> any Strawman where ???

Likewise, what would a constrained existential member look like, especially in the case where there may be a naming conflict with an external struct?

struct S<Input> {
  var x: any Strawman A where A.Input == Input

  // If we don't support naming `any Strawman`, what can you
  // reference in a where clause?
  var y: any Strawman where ???
}

It's not clear to me that any other syntax has been proposed which can handle all the use cases suggested above.


I don't think it's possible, or even desirable, to define a single syntax that could work both in a context where the where clause clearly refers to, and only to, a specific parameter, and in a context where where introduces a list of constraints that refer to several parameters at once.

Your examples 5 and 6 show a case where the where clause unambiguously refers to a specific parameter, so there's no real need to name it:

struct AltExample5 {
  var x: any Strawman where Input == Int
}

struct AltExample6 {
  var x: some Strawman where Input == Int
}

This breaks once we add structure, for example with tuples or some other type with generic parameters:

struct AltAltExample5 {
  var x: (any Strawman, any Foo) where Input == Int // ambiguous
}

struct AltAltExample6 {
  var x: Result<some Strawman, some Foo> where Input == Int  // ambiguous
}

To me, this means that we could introduce named parameters only when needed, that is, to remove ambiguity when it arises. The compiler could help here, with an error that clearly suggests introducing named parameters.

In the case of functions, the situation seems very similar; the only difference is that with more parameters, ambiguity is simply more likely. In theory, your first two examples could work without named parameters:

func example1(_ x: some Strawman)
  where Input == Int

func example2(_ x: any Strawman)
  where Input == Int

But once we add more parameters or return types, either because we have both input and output generics or because of "structural" opaque return types, the only way to resolve the ambiguity would be to name them, unless we attach a where clause to every single parameter for which additional constraints are declared.

But naming parameters is really only needed if we want to:

  • put together constraints for multiple type parameters in a single place;
  • cross-reference type parameters when declaring constraints.

If we settle for a smaller, simpler goal, and leave "total generality" to explicit declaration of type parameters in angle brackets, we can think about ways to attach additional constraints to some and any type parameters directly, without a detached where clause.

For example, some Strawman<.Input == Int> has been proposed several times, but I'm not a fan of it for reasons I laid out above. some Strawman(.Input == Int) could be interesting, as could:

  • some Strawman(Input == Int);
  • some Strawman(where Input == Int);
  • (some Strawman where Input == Int).

Sure, if a name is not needed, it could perhaps be elided. I'll note, though, that even when there's only a single some/any, the where clause could still be ambiguous if it could reference an enclosing context:

struct ChemistryExample<Element> {
  // Is 'Element' here Collection.Element or ChemistryExample.Element?
  var x: any Collection where Element == String
}

Eliding the name where not needed may be acceptable, but it creates another "stopping point" along the continuum, which means it's another syntax the user needs to learn. I'll grant that it's a fairly intuitive one, so that may be fine.

Part of my argument here is that explicit declaration of type parameters in angle brackets does not give total generality, especially in the case of existential constraints… or at least, I haven't seen anyone demonstrate how it would. How would you declare something like my example8 using angle brackets? Or how would you do something like this?

func doThisWithAngleBracketsPlease(x: any Publisher A, y: any Publisher B)
  where A.Input == B.Input, A.Output == B.Output

The conversion to generics would be fairly straightforward if these were using opaque some types, but how would you do it while accepting existential any types?

Or how would you do this?

struct S<Input> {
  func moreAngleBracketsPlease() -> (any Publisher A, any Publisher B)
    where A.Input == Input,
          B.Input == Input,
          A.Output == B.Output
}

I just don't see how adding angle brackets is even supposed to solve this kind of problem.
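
For contrast, the opaque (some) version of the two-parameter function above does convert to angle brackets directly. This is a minimal sketch under stated assumptions: `Stage` is a hypothetical protocol with both an Input and an Output, and `Doubler`, `Incrementer`, and `runBoth` are made-up names for illustration.

```swift
// Hypothetical protocol with both an Input and an Output associated type.
protocol Stage {
  associatedtype Input
  associatedtype Output
  func process(_ input: Input) -> Output
}

// Hypothetical conforming types, for illustration only.
struct Doubler: Stage {
  func process(_ input: Int) -> Int { input * 2 }
}

struct Incrementer: Stage {
  func process(_ input: Int) -> Int { input + 1 }
}

// The `some` version of the two-parameter function, written with angle
// brackets. The where clause can cross-reference A and B because both
// are named generic parameters.
func runBoth<A: Stage, B: Stage>(x: A, y: B, on input: A.Input) -> [A.Output]
  where A.Input == B.Input, A.Output == B.Output
{
  return [x.process(input), y.process(input)]
}

let outputs = runBoth(x: Doubler(), y: Incrementer(), on: 5)
```

Nothing in this spelling says how to keep x and y as existential boxes while still relating their associated types, which is the open question.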

(Edit: I adjusted my code examples slightly for clarity)


Swift actually still has the remnants of an older syntax for existentials, which uses angle brackets: protocol<…>. You can see it in the definition of Any as a typealias for protocol<>.

Since it’s actually a currently available syntax (very deprecated but still understood by the compiler), it could be resurrected and expanded for this use case.

That's true. Do you have any thoughts on what that might look like? I'm not seeing how protocol<> would actually help here, other than "it has angle brackets". Specifically, how would you add a constraint between multiple types, as in this example?

I can't see how you'd do it in this form…

// Where do I put `A.Input == B.Input`?
func attempt1(x: protocol<Publisher>, y: protocol<Publisher>)

Maybe something like this?

func attempt2<T, U>(x: T, y: U) 
  where T: protocol<Publisher>,
        U: protocol<Publisher>,
        T.Input == U.Input,
        T.Output == U.Output

But now we're still not getting anything from the angle brackets; it would probably be clearer to write it like this:

func attempt3<T, U>(x: T, y: U) 
  where T: any Publisher,
        U: any Publisher,
        T.Input == U.Input,
        T.Output == U.Output

Note that I'm using T: any Publisher, but that's kinda odd, because we aren't passing in a subtype of any Publisher; we'd always be passing in exactly any Publisher; we just have some additional restrictions on it. So maybe it should be T == any Publisher there. But in either case, is this a desirable direction? It still doesn't support constraints on return types, so it seems to lack generality.

In addition to Holly's post, I tried to lay out the larger picture here.


I did see that, but I think it took me a while to really digest it. I did have some questions.

(Emphasis added.)

This seems to suggest that adding a where clause here is not what you're hoping to achieve. The paragraph is specifically in the context of proposing the some Collection<Int> syntax, though, and I think what you were saying is not that where clauses in general are bad, but rather that adding the lightweight same-type syntax proposed there makes generics and same-type constraints feel more unified. Is that correct?

As you said later, the same-type constraint syntax handles case #2, and this thread (I believe) is more about case #1. I wanted to clarify, though. You said that we want "a fully general syntax that can express any constraint that a generic signature with a single type parameter could". I'm not sure I entirely understand what you mean there. Does that mean that we should not be focusing on cases where we need to constrain multiple parameters to have the same types? For example, I've used examples like this quite a bit:

func example(any Strawman A, any Strawman B)
  where A.Input == B.Input

There are two different existentials here, and we're constraining them to each other. It seems desirable to me to be able to express this kind of constraint. But this kind of constraint cannot be expressed by a generic signature at all right now. Am I tilting at the wrong windmills in this thread by trying to address this use case? The syntax proposed here does seem to address "Generalized opaque and existential type constraints" as per the title of the thread, but does not specifically align them with generics.

I'm not sure how you would actually construct two values to pass to this example() function here. It seems like it might be difficult for the type checker to reason about this kind of code.

Assume the same-type constraint syntax is accepted, would this do it?

func example(any Collection A, any Collection B)
  where A.Element == B.Element { ... }

func takeInts(a: any Collection<Int>, b: any Collection<Int>) {
  example(a, b) // Marker
}

I would assume that at the marker, the type checker can see that both parameters have Element == Int.
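
A nearby spelling can be tried without the strawman syntax. The sketch below assumes Swift 5.7 or later, with primary associated types on Collection and implicitly opened existentials. It uses some rather than any in `example`, so it shows only that the shared element type is visible at the marker, not the any-to-any case. The `Equatable` constraint and the `Bool` result are additions for illustration.

```swift
// Both parameters share the element type T. `some` stands in for the
// strawman `any Collection A` / `any Collection B` spelling.
func example<T: Equatable>(_ a: some Collection<T>, _ b: some Collection<T>) -> Bool {
  return a.elementsEqual(b)
}

func takeInts(a: any Collection<Int>, b: any Collection<Int>) -> Bool {
  // Marker: each existential argument is opened for the call, and T is
  // inferred as Int from both arguments.
  return example(a, b)
}

let left: [Int] = [1, 2, 3]
let right: ArraySlice<Int> = [0, 1, 2, 3].dropFirst()
let reversed: [Int] = [3, 2, 1]

let sameElements = takeInts(a: left, b: right)
let differentOrder = takeInts(a: left, b: reversed)
```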


Yes. I'm saying that adding a fully general where-like syntax doesn't by itself achieve our goals, because that syntax will necessarily look different from simple generic arguments, and so it will fail at the critical goal of establishing stronger ties between the features.

I mean that you can't meaningfully express those things in something that's contained within a single type. That sort of link ought to require something at a wider scope. Your example has a where clause that applies across the entire function signature, for example. Or you might express it like this (using one particular syntax that I know has been proposed, without meaning to imply anything about my own preferences):

func example<T>(any Strawman<.Input == T>, any Strawman<.Input == T>)

Unless I'm mistaken, A and B here are the same type (i.e., identical existential boxes).

If we had, say, a DictionaryProtocol<Key, Value>, then I could make sense perhaps of a function that takes two values of distinct types A and B where, say, A: DictionaryProtocol, B: DictionaryProtocol, A.Key == B.Key, A.Value == String, B.Value == Int. But since these aren't all constrained within a single type, existing features can be composed to express this:

func f<K>(_: any DictionaryProtocol<K, String>, _: any DictionaryProtocol<K, Int>)

You're right; those are likely the same type. A better example would be something like this:

func example(a: any Publisher A, b: any Publisher B)
  where A.Output == B.Input

If we accept the "light-weight same-type constraint" syntax pitched elsewhere, and if Publisher adopted it, then this could be expressed like so:

func example<T, U, V>(a: any Publisher<T, U>, b: any Publisher<U, V>)

T and V are somewhat distracting here, though, if we actually don't care about the types. So maybe it could be expressed like this:

func example2<U>(a: any Publisher<_, U>, b: any Publisher<U, _>)

But again, this only works for protocols that choose to adopt the "primary associated types" pitch that allows generic parameters to follow a protocol. If we want this to work with protocols that don't adopt that, then we need some way of attaching additional constraints to individual parameters. As far as I can see, that either requires attaching the constraints directly to the type (e.g. any Publisher<.Input = T>) or it requires being able to reference them in a where clause, and doing that pretty much requires that we be able to assign names to the various types.
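
As a sketch of what the lightweight spelling looks like when a protocol does opt in, the following assumes Swift 5.7 or later. `Step` is a hypothetical protocol that declares both of its associated types as primary; `Length`, `IsEven`, and `pipe` are made-up names. It uses some parameters and names all three types, since the `_` placeholder above is not assumed to exist.

```swift
// Hypothetical protocol that opts in to primary associated types.
protocol Step<Input, Output> {
  associatedtype Input
  associatedtype Output
  func process(_ input: Input) -> Output
}

// Hypothetical conforming types, for illustration only.
struct Length: Step {
  func process(_ input: String) -> Int { input.count }
}

struct IsEven: Step {
  func process(_ input: Int) -> Bool { input % 2 == 0 }
}

// U ties the Output of `a` to the Input of `b` without a where clause.
func pipe<T, U, V>(_ value: T, through a: some Step<T, U>, then b: some Step<U, V>) -> V {
  return b.process(a.process(value))
}

let oddLength = pipe("swift", through: Length(), then: IsEven())
let evenLength = pipe("code", through: Length(), then: IsEven())
```

A protocol that lists only some of its associated types as primary leaves the rest unreachable from this spelling.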

Having expressed these constraints, what would you proceed to do with these types or values? I’m having trouble imagining how to work with this.

I don't have a specific use case here; I'm just following from the expressed desire to have "generalized" constraints.

In this particular case, my example doesn't even make sense, because I was incorrectly thinking that Publishers had both an Input and an Output associatedtype. (They actually only have an Output type.) But if they did have both an input and output, then a simple toy example would be something like this:

protocol Pipe: Subscriber, Publisher {}

func subscribeAndLogAboutIt<T>(a: any Pipe<_, T>, b: any Pipe<T, _>) {
  a.subscribe(b)
  print("We're subscribing \(b) to \(a)")
}

Again, I'm not trying to claim that this particular use case is a real-world example; I'm just trying to extrapolate from the goals that I've seen expressed:

  • We want to be able to express constraints on the associated types of both opaque and existential types
  • Those constraints should be "generalized"—that is, they should be usable anywhere such a type is used.
  • The same constraint syntax should be applicable to both opaque types and existential types. That means that changing any to some (or vice versa) should still yield valid syntax.

If I've accurately understood the goals, then it seems like we should be able to express constraints like those I've used in the example here.

You'd want to use some Pipe<_, T> and some Pipe<T, _> instead here, no? As written, this would require implicitly opening the existential anyway before passing b as an argument.

We shouldn't be designing specifically to allow the expression of constraints that aren't usable: in fact, it'd be better if the language didn't open up possibilities that lead users to dead ends. Hence my question whether these constraints you want to express can actually be used.

True, you'd need to implicitly open at least one. That has been proposed, though I agree that accepting some Pipe is probably semantically better.

But still, this should work, since I don't think the first parameter needs to be opened:

func example<T>(a: any Pipe<_, T>, b: some Pipe<T, _>)

If the argument is that this functionality is not necessary… that's fine, we can examine that argument. But if so, then what is the outcome of that argument? Are you suggesting that it's not necessary to be able to place constraints on existentials? Or that it's not necessary to be able to establish constraints between multiple function parameters?

If we can achieve that without introducing unnecessary contortions to the language, that's fine. But, for example, we allow users to write this:

func useless(x: Never) { }

even though that function can never be called. We allow this because Never is useful as a return type, and it would introduce unnecessary complexity to the language to try to disallow it in parameter position. Instead, we promote the idea that "Never can be treated like any other type", and that makes it easier for users to understand how to use it.

Likewise, I think we should make our constraints applicable everywhere, even in cases that produce unusable call sites. Doing otherwise increases the complexity of learning how to express these constraints, because users not only have to learn the syntax of how to express them, but they also have to learn the exceptions of where that syntax is not allowed.

So if my examples should not be allowed, is there a consistent principle we can set out that underlies that decision such that constraints are still easy to learn?

How would a caller obtain a value to pass as a, other than a concrete value which would then be boxed for no reason? I suppose you could have another function which returns a result boxed as any Pipe<_, T>, but why would that function choose to box the value rather than return an opaque type?

My point is that the expressibility of unusable constraints doesn't need to be considered as part of the generalization exercise here. I'm not proposing to create exceptions in the language rules, just that we do not choose new syntax based on what can accommodate such constraints. For instance—if indeed the sort of constraints you describe are unusable—the critique that some proposed spellings can't be used to express them not only shouldn't be a blocker, it'd be actively desirable.

If there isn't a plausible use case for the sort of constraints that you describe, then the (undoubtedly significant) work to support them wouldn't be implemented in the compiler. It would be a con, not a pro, to have a "valid" syntax that users reach for to express a constraint which then the compiler still has to reject for lack of an underlying implementation. For the end user, a "valid" syntax that doesn't compile is not much different from an "invalid" syntax that doesn't compile—and surely both are inferior to a scenario where that which isn't supported also isn't expressible.


It might be a stored property in some type. That's the most common use case for any types, in my experience.

func connectPipesAndLog<T>(a: any Pipe<.Output = T>, 
                           b: some Pipe<.Input = T>) 
{
  print("We are connecting \(a) -> \(b)")
  a.subscribe(b)
}

class CaseChangingPipeHolder {
  // This is an 'any' because we allow clients to 
  // replace the input pipe at any point.
  var input: any Pipe<.Input = String> {
    didSet { reconnectToOutput() /* not shown here */ }
  }

  // This is an 'any' because its type might change
  // when clients call one of the functions below
  var output: any Pipe<.Output = String>

  func connectCapitalizingPipe() {
    // Conforms to Pipe, with Input == Output == String
    let capitalizingPipe = CapitalizingPipe()  
    
    connectPipesAndLog(a: input, b: capitalizingPipe)
    output = capitalizingPipe
  }

  func connectLowercasingPipe() {
    // Conforms to Pipe, with Input == Output == String
    let lowercasingPipe = LowercasingPipe()  
    
    connectPipesAndLog(a: input, b: lowercasingPipe)
    output = lowercasingPipe
  }
}

I'm not saying this is the most compelling use case, but if we're adding generalized constraints on opaques and existentials, I don't see what part of this is "out of bounds". If this code is crossing a line, I'm curious what exactly that line is.

I like the example, but it's also a strong argument for the parallel pitch on opening existentials, which serves this and a great many other cases. With that feature, you'd be able to just use a plain generic connectPipesAndLog, which of course has significant benefits in not requiring users to box a when they have a concrete type (with the associated performance benefits), not to mention access to the entire API surface of the concrete type for the author of the function itself. To my mind, if one could choose between having either implicitly opened existentials and a generic connectPipesAndLog or the constraints feature you describe here and an existential parameter a for connectPipesAndLog, I would recommend the first scenario every time.

As I mentioned above, it's kind of a two-part question I'm asking here: (1) is it usable at all (i.e., can users actually call these functions and can these functions do useful things with their arguments—and thanks for illustrating that it can be done, but as @John_McCall pointed out above, involving something at the wider scope of a type here), but also (2) is it useful (i.e., are there use cases served by this expressivity that cannot be equivalently or even better served written differently)?

Consider the recent move to (finally, IMO) spell existentials with any and to give a convenient shorthand for generic parameters with some. In current versions of Swift, users naturally reach for the existential box without ever needing one, and this has been a real pain point and a major ergonomic pitfall with Swift generics. Why not pessimize existential types even more or get rid of them entirely? Because, of course, they clearly serve purposes that generics and opaque types can't; sometimes you really need them. The tradeoff is that everyone who's really elbow-deep in Swift will need to master the distinction between some and any.

Is providing some "same constraint syntax" for both generic types and existential types going to reproduce the same problem, and is incurring that problem worth the hit if it isn't demonstrated that having such expressivity actually enables better code than not having it?

Although it is true that the connectPipesAndLog function could easily be generic here when the opening existential pitch comes through, @bjhomer's example still needs same-type (in this case just String) constraints on the associated types of the existentials that are stored in the class.
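
Both halves of that observation can be sketched together, assuming Swift 5.7 or later (primary associated types, constrained existentials, and implicitly opened existentials). `Source` and `Sink` are hypothetical stand-ins for the two sides of the thread's Pipe, not Combine APIs; `Constant`, `Uppercaser`, and `Holder` are made-up names.

```swift
// Hypothetical stand-ins for the two halves of the thread's Pipe.
protocol Source<Output> {
  associatedtype Output
  func emit() -> Output
}

protocol Sink<Input> {
  associatedtype Input
  func receive(_ value: Input)
}

struct Constant: Source {
  let value: String
  func emit() -> String { value }
}

final class Uppercaser: Sink {
  private(set) var received: [String] = []
  func receive(_ value: String) { received.append(value.uppercased()) }
}

// A plain generic function: no existential appears in the signature.
func connect<A: Source, B: Sink>(_ a: A, _ b: B) where A.Output == B.Input {
  b.receive(a.emit())
}

final class Holder {
  // Stored as `any` so clients can replace it with a different concrete
  // type later. The String constraint still has to be spelled on the
  // existential itself.
  var input: any Source<String>

  init(input: any Source<String>) { self.input = input }

  func connect(to sink: some Sink<String>) {
    // `input` is implicitly opened here, so the generic function applies.
    // This resolves to the free function: the method requires the `to:` label.
    connect(input, sink)
  }
}

let sink = Uppercaser()
let holder = Holder(input: Constant(value: "hi"))
holder.connect(to: sink)
holder.input = Constant(value: "there")
holder.connect(to: sink)
```

In this sketch the generic function needs no new constraint syntax, while the stored property only works because `Source` declares Output as a primary associated type.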