Protocol<.AssocType == T> shorthand for combined protocol and associated type constraints without naming the constrained type

(Joe Groff) #1

Swift's notation for generic constraints generally requires naming the things being constrained; you say that a particular generic parameter conforms to a protocol by naming the generic parameter and the protocol it conforms to, like <C: Collection>, and you put further constraints on its associated types by naming them in a where clause, like C.Element: Equatable. However, opaque result types, generalized existentials, and other conceivable future language features need a way to describe constraints on a type that doesn't otherwise have a name; an opaque type is intentionally hidden from the interface, and an existential's contained type is dynamic and can change at runtime. I can see this evolving into its own major design discussion so I figured it's a good idea to spin this off from the initial opaque result types proposal.

To kick things off, I'd like to suggest borrowing another idea from Rust here: In Rust, you can use T: Trait<Assoc = Type> as shorthand for a combined constraint that T implements Trait and that T.Assoc is same-type-constrained to Type, as if you'd written T: Trait where T.Assoc = Type. This shorthand can also be used in Rust's equivalents of opaque types (impl Trait<Assoc = Type>) and existentials (dyn Trait<Assoc = Type>) in addition to generic type constraints. If we were going to do something similar for Swift, we could generalize it a bit so that the shorthand can be used for both protocol and same-type constraints on associated types and to allow constraints between associated types. The result might look something like this:

// Leading dot is used to refer to an associated type
T: Protocol<.Type == Int> // T: Protocol where T.Type == Int
T: Protocol<.TypeA == .TypeB> // T: Protocol where T.TypeA == T.TypeB
T: Protocol<.Type: OtherProtocol> // T: Protocol where T.Type: OtherProtocol

We can allow this syntax in any context where we allow generic constraints, including as part of existential, composition, or opaque result types. The shorthand can express any set of constraints we have today without having to name the constrained type. In addition to providing a usable syntax for these new language features, I think it's also a nice, more compact encoding for generic constraints in general. Being a variation on Rust's design, it aligns with another prominent language, and it also draws syntactic analogy to other languages that use generic interfaces instead of associated types to describe generic constraints among multiple types.
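As a point of comparison, each of the three shorthand forms above corresponds to a where clause that can already be written with named generic parameters. The sketch below uses standard library protocols; the function names sumOfInts, hop, and largest are invented for illustration, and the shorthand quoted in each comment is the pitched syntax, which is not valid Swift.

```swift
// Pitched shorthand (hypothetical): T: Sequence<.Element == Int>
func sumOfInts<T: Sequence>(_ values: T) -> Int where T.Element == Int {
    return values.reduce(0, +)
}

// Pitched shorthand (hypothetical): T: Collection<.Index == .Element>
// Each element is itself a valid index type, so a lookup yields another index.
func hop<T: Collection>(_ values: T, from start: T.Index) -> T.Index
    where T.Index == T.Element {
    return values[start]
}

// Pitched shorthand (hypothetical): T: Collection<.Element: Comparable>
func largest<T: Collection>(_ values: T) -> T.Element? where T.Element: Comparable {
    return values.max()
}
```

In every case the named form has to introduce T purely so the where clause has something to refer to, which is the part the shorthand would remove.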

The main alternative to this design I see is one that's come up in previous iterations of discussions about existentials and opaque result types. Many people have suggested using a standard where clause with a placeholder like _ to name the unnamable type. For an opaque result type, this would look like:

func foo() -> some Collection where _.Element == Int { ... }

and for a generalized existential, it might look like:

var myInts: Any<Collection where _.Element == Int>

This has the advantage of being an incremental extension of Swift's existing syntax, and it definitely reads better as a sentence. However, I'm not a fan of this direction for a number of reasons:

  • This overloads the _ token, something Swift has thus far managed to avoid for the most part (and a common complaint about Scala in particular).

  • For opaque result types, there's only one where clause for the entire declaration, and that where clause would commingle constraints on the opaque return type itself with constraints on the function's other generic arguments, meaning the opaque type is no longer syntactically self-contained. This is an implementation and readability challenge.

  • The magic token _ only scales to one implicitly-named opaque or existential thing. In the fullness of time, one could imagine a function supporting multiple opaque return types:

    func twoCollections() -> (some Collection, some Collection)

    or Swift also growing to support some notation on arguments:

    func duplicate(_ collection: some inout RangeReplaceableCollection) -> some RangeReplaceableCollection

    Existentials too could conceivably generalize to the point that there are multiple existentially-qualified generic types at play. Using _ doesn't answer the question of how to apply constraints individually to each anonymized type in these situations.

The Protocol<.AssocType == T> shorthand, by contrast, doesn't rely on a magic token, so it avoids overloading an existing token like _ or ascribing magic meaning to a magic identifier. Furthermore, it generalizes well to multiple anonymous things, since the set of constraints on each opaque thing can be written in a self-contained notation. Here's a table to compare the shorthand I'm proposing with the where clause placeholder:

| Feature | Protocol<...> notation | where notation |
| --- | --- | --- |
| Generic constraint | <T, U: Protocol<.A == T, .B: P>> | <T, U: Protocol> ... where U.A == T, U.B: P |
| Opaque type | func foo<T>() -> some Protocol<.A == T, .B: P> | func foo<T>() -> some Protocol where _.A == T, _.B: P |
| Existential | var x: Protocol<.A == Int, .B: P> | var x: Any<Protocol where _.A == Int, _.B: P> |
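For context on the opaque type row, here is a minimal sketch of the limitation both notations are trying to address, assuming a compiler that implements opaque result types as described in SE-0244. The function name makeInts is made up for illustration.

```swift
// The concrete type [Int] is hidden from callers behind `some Collection`.
func makeInts() -> some Collection {
    return [1, 2, 3]
}

let ints = makeInts()

// Callers only know that `ints` is some Collection. Its Element is not known
// to be Int, so `let total: Int = ints.reduce(0, +)` would not type-check.
// Operations that work for any Element are still available:
let descriptions = ints.map { "\($0)" }
let elementCount = ints.count
```

Either notation in the table would let the declaration state that Element is Int without naming the hidden type.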

I'd be interested to hear other suggestions as well as feedback on these two possible approaches. Thanks!

(Jon Shier) #2

Sorry, but can you compare examples of the different syntaxes more directly? It's hard to fully tell the difference when the proposed syntax examples are hypothetical and the others are more real.

(Matthew Johnson) #3

This looks like a nice direction! Can you post an example of how this would look when you want to constrain a type to multiple unrelated protocols and use a same type constraint to equate an associated type from one of the protocols to an associated type from a different protocol? Something like this, but I’m not sure I have it quite right:

protocol P {
    associatedtype A
}
protocol Q {
    associatedtype B
}

func f<T: (P & Q)<.A == .B>>(_ t: T) {}
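For reference, the constraint being asked about can be written with a named parameter and a where clause. This is only a sketch of the named equivalent, not an answer to where the angle brackets should go in the shorthand; the conforming type Both and the String return value are assumptions added so the example can be exercised.

```swift
protocol P {
    associatedtype A
}
protocol Q {
    associatedtype B
}

// Named form: T conforms to both P and Q, and T.A is the same type as T.B.
func f<T: P & Q>(_ t: T) -> String where T.A == T.B {
    return String(describing: T.A.self)
}

// Hypothetical conforming type whose two associated types coincide.
struct Both: P, Q {
    typealias A = Int
    typealias B = Int
}

let sharedTypeName = f(Both())
```

A type that binds A and B to different types would be rejected at the call to f.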

(Xiaodi Wu) #4

I like the idea in general, but I don't see the point of trying to shoehorn this into such an ambiguous sigil as . already is. While the keyword is admittedly long, your comment is crying out for someone to ask: why not use associatedtype to refer to an associated type?

T: Collection<associatedtype Element == Int>

(Joe Groff) #5

I tried adding a table to compare the syntaxes in some different situations. Let me know if there are particular examples you'd like highlighted.

(👑🦆) #6

I've changed my mind on this a bit recently, and I think that angle-brackets should stay well away from protocols unless we introduce true parameterisation for them. It's too confusing for users, who would likely prefer to write Collection<.Element == Int> because it's similar to other languages. We even have diagnostics for that specific mistake. Unfortunately, generics and protocols work in entirely different ways and mixing them creates a kind of conceptual mess.

Different parameterisations of a generic type (MyStruct<Int>/MyStruct<String>) are distinct in Swift. However, protocols with differently-bound associated types (say, Collection<.Element == Int> and Collection<.Element == String>) are the same protocol. But it looks like a parameterised protocol and you'd expect to be able to conform to both simultaneously.
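A small sketch of the distinction, using made-up names (Box, Container, IntBag): two parameterisations of a generic struct are different types, while a conforming type binds a protocol's associated type exactly once.

```swift
struct Box<T> {
    var value: T
}

protocol Container {
    associatedtype Item
    var items: [Item] { get }
}

// IntBag conforms to Container once, with Item inferred as Int.
struct IntBag: Container {
    var items: [Int]
}

// A second conformance binding Item to String is rejected by the compiler:
// extension IntBag: Container { var items: [String] { return [] } }

// Box<Int> and Box<String> are distinct types with distinct metatypes.
let distinctTypes = ObjectIdentifier(Box<Int>.self) != ObjectIdentifier(Box<String>.self)
```

So a spelling like Collection<.Element == Int> would describe a constraint on the single conformance, not a separate protocol to conform to.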

(Joe Groff) #7

It'd be interesting to me to hear whether people find this notation in Rust to be confusing. To me, same-type-constraining an associated type of a protocol seems perfectly isomorphic to binding the argument of a generic struct. In a language without side effects, a protocol would in effect be a lazification of a struct:

// A strict point
struct Point<T: FloatingPoint> {
  var x, y: T
}

// A conforming type can be lazily evaluated as a point
protocol Point {
  associatedtype T: FloatingPoint
  var x: T { get }
  var y: T { get }
}

In either case, any specific value can only be one specific Point<T> or Point<.T == T>, and constraining the type parameter or associated type has the same effect on the available interface on the type.
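The parallel can be sketched in compilable form. The protocol is renamed PointLike here because a struct and a protocol cannot share the name Point in one module, and the two length functions are hypothetical helpers added for illustration.

```swift
struct Point<T: FloatingPoint> {
    var x, y: T
}

protocol PointLike {
    associatedtype T: FloatingPoint
    var x: T { get }
    var y: T { get }
}

// The struct's generic parameter T serves as the associated type T.
extension Point: PointLike {}

// Binding the generic argument of the struct...
func lengthOfStruct(_ p: Point<Double>) -> Double {
    return (p.x * p.x + p.y * p.y).squareRoot()
}

// ...and same-type-constraining the associated type expose the same interface.
func lengthOfConformer<P: PointLike>(_ p: P) -> Double where P.T == Double {
    return (p.x * p.x + p.y * p.y).squareRoot()
}
```

Both function bodies are identical; only the way the Double binding is spelled differs.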

(Adrian Zubarev) #8

@Joe_Groff a few questions.

Protocol is a placeholder for a composed type, right? I do miss something in Swift like the generalized type constraints which @anandabits has pitched a few times already. That would allow more complex type constraints and things like upgrading a generic type locally, e.g. if let some = t as? T & SomeProtocol. What I mean here is that I think Protocol is a little confusing, as I do expect more types to appear at that composition position, not only protocols or classes.

Is your pitched syntax flexible enough that it can be shared? Here I would like to create a type alias that can be used as an existential. Then if required I can just add a keyword to it so it becomes an opaque type.

typealias IntCollection = Collection<.Element == Int>
opaque typealias _IntCollection: IntCollection = [Int]

To be honest I would prefer a syntax that looks more like Collection where Element == Int instead.

(Mox) #9

I might be mistaken, but Element there feels ambiguous without anything rooting it to Collection.

So alternatives would be named:
C: Collection where C.Element == Int
Or anonymous:
Collection where .Element == Int

Personally I prefer named, but since this proposal is about anonymous, I’d choose the dot syntax. It feels more logical. I understand if <> are necessary for avoiding ambiguities; otherwise I’d also prefer "where".

Underscore notation in this context is confusing. Underscore is currently used in Swift as a placeholder for a property (i.e. an instance of a type), yet here it’s referring to a type, and you’re expected to make a connection between Collection and _. The two uses don’t feel similar at all to me.

(Brent Royal-Gordon) #10

This touches on my main concern with SE-0244, which is that the some P syntax may not extend well once we add where clauses, multiple opaque returns, or other plausible features.

I'd like to suggest an alternative solution: Make some always* be an anonymous shorthand for some syntax involving named generic parameters. For instance (strawman syntax abounds here):

| Feature | Form | Syntax |
| --- | --- | --- |
| Generic parameter | Anonymous | func f1(_: some Collection) |
| Generic parameter | Named | func f1<C: Collection>(_: C) |
| Generic parameter | Where clause | func f1<C>(_: C) where C: Collection |
| Opaque result type | Anonymous | func f2() -> some Collection |
| Opaque result type | Named | func f2<result C: Collection>() -> C |
| Opaque result type | Where clause | func f2<result C>() -> C where C: Collection |
| Opaque typealias | Anonymous | typealias OpaqueCollection: some Collection = ConcreteCollection |
| Opaque typealias | Named | typealias OpaqueCollection<result C: Collection> = ConcreteCollection |
| Opaque typealias | Where clause | typealias OpaqueCollection<result C> = ConcreteCollection where C: Collection |
| Generalized existential | Anonymous | Any<some Collection> |
| Generalized existential | Named | Any<C: Collection> |
| Generalized existential | Where clause | Any<C where C: Collection> |

If you used a named form, you could reuse the same type in multiple positions, constrain it, etc. (Or at least you could write those things—the compiler might not support some of them.) If you used an anonymous form, you wouldn't be able to express those things, but you could always transform to a named form. We might even be able to provide a local refactoring to do it for you.

* I'm not necessarily suggesting that SE-0244 needs to be rejected because it doesn't have a named form yet, but if we went in this direction, we'd want to add one in the next release.


Is another alternative here that you could be forced to always give a name to the type? You say

but it's not immediately obvious to me if this is true. If you're forced to somehow name your opaque type then all the usual syntax can just apply directly (e.g. ignoring specific syntax, something like func foo<T>() -> some O: Protocol where O.A == T, O.B: P or func foo<T>() -> some O where O: Protocol, O.A == T, O.B: P). Does this not work in some context? Is it too onerous to have to give it a name?

(Joe Groff) #12

That's certainly part of my motivation here. A design that always attaches a name to the opaque/erased thing could work too, but I think there's some benefit to a notation that reduces the number of names a user has to think about. As they say, naming things is one of the hardest problems in computer science, and names impose a cognitive overhead on the reader to keep track of what the names represent. Notation that reduces names can reduce cognitive load; this is why we like programming languages with expression syntax instead of writing assembly language or LLVM IR, and why it's simpler to write foo(_: P) than foo<T: P>(_: T).

It seems to me that, with the proposed syntax, you should be able to express almost anything you would be able to express with where clauses, with a few open questions. As @anandabits noted, there are a few possible answers for where the <> ought to go in a protocol composition when relating associated types from different protocols. Also, if we did introduce multiple opaque types in a declaration, you would need to introduce extra generic parameters to be able to relate associated types across the opaque types, e.g. to say that two opaque arguments and their return type all return collections with the same Element, you'd write:

func concatenate<Element>(a: some Collection<.Element == Element>,
                          b: some Collection<.Element == Element>)
  -> some Collection<.Element == Element>

instead of directly same-type-constraining the three .Element associated types.
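For comparison, here is a sketch of how the same relationship is expressed today with named generic parameters. The two arguments are named A and B, their Element types are equated directly in a where clause, and the result is a concrete Array rather than an opaque type; that return type is a simplification chosen for this example.

```swift
// Named form: A.Element and B.Element are same-type-constrained directly,
// without introducing a separate Element parameter.
func concatenate<A: Collection, B: Collection>(a: A, b: B) -> [A.Element]
    where A.Element == B.Element {
    return Array(a) + Array(b)
}

// The two arguments may have different concrete collection types.
let joined = concatenate(a: 1...2, b: [3, 4])
```

The shorthand version trades the two collection names for one Element name, so the number of names in the signature goes down even though a generic parameter is still needed.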

(David Hart) #13

I really like this syntax. It resolves many issues we'll have in future proposals for opaque types and for future generalised existentials.

The only slight issue I have is that if it's used too heavily in generic constraints, it tends to make declarations less readable: I like using the <> syntax to name generic types and relegate constraints to the where clause. The shorthand does tend to make the <> part heavier.

(Adrian Zubarev) #14

Honestly I'm not completely sure I like it, as it seems that it would block generic protocols from ever being introduced. At least that is my impression if you compare the following two protocols:

protocol P {
  associatedtype T
}

protocol Q<T> { ... }

func foo(_ p: P<.T == Int>) { ... }
func bar(_ q: Q<Int>) { ... }

On the other hand if we can unambiguously use the same syntax for both features, then I think I'll be totally fine with it.

protocol Z<T> {
  associatedtype R
}

func baz(_ z: Z<Int, .R == Int>) { ... }

@regexident do you know if the latter is possible in Rust?

(Joe Groff) #15

In Rust, you can combine them. If we added generic parameters to Swift protocols, we could do the same. The leading . helps disambiguate these embedded associated type constraints from generic parameters.

(Adrian Zubarev) #16

Well then I'm sold on the syntax as this would be definitely an advantage over the previous iteration of the pitched syntax forms. ;)

(Joe Groff) #17

One next step I'd like to see after this is to allow opaque type notation for arguments as well as results, which should also help reduce the weight of generic constraints by allowing many of them to get pushed down to the arguments they constrain.

(David Hart) #18

Opaque types for arguments? How does that work? Clients can't promise to always pass the same concrete type.

(Joe Groff) #19

The analog of an opaque type to an argument would be a unique generic parameter. The type is "opaque" to the callee. You could write foo<T: P>(x: T) as foo(x: some P).
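A minimal sketch of the generic spelling, showing that the caller chooses the type separately at each call site, so no promise about always passing the same concrete type is involved. The protocol Shape, the two conforming structs, and the function describe are made-up names for illustration.

```swift
protocol Shape {
    var sides: Int { get }
}

struct Triangle: Shape {
    let sides = 3
}

struct Square: Shape {
    let sides = 4
}

// Generic spelling: T is opaque to the body of describe, which can only
// rely on the Shape interface. Each call site picks its own T.
func describe<T: Shape>(x: T) -> String {
    return "\(T.self) with \(x.sides) sides"
}

let first = describe(x: Triangle())
let second = describe(x: Square())
```

Under the idea described above, describe(x: some Shape) would be another way to write this same signature.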

(Rob Mayoff) #20

(emphasis added)

Did you mean “a combined constraint that T implements Trait and that…”?