[Pitch] Light-weight same-type constraint syntax

Updated pitch to reflect this. Thank you for a very good point, @davedelong!

While attractive, I'm a bit wary.

I had thought that many key folks (some even on the core team?) in earlier discussions had explicitly not used angle brackets for associated types because those brackets imply generic parameters, while these protocols are pointedly not generic--a point of confusion that has been visited at some length during even the original discussions about the Generics Manifesto.

As someone who subscribes to the philosophy that similar things should look similar and different things different, I am both curious as to what has changed in the interim to put this syntax back on the table and a little worried that the concerns regarding them are legitimate and haven't been addressed.

For example, the detailed design section of this pitch immediately details how the syntax cannot appear anywhere except extension declarations and generic constraints, speaking to how borrowing the syntax of generics naturally invites users to try to use the syntactic sugar like actual generic constraints in ways that we must forbid.

Moreover, I'm concerned as to whether this lightweight syntax makes the right thing lightweight. Take the example provided as motivation:

extension Collection where ??? == String {
  func concatenate(with: Collection<String>) -> Collection<String> {
    ...
  }
}

Do we want users to reach for this method signature by analogy with the Array<String> example? It may be sort of reasonable here with the semantics of this particular method, but in the general case it could be a bit of a trap, as the argument and return value here are of existential type and the dynamic type of the return value spelled in this way would be totally unrelated to either the argument's dynamic type or that of Self--which isn't the case for Array.
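
To make the difference concrete, here is a minimal sketch in Swift as it exists today. The names `concatenated(with:)` and `erasedConcatenation(with:)` and the choice of `[String]` as the result are assumptions made for illustration, and the standard library's `AnyCollection` wrapper stands in for the existential spelling used in the pitch.

```swift
extension Collection where Element == String {
    // Generic spelling: `Other` is resolved statically at each call site,
    // and the result type is stated concretely.
    func concatenated<Other: Collection>(with other: Other) -> [String]
        where Other.Element == String
    {
        Array(self) + Array(other)
    }

    // Type-erased spelling: the wrapper hides which kind of collection
    // the caller passed in and which kind comes back.
    func erasedConcatenation(with other: AnyCollection<String>) -> AnyCollection<String> {
        AnyCollection(Array(self) + Array(other))
    }
}

let words: Set<String> = ["a"]
let joined = words.concatenated(with: ["b", "c"])
let erased = words.erasedConcatenation(with: AnyCollection(["b", "c"]))
```

In the second call the static type of the result says nothing about `Set` or `Array`, which is the kind of information loss described above.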

There's already a lot of difficulty for users in learning how to use generic constraints versus associated types, existentials versus opaque types, etc. I am not super confident that what's being made more lightweight in this pitch should be so light relative to the syntax of these other related features in trying to achieve the right balance that nudges users towards their most correct use.

28 Likes

I'm only going to address one question from your reply at the moment:

We just wanted to be explicit about the semantics here. Use of this syntax could be extended to more places, but we'd like this proposal to be more targeted, since extending it to e.g. opaque result types would require implementing another feature, as pointed out in the same section, although it feels like a natural progression.

3 Likes

I completely agree with this point, and I think the fact that reaching for existential types is far easier than generics, especially given Swift's emphasis on the power of value types and static type safety, is a really important problem to solve. Even without the sugar in this pitch, this is already a problem because programmers often abstract away concrete types with existential types without fully realizing that they're erasing important static type information, and a better solution might be to use a type parameter.

I've posted a discussion topic about exactly this, and I'd love to hear your thoughts:

4 Likes

I agree with @xwu - the difference between generic parameters and associated types is a fundamental thing. As nice as this syntax is, it makes me uncomfortable to mix the concepts this way.

The way I've always thought about it is that generic parameters are part of the type's identity - Array<Int> and Array<Bool> are entirely unrelated types in the Swift type system, despite both being Arrays.

Associated types are different - every conformer has its own version of an associated type, so they really define the way a type conforms to a protocol, rather than the protocol type itself. Collection<Int> and Collection<Bool> are not different protocols - there is just one Collection protocol, and you conform to it with one Element type.

In a constraint position, I think code which is generic over a protocol but cares very much about a particular associated type is rather rare. I think we would want people to try using broader constraints where possible, e.g. Numeric or StringProtocol over same-type Int and String constraints.
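
A small sketch of both observations, using only today's Swift. The function name `total` is an assumption for illustration.

```swift
// Generic parameters are part of a type's identity:
// Array<Int> and Array<Bool> are distinct types.
let intArray = ObjectIdentifier(Array<Int>.self)
let boolArray = ObjectIdentifier(Array<Bool>.self)
let distinct = intArray != boolArray

// There is one Collection protocol; each conformer picks its own Element.
// A broad constraint (Numeric) covers more callers than `Element == Int`.
func total<C: Collection>(_ values: C) -> C.Element where C.Element: Numeric {
    values.reduce(0, +)
}

let a = total([1, 2, 3])          // called with Array<Int>
let b = total(Set([1.5, 2.5]))    // called with Set<Double>
```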

3 Likes

I think this is a valid point. The reason this new syntax is proposed is purely pragmatic: it accounts for the fact that other languages don't have all the distinctions between protocols and generic types that Swift does, so unifying the syntax here helps us achieve multiple goals at the same time. It makes working with protocols simpler and more concise, and it unifies the syntax for generic and type constraints. In combination with other possible features mentioned in [Discussion] Easing the learning curve for introducing generic parameters - #8 by hborla, it makes the language more intuitive and helps with progressive disclosure for people switching from other languages. Also, since the syntax is the same, it helps the type-checker understand what people are trying to express and provide useful suggestions.

3 Likes

I'm fine with tackling the extension and use-site part of the issues, but I'm strongly against misusing generic parameterization on protocol declarations just for associated types. This simply kills off any potential future for generic protocols (an expert feature), and to me this feels like a "shut up already" move (please don't take that offensively).

As for the call-site, I still prefer the some Collection<.Element == String> spelling, because it leaves me with more control over the type of constraints I want to create. I still can write some Collection<.Element == Foo, .Index == Int>, while without an explicit associated type reference, not only would I be forced to use every previous generic parameter until I reached the one for the Index associated type, but I will need to memorize the exact order of the (inherited) associated types from that protocol which would just increase the cognitive load on me.

The order of associated types never mattered, so why would we want to make that a thing, especially in combination with some sugar code?!

I strongly think that any Collection<.Element == String, .Index == Int> should be the same existential as any Collection<.Index == Int, .Element == String>

Yes, you have to type a bit more, but we’re not trying to eliminate every last unwanted character here, are we? We want to unlock a new constraining feature using protocols, existentials and opaque types.

protocol Foo {
  associatedtype A
  associatedtype B 
}

protocol Bar: Foo {
  associatedtype C
}

_: any Bar<Int> // what is the first parameter here? is it `C` or `A`?
// if it was `A` then writing an existential that constrains `C` will be a bit painful and a hit or miss
_: any Bar<_, _, Int>

// On the flip side, the referenced associated type has none of these issues,
// except being a bit more verbose
_: any Bar<.A == Int> // okay
_: any Bar<.C == Int> // okay
_: any Bar<.C == Int, .A == Int> // order does not matter
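
For comparison, a sketch of how the same constraints are written with `where` clauses in today's Swift, where associated types are named explicitly and the order of requirements does not matter. The `Impl` type and the two function names are assumptions for illustration.

```swift
protocol Foo {
    associatedtype A
    associatedtype B
}

protocol Bar: Foo {
    associatedtype C
}

// Hypothetical conformer used only to exercise the constraints.
struct Impl: Bar {
    typealias A = Int
    typealias B = String
    typealias C = Int
}

// The same two requirements, listed in opposite order.
func describeAC<T: Bar>(_ value: T) -> String where T.A == Int, T.C == Int {
    "A and C are Int"
}

func describeCA<T: Bar>(_ value: T) -> String where T.C == Int, T.A == Int {
    "A and C are Int"
}

let first = describeAC(Impl())
let second = describeCA(Impl())
```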

While this discussion is focused only on same-type constraints, can I ask why we shouldn't ever explore a sub-type constraint at the use-site?

Bikeshedding code:

// It's not important what concrete type `.Element` would have,
// it's only important that it conforms to `Foo`
_: some Collection<.Element: Foo, .Index == Int>

Something like this would be completely non-expressible with the syntax as pitched.
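
As a point of reference, the same mix of a conformance constraint and a same-type constraint can already be written with a generic parameter and a `where` clause. This is only a sketch: the `label` requirement on `Foo`, the `Item` type and the `labels(of:)` function are assumptions for illustration.

```swift
protocol Foo {
    var label: String { get }
}

// Hypothetical conformer used only for the example.
struct Item: Foo {
    let label: String
}

// Element only has to conform to Foo; Index has to be exactly Int.
func labels<C: Collection>(of items: C) -> [String]
    where C.Element: Foo, C.Index == Int
{
    items.map { $0.label }
}

let result = labels(of: [Item(label: "x"), Item(label: "y")])
```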

8 Likes

We have provided some pro/con analysis regarding use of the . syntax in the Alternatives Considered section. The primary reason why we didn’t pitch it instead is related to the incremental disclosure principle: we didn’t want to introduce a new variant of the angle-bracket spelling, because the goal here is not to make the language more expressive but instead to unify the concepts.

1 Like

No offense, I've read the pitched analysis, but to me this proposal still tries to jump through too many hoops. If anything, the first step to unlock the same-type constraint should be achieved by using the where clause.

typealias AnyCollection<T> = Collection where Self.Element == T

Only then we should even consider starting to think about sugar code. This approach would be truly and safely additive to the language. In fact, it's also where the Collection<.Element == T> syntax came from, as it signals exactly why the explicit reference to the associated type is needed on the call-site.

So far, from my personal point of view, the proposed sugar in the current state of the pitch would not improve anything except unlocking the ability to write the constraint, but the use-site will be extremely limited as not only does it seem to increase the cognitive load on me but it also will prevent future extension of general language features.

That said, in my opinion, the ability to write a few less characters and not introducing any new spelling has too many tradeoffs in the long term, which is simply not worth it.

6 Likes

What are the trade-offs you are talking about besides not being able to spell generic protocols?

I'm going to continue using the not-yet-existing any keyword for explicit protocols-as-types (existentials), just for the sake of clarity in the example code.

If we forced primary associated types, I think the call-site would become too restricted and the declaration-site too opinionated during the protocol design phase. I see no reason for having different kinds of associated types (primary vs. non-primary). I as a protocol user may want to constrain any of the associated types, and I would prefer to have that ability equally exposed through whatever the final syntax will be. As mentioned before, associated types have no specific order, but mimicking a subset of associated types (primary associated types) as generic type parameters would enforce just that.

// today
protocol Foo {
  associatedtype A
  associatedtype B 
}

// what I would expect
typealias AnyFoo_1<A> = Foo where Self.A == A
typealias AnyFoo_2<B> = Foo where Self.B == B
typealias AnyFoo_3<A, B> = Foo where Self.A == A, Self.B == B

// sugared 
any Foo<.A == SomeA>
any Foo<.B == SomeB>
any Foo<.A == SomeA, .B == SomeB>
// the order remains irrelevant
any Foo<.B == SomeB, .A == SomeA>

// ==============
// what you propose

// we're forced to decide which associated type
// is a primary one and which is not
protocol Foo_1<A> {
  associatedtype B 
}

protocol Foo_2<B> {
  associatedtype A 
}

protocol Foo_3<A, B> {}

// the issue that I see here
any Foo_1<SomeA> // cannot constrain B
any Foo_2<SomeB> // cannot constrain A
// forced to use placeholder type if I'm
// not interested in constraining A
any Foo_3<_, B> 

That said, I'd rather type a few more characters but keep the ability to be able to express the exact constraint I need when and where I need it.

If the constrained type is still too long, I can shorten it by using a typealias.

typealias AnyFoo<A, B> = any Foo<.A == A, .B == B>

If we're fine with some more bikeshedding, the sugar that I personally might be okay with could look something like this:

// explicit leading dot to indicate that those are still 
// associated types and not true generic type parameters
// on the protocol
protocol Foo<.A, .B> {}

// desugared
protocol $Foo {
  associatedtype A
  associatedtype B
}
typealias Foo<A, B> = $Foo<.A == A, .B == B>

So if I needed to regain the flexibility I need, I can opt-out and refer to the $Foo protocol directly.

The prefix $ was randomly picked and is considered as bikeshedding. Feel free to expand on this idea if you want.

6 Likes

This got me a bit off guard, but it actually makes sense.

I'm not too happy that this adds another concept (primary associatedtypes) but in a way (which is what the proposal states) it's actually putting this concept first and leaving the rest for more experienced devs, which is cool.

I'd just like to have a way to extend on a subset of primary constraints, like

protocol MyProtocol<Element, Index> {
  associatedtype Constraint
}

extension MyProtocol<Int, _> { ... }

_ has been used in the generics context as a placeholder to be inferred by the compiler, but I think the semantics fits with the pattern matching catch-all

(note: I know this can be still done with where clauses)
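
For reference, a sketch of the `where`-clause version in today's Swift, with the associated types declared the traditional way. The `elements` requirement, the `sum` property and the `Numbers` type are assumptions for illustration.

```swift
protocol MyProtocol {
    associatedtype Element
    associatedtype Index
    associatedtype Constraint
    var elements: [Element] { get }
}

// Constrains Element only; Index and Constraint stay unconstrained.
extension MyProtocol where Element == Int {
    var sum: Int { elements.reduce(0, +) }
}

// Hypothetical conformer; Element is inferred as Int from `elements`.
struct Numbers: MyProtocol {
    typealias Index = String
    typealias Constraint = Never
    let elements: [Int]
}

let sumOfNumbers = Numbers(elements: [1, 2, 3]).sum
```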

I expect many people will be cheerful when they see examples like Collection<Int>, imho this will be followed by a huge disappointment: You still won't be able to have an Array<Collection<Int>> (will you?), and that kind of issue is probably the most annoying limitation of PATs.
There is already quite a lot of confusion caused by the fact that protocols and "protocols with associated types" are actually two different kinds of beast, and I'm convinced that mixing in generics syntax as well would make the situation worse:
There would not only be two kinds of entity with identical spelling, but also two spellings for a single concept — and one of those already has a slightly different meaning!
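
For context on the `Array<Collection<Int>>` remark: the closest thing available in today's Swift is an array of the standard library's type-erasing wrapper, `AnyCollection<Int>`. A minimal sketch follows; the variable names are assumptions.

```swift
// Three different concrete collections, all with Element == Int,
// stored side by side through explicit type erasure.
let mixed: [AnyCollection<Int>] = [
    AnyCollection([1, 2]),
    AnyCollection(Set([3])),
    AnyCollection(1...3),
]

let counts = mixed.map { $0.count }
```

Each element has to be wrapped by hand, which is the inconvenience the pitched spelling would not remove on its own.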

There are other very interesting possibilities of what Protocol<T> could mean, so I don't think it's a good idea to settle with the pitched interpretation now: Topics like nesting protocols in other types and named generic parameters should be decided before starting to work on an implementation for the pitch.

12 Likes

I was initially pleased with the shorter syntax, but I don’t think this is a good goal because the concepts really are very different. In particular, any two mentions of Array<Int> refer to the same type, while any two mentions of Collection<Int> don’t; this will lead to all sorts of confusion, including the disappointments @Tino described. (And if we do make “array of constrained existential” a possible type, we definitely don’t want it to be overly convenient!)

I think the proposed change would achieve its goal of making it easier to step into certain kinds of generic programming, but at the cost of making it harder to advance beyond that point because it’s harder to see when new concepts are being introduced. On top of that, extending the Collection<Int> syntax to declarations would make it even easier to reach for existentials, and not doing so would leave a confusing and aggravating inconsistency in the language.


On the other hand, I don’t buy into this:

Collection.Element is clearly more prominent than Collection.Index, and this extends to all sorts of functors/monads/containery types that have type-level extension points in addition to their primary content type. I would welcome syntax that recognizes this distinction but doesn’t fully conflate associated types with generic parameters.

1 Like

I agree that there are cases where certain associated types are "more important" than others — but I don't think that there is a clear distinction in general. This might even shift in the lifetime of a library, so it's another tough decision for authors to make on behalf of their users.

Even when it's obvious that some associated types are not primary, I don't think an annotation has value on its own; actually, I'd say it's also not necessary to have special documentation for this, because users will realise what they need anyway.

In general, I'm in favour of making generic parameters more like function parameters: Allowing default values, and first and foremost, don't force us to hide their names (Array<Element: Int>).
If the latter would be possible, I don't think adding a new spelling for associated types would make sense at all, because people would be used to parameter names which would resolve any ambiguity.

Even for the obvious example of Collection, you might not be interested in Collection<Element: Int>, but rather Collection<Index: Int> (imagine you want every index divisible by a certain number, or with some other distinct features). Afaics, that wouldn't be possible with the pitched syntax (at least it would be inconvenient), so even if it's decided to "burn" the spelling Protocol<T>, the pitched way of declaring protocols would be seriously hampered.

1 Like

A big +1 on anything that @Karl and @DevAndArtist already wrote in this discussion, but most importantly (emphasis mine):

2 Likes

Generic Protocols

Generic protocols are a major gap in Swift's current generics and opportunistically giving them up in favor of a syntactic sugar for a completely different feature that could be expressed syntactically at a minimal cost using existing syntactic elements via <.T == …> would be a huge mistake, imho.

Unification

That's because the T of Array<T> and Array.Element are two completely different things serving two completely different purposes and roles in the language, which are strictly orthogonal to each other. Of course they are used and spelled differently. It would be bad if they weren't.

  • Generic arguments are passed in by the user of a generic type and provided at the time of its instantiation (i.e. Array<T> -> Array<Int>).
  • Associated types are returned out by the specific instantiation of a generic type at the time of its declaration (i.e. typealias = …).

If generic types/kinds were functions of the type system, mapping from a generic abstract type to a concrete type, then …

  • generic arguments would be the inputs of such a function.
  • associated types would be the outputs of such a function.

So not only would such a unification conflate two orthogonal concepts that cannot be used interchangeably into a single syntax, it also introduces a major inconsistency by doing so:

Every existing use of "instantiate a generic type" (i.e. pass a set of types as its input) in Swift (that I'm aware of) is using <T>.
Every existing use of "constrain a generic type" (i.e. retrieve a set of types as its output) in Swift (that I'm aware of) is using where T == _.
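
A small sketch of the two directions in today's Swift. The helper name `elementType(of:)` is an assumption for illustration.

```swift
// Input: the user of the generic type supplies the argument in angle brackets.
let numbers = Array<Int>(repeating: 0, count: 2)

// Output: the associated type is read back from whatever conforms.
func elementType<C: Collection>(of _: C) -> Any.Type {
    C.Element.self
}

let fromArray = elementType(of: numbers)   // determined by Array<Int>'s conformance
let fromString = elementType(of: "hi")     // determined by String's conformance

// Constraining the output uses a where clause, not angle brackets.
func firstInt<C: Collection>(in values: C) -> Int? where C.Element == Int {
    values.first
}

let head = firstInt(in: [7, 8])
```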


Just imagine for a second a hypothetical and admittedly silly proposal that wanted to unify the passing of inputs and the returning of outputs for a function into a single syntax and went something like this:

We propose that instead of passing values to a function via foo(42, "bar") and obtaining its output via let baz = foo(…) you can now do let foo(42, "bar", baz), where 42 and "bar" are the inputs to foo(…) and baz is its output (i.e. a local binding), resulting in a unified syntax.

A proposal like this would be shut down immediately, if not outright ridiculed, as it would conflate two orthogonal concepts (inputs vs. outputs) into a shared syntax and would effectively make it impossible to know the semantics implied by (…) without looking up the function signature. Yet it proposes effectively the same thing as this pitch: unify the syntax of the inputs and outputs of a function (on a kind), making them look the same even though they are orthogonal. It's just that the semantics of "function of types" is familiar to most of us, while the semantics of "function of kinds" is somewhat more obscure, yet really not that different. What's bad in one context is similarly bad in the other.

7 Likes

It's bad enough that Swift already promotes generic arguments to implicit associated types when their names match, which I feel might actually be the motivation behind this proposal. But the names of a type's generic arguments should not ever leak, unless one explicitly exports them as associated types.

Apart from this we already have the unfortunate situation that the same syntax can mean completely different things (and be both valid and invalid, based on the specific context), just based on whether a generic argument has been implicitly promoted to an associatedtype, or not:

protocol Foo {
    associatedtype Bar

    init(bar: Self.Bar)
    
    func bar() -> Self.Bar
}

struct DummyA<Bar>: Foo {
    var _bar: Bar
    
    init(bar: Bar) {
        self._bar = bar
    }
    
    func bar() -> Bar {
        self._bar
    }
}

struct DummyB<Bar>: Foo {
    typealias Bar = Int
    
    var _bar: Self.Bar
    
    init(bar: Self.Bar) {
        self._bar = bar
    }
    
    func bar() -> Self.Bar {
        self._bar
    }
}

This code compiles just fine.

Notice however how the Bar in DummyA and the Self.Bar in DummyB both refer to the associatedtype Bar of Foo, yet writing DummyA with the valid syntax of DummyB like this fails…

struct DummyA<Bar>: Foo {
    // This line works, for whatever reason.
    var _bar: Self.Bar
    
    // error: 'Bar' is not a member type of generic struct 'DummyA<Bar>'
    init(bar: Self.Bar) {
        self._bar = bar
    }
    
    // error: 'Bar' is not a member type of generic struct 'DummyA<Bar>'
    func bar() -> Self.Bar {
        self._bar
    }
}

… as does any attempt of writing DummyB with the valid syntax of DummyA:

// error: type 'DummyB<Bar>' does not conform to protocol 'Foo'
struct DummyB<Bar>: Foo {
    typealias Bar = Int
    
    var _bar: Bar
    
    // note: candidate has non-matching type '(bar: Bar)'
    init(bar: Bar) {
        self._bar = bar
    }
    
    // note: candidate has non-matching type '<Bar> () -> Bar'
    func bar() -> Bar {
        self._bar
    }
}

(Why these snippets don't compile doesn't matter here. What matters is that the illusion of "generic arguments and associated types are the same" simply doesn't hold water in practice.)

The code above is semantically (and syntactically) inconsistent (ironically for syntactic consistency's sake) due to exactly the same orthogonality of generic arguments vs. associated types that we're dealing with in this proposal and it will lead to very similar issues.

And to make things worse: by having unified their syntax it is now much, much more difficult for the user to understand why code that works for one doesn't work for the other or worse: why the simple addition of a typealias suddenly broke the whole type.

The implicit promotion of generic arguments to associated types is a bug, not a feature.

Let's please not use it as a justification for making yet another mess, attempting to unify the syntax around generic arguments and associated types. Two wrongs don't make a right.

8 Likes

This would also make it easier for the compiler to understand what is being implied by the syntax. With Array<Collection<Int>>, for example, it would be possible to provide a proper diagnostic and a fix-it that converts the collection to a generic parameter with a proper same-type constraint and that type-checks when applied.
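
To illustrate the kind of rewrite meant here, this is a hand-written guess at what such a conversion could produce, not actual compiler output. The function name `totalCount(of:)` is an assumption.

```swift
// Instead of a parameter spelled Array<Collection<Int>>, introduce a
// generic parameter C and state the same-type constraint on its Element.
func totalCount<C: Collection>(of groups: [C]) -> Int where C.Element == Int {
    groups.reduce(0) { $0 + $1.count }
}

let n = totalCount(of: [[1, 2], [3]])
```

Note that every element of `groups` must then have the same concrete type `C`, unlike an array of existentials.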

I think this looks really great. I fall on the pragmatic side of making my life easier and this makes things much much simpler and easier to use. Kudos. I have a couple questions:

  1. I assume this is the case, but is it only possible to name ALL of the primary associated types?
protocol MyProtocol<Element, Index> {
  associatedtype Constraint
}

For example, can you use MyProtocol<Int> for the above protocol or do you always need to type both?

  2. There's some code in the pitch that mentions returning a Collection<String>. I know it's mostly to point out that the ??? part is difficult. Or is the pitch also suggesting that you could return a Collection<String>? If so, what is that type?
extension Collection where ??? == String {
  func concatenate(with: Collection<String>) -> Collection<String> {
    ...
  }
}