[Pitch] Light-weight same-type constraint syntax

I expect many people will be cheerful when they see examples like Collection<Int>, but imho this will be followed by a huge disappointment: you still won't be able to have an Array<Collection<Int>> (will you?), and that kind of issue is probably the most annoying limitation of PATs.
There is already quite a lot of confusion caused by the fact that protocols and "protocols with associated types" are actually two different kinds of beast, and I'm convinced that mixing in generics syntax as well would make the situation worse:
There would not only be two kinds of entity with identical spelling, but also two spellings for a single concept — and one of those already has a slightly different meaning!

There are other very interesting possibilities of what Protocol<T> could mean, so I don't think it's a good idea to settle with the pitched interpretation now: Topics like nesting protocols in other types and named generic parameters should be decided before starting to work on an implementation for the pitch.

12 Likes

I was initially pleased with the shorter syntax, but I don’t think this is a good goal because the concepts really are very different. In particular, any two mentions of Array<Int> refer to the same type, while any two mentions of Collection<Int> don’t; this will lead to all sorts of confusion, including the disappointments @Tino described. (And if we do make “array of constrained existential” a possible type, we definitely don’t want it to be overly convenient!)
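
To make the difference concrete, here is a minimal sketch using only the where-clause spelling that exists today. The function name `total` is made up for illustration. Two mentions of Array<Int> always denote the same concrete type, while the constraint "a Collection whose Element is Int" is satisfied by several unrelated types.

```swift
// Array<Int> names exactly one concrete type.
let a: Array<Int> = [1, 2, 3]
let b: Array<Int> = [4, 5, 6]
let sameConcreteType = type(of: a) == type(of: b)

// A same-type constraint on Collection is a requirement, not a type:
// Array<Int>, Set<Int> and ClosedRange<Int> all satisfy it.
// (`total` is a hypothetical helper, not part of the pitch.)
func total<C: Collection>(_ values: C) -> Int where C.Element == Int {
    values.reduce(0, +)
}

let fromArray = total([1, 2, 3])
let fromSet = total(Set([1, 2, 3]))
let fromRange = total(1...3)
```

Each call to `total` is specialised for a different concrete collection, which is why the shorthand cannot behave like a single type the way Array<Int> does.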

I think the proposed change would achieve its goal of making it easier to step into certain kinds of generic programming, but at the cost of making it harder to advance beyond that point because it’s harder to see when new concepts are being introduced. On top of that, extending the Collection<Int> syntax to declarations would make it even easier to reach for existentials, and not doing so would leave a confusing and aggravating inconsistency in the language.


On the other hand, I don’t buy into this:

Collection.Element is clearly more prominent than Collection.Index, and this extends to all sorts of functors/monads/containery types that have type-level extension points in addition to their primary content type. I would welcome syntax that recognizes this distinction but doesn’t fully conflate associated types with generic parameters.

2 Likes

I agree that there are cases where certain associated types are "more important" than others — but I don't think that there is a clear distinction in general. This might even shift in the lifetime of a library, so it's another tough decision for authors to make on behalf of their users.

Even when it's obvious that some associated types are not primary, I don't think an annotation has value on its own; actually, I'd say it's also not necessary to have special documentation for this, because users will realise what they need anyway.

In general, I'm in favour of making generic parameters more like function parameters: allowing default values and, first and foremost, not forcing us to hide their names (Array<Element: Int>).
If the latter were possible, I don't think adding a new spelling for associated types would make sense at all, because people would be used to parameter names, which would resolve any ambiguity.

Even for the obvious example of Collection, you might not be interested in Collection<Element: Int>, but rather Collection<Index: Int> (imagine you want every index divisible by a certain number, or with some other distinct features). Afaics, that wouldn't be possible with the pitched syntax (at least it would be inconvenient), so even if it's decided to "burn" the spelling Protocol<T>, the pitched way of declaring protocols would be seriously hampered.
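
For comparison, this is roughly how constraining Index rather than Element looks with today's where clause. It is only a sketch: `everyNth` is a hypothetical helper invented for this example, not anything from the pitch or the standard library.

```swift
// Constrain the Index associated type; Element stays fully generic.
extension Collection where Index == Int {
    /// Returns the elements at every `step`-th position, counted from `startIndex`.
    /// (`everyNth` is a hypothetical name used only for this sketch.)
    func everyNth(_ step: Int) -> [Element] {
        precondition(step > 0, "step must be positive")
        return stride(from: startIndex, to: endIndex, by: step).map { self[$0] }
    }
}

let fromArray = [10, 20, 30, 40, 50].everyNth(2)
let fromSlice = [10, 20, 30, 40, 50][1...].everyNth(2)
let fromCharacters = Array("abcdef").everyNth(3)
```

A purely positional Collection<Int> would read as a constraint on Element here, so a constraint on Index alone still needs the where clause or a named form.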

1 Like

A big +1 on anything that @Karl and @DevAndArtist already wrote in this discussion, but most importantly (emphasis mine):

2 Likes

Generic Protocols

Generic protocols are a major gap in Swift's current generics. Opportunistically giving them up in favor of syntactic sugar for a completely different feature, one that could be expressed at minimal cost with existing syntactic elements via <.T == …>, would be a huge mistake, imho.

Unification

That's because the T of Array<T> and Array.Element are two completely different things serving two completely different purposes and roles in the language, which are strictly orthogonal to each other. Of course they are used and spelled differently. It would be bad if they weren't.

  • Generic arguments are passed in by the user of a generic type and provided at the time of its instantiation (i.e. Array<T> -> Array<Int>).
  • Associated types are returned out by the specific instantiation of a generic type at the time of its declaration (i.e. typealias = …).

If generic types/kinds were functions of the type system, mapping from a generic abstract type to a concrete type, then …

  • generic arguments would be the inputs of such a function.
  • associated types would be the outputs of such a function.

So not only would such a unification conflate two orthogonal concepts that cannot be used interchangeably into a single syntax, it also introduces a major inconsistency by doing so:

Every existing use of "instantiate a generic type" (i.e. pass a set of types as its input) in Swift (that I'm aware of) is using <T>.
Every existing use of "constrain a generic type" (i.e. retrieve a set of types as its output) in Swift (that I'm aware of) is using where T == _.
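
A small sketch of the two roles in current Swift, using only standard library types. The helper `firstInt` is a made-up name for illustration.

```swift
// Input: the caller picks the generic argument when instantiating the type.
let numbers = Array<Int>([1, 2, 3])

// Output: associated types are read back from the instantiated type.
// The caller never passes Index; the conforming type decides it.
let elementIsInt = Array<Int>.Element.self == Int.self
let indexIsInt = Array<Int>.Index.self == Int.self
let stringElementIsCharacter = String.Element.self == Character.self
let stringIndexIsInt = String.Index.self == Int.self

// Constraining an output uses `where T == _`, not angle brackets.
// (`firstInt` is a hypothetical helper.)
func firstInt<C: Collection>(in values: C) -> Int? where C.Element == Int {
    values.first
}

let first = firstInt(in: numbers)
```

Note that String chooses its own Index type, which is why `stringIndexIsInt` is false: an output of the conformance, not something the user supplied.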


Just imagine for a second a hypothetical and admittedly silly proposal that wanted to unify the passing of inputs and the returning of outputs for a function into a single syntax and went something like this:

We propose that instead of passing values to a function via foo(42, "bar") and obtaining its output via let baz = foo(…) you can now do let foo(42, "bar", baz), where 42 and "bar" are the inputs to foo(…) and baz is its output (i.e. a local binding), resulting in a unified syntax.

A proposal like this would be shut down immediately, if not outright ridiculed, as it would be conflating two orthogonal concepts (inputs vs. outputs) into a shared syntax and effectively would make it impossible to know the semantics implied by (…) without looking up the function signature. Even though it effectively proposes the same thing proposed here: unify syntax of inputs and outputs of a function (on a kind), making them look the same, even though they are orthogonal. It's just that the semantics of "function of types" is familiar to most of us, while the semantics of "function of kinds" is somewhat more obscure, yet really not that different. What's bad in one context is similarly bad in the other.

7 Likes

It's bad enough that Swift already promotes generic arguments to implicit associated types when their names match, which I feel might actually be the motivation behind this proposal. But the names of a type's generic arguments should not ever leak, unless one explicitly exports them as associated types.

Apart from this we already have the unfortunate situation that the same syntax can mean completely different things (and be both valid and invalid, based on the specific context), just based on whether a generic argument has been implicitly promoted to an associatedtype, or not:

protocol Foo {
    associatedtype Bar

    init(bar: Self.Bar)
    
    func bar() -> Self.Bar
}

struct DummyA<Bar>: Foo {
    var _bar: Bar
    
    init(bar: Bar) {
        self._bar = bar
    }
    
    func bar() -> Bar {
        self._bar
    }
}

struct DummyB<Bar>: Foo {
    typealias Bar = Int
    
    var _bar: Self.Bar
    
    init(bar: Self.Bar) {
        self._bar = bar
    }
    
    func bar() -> Self.Bar {
        self._bar
    }
}

This code compiles just fine.

Notice however how the Bar in DummyA and the Self.Bar in DummyB both refer to the associatedtype Bar of Foo, yet writing DummyA with the valid syntax of DummyB like this fails…

struct DummyA<Bar>: Foo {
    // This line works, for whatever reason.
    var _bar: Self.Bar
    
    // error: 'Bar' is not a member type of generic struct 'DummyA<Bar>'
    init(bar: Self.Bar) {
        self._bar = bar
    }
    
    // error: 'Bar' is not a member type of generic struct 'DummyA<Bar>'
    func bar() -> Self.Bar {
        self._bar
    }
}

… as does any attempt at writing DummyB with the valid syntax of DummyA:

// error: type 'DummyB<Bar>' does not conform to protocol 'Foo'
struct DummyB<Bar>: Foo {
    typealias Bar = Int
    
    var _bar: Bar
    
    // note: candidate has non-matching type '(bar: Bar)'
    init(bar: Bar) {
        self._bar = bar
    }
    
    // note: candidate has non-matching type '<Bar> () -> Bar'
    func bar() -> Bar {
        self._bar
    }
}

(Why these snippets don't compile doesn't matter here. What matters is that the illusion of "generic arguments and associated types are the same" simply doesn't hold water in practice.)

The code above is semantically (and syntactically) inconsistent (ironically for syntactic consistency's sake) due to exactly the same orthogonality of generic arguments vs. associated types that we're dealing with in this proposal and it will lead to very similar issues.

And to make things worse: by having unified their syntax it is now much, much more difficult for the user to understand why code that works for one doesn't work for the other or worse: why the simple addition of a typealias suddenly broke the whole type.

The implicit promotion of generic arguments to associated types is a bug, not a feature.

Let's please not use it as a justification for making yet another mess, attempting to unify the syntax around generic arguments and associated types. Two wrongs don't make a right.

8 Likes

This would also make it easier for the compiler to understand what is being implied by the syntax. With Array<Collection<Int>>, for example, it would be possible to provide a proper diagnostic and a fix-it that converts the collection to a generic parameter with a proper same-type constraint, which would type-check when applied.

I think this looks really great. I fall on the pragmatic side of making my life easier and this makes things much much simpler and easier to use. Kudos. I have a couple questions:

  1. I assume this to be the case, but is it only possible to name ALL of the primary associated types?
protocol MyProtocol<Element, Index> {
  associatedtype Constraint
}

For example, can you use MyProtocol<Int> for the above protocol or do you always need to type both?

  2. There's some code in the pitch that mentions returning a Collection<String>. I know it's mostly to point out that the ??? part is difficult. Or is the pitch also suggesting that you could return a Collection<String>? If so, what is that type?
extension Collection where ??? == String {
  func concatenate(with: Collection<String>) -> Collection<String> {
    ...
  }
}

Yes, the proposal actually mentions this explicitly: because we don't use the .<Name> = ... syntax here, we can only support all-or-nothing, matching associated types positionally.

It's not suggesting that, it's purely to show progression of generalization for concatenate.

1 Like

Thanks, makes sense. +1 on the pitch!

Nice pitch. Automatically promoting/mapping a generic type to an associatedtype significantly reduces the burden of writing where associatedtype == type constraints, and it makes it more intuitive to find out where the type is materialized; it's much more like an explicit associatedtype declaration.

I think the proposal makes sense, because it makes sense for a protocol to have what the proposal calls "primary associated type", but I don't think that the "Require associated type names" alternative is actually an alternative to this: the two complement each other perfectly, and I'd love to see them in the same proposal. Let me elaborate on this.

Both generic parameters and associated types suffer, in my opinion, from a potential problem of clarity at the usage site. Types like Array<Element> or Dictionary<Key, Value>, when specialized, are generally pretty clear: Array<Int> means "an array of ints". But many generic types are not like that, and even something as simple as Result could be potentially confusing. Consider for example:

// somewhere in the codebase
extension String: Error {}

// somewhere else
let someResult: Result<Int, String> = ...

After using Result for a while, one understands that the first generic parameter is the Success type and the second is the Failure type, but just by reading Result<Int, String> it's not clear which is which (in fact, many languages that have a similar type in their stdlib consider the second parameter as the success one).

I think it would be great if I could write, for example, Result<Int, .Failure = String>.

Let's consider a couple of libraries from the Pointfree people: swift-tagged and swift-composable-architecture.

swift-tagged is based on the very useful Tagged<Tag, RawValue> type, that can be used to replace a plain raw value with a new one that carries some type information, in order to write better self-documenting code. But the meaning and positioning of the generic parameters can be confusing. A usage example is the following:

struct User {
  let id: Tagged<User, Int>
}

What's the RawValue of id? Is it User or Int? I think it would be clearer if one could write something like let id: Tagged<Int, .Tag = User>.

swift-composable-architecture is based on the Reducer<State, Action, Environment> type, where each generic parameter represents a specific aspect of that particular Reducer. Unfortunately, when reading something like:

let x: Reducer<Int, String, [String: Int]>

it's impossible to understand what's what. It would be great if we could specify which type parameter we're specializing with which type.

This kind of reasoning applies also to protocols, and the examples in the pitch clearly show this. If I see Collection<String> in a type constraint, I immediately understand that I'm dealing with a collection of String; but there would be no point in having the Index associated type as primary, because for example Collection<String, Int> is simply confusing. But a Collection<String, .Index == Int> as type constraint would be perfect.

The distinction between primary and non-primary, when it comes to both associated types and type parameters, is very useful in my opinion, and having the option to write Collection<String> opens the possibility of using the angle brackets syntax for more stuff, like the very nice, I think, .Index == Int declaration, without the need to add anything extra for the primary types, precisely because they're primary. I think this approach would solve the drawbacks listed in the pitch, and I don't think it goes against SE-0081 that much, because the point is rendering the simplest most common case as concise and clear as possible, and progressively introducing more constraints as needed, eventually turning to the where clause in cases where a list of type constraints is sufficiently heavy to justify a change in code style.

3 Likes

I know that currently you can make a typealias for a protocol and use it as a constraint, which is pretty confusing to me (I would expect it to alias the existential), but it is actually useful.

But this leads me to an idea, which may address some of the negative feedback.

What if, instead of polluting the protocol itself with declarations of primary associated types, this information is externalised into a typealias:

typealias CollectionOf<Element> = Collection where Self.Element == Element
func foo<C: CollectionOf<Int>>(numbers: C) {}

Actually, as I mentioned before, using typealias for that purpose is confusing. IMO, this deserves a separate entity:

constraint CollectionOf<Element> = Self: Collection where Self.Element == Element
func foo<C: CollectionOf<Int>>(numbers: C) {}
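
For reference, the closest thing that compiles today is, as far as I can tell, a refining protocol. This is only a sketch: `CollectionOfInt` and `sum` are names invented for the example, and unlike the constraint idea above, every conforming type has to opt in explicitly.

```swift
// A refining protocol that bakes in the same-type constraint.
// (`CollectionOfInt` is a hypothetical name.)
protocol CollectionOfInt: Collection where Element == Int {}

// Each type has to be opted in by hand, which is the part the
// typealias/constraint idea would make unnecessary.
extension Array: CollectionOfInt where Element == Int {}
extension Set: CollectionOfInt where Element == Int {}

func sum<C: CollectionOfInt>(numbers: C) -> Int {
    numbers.reduce(0, +)
}

let list: [Int] = [1, 2, 3]
let unique: Set<Int> = [4, 5]
let listSum = sum(numbers: list)
let uniqueSum = sum(numbers: unique)
```

It is also not generic over Element: a separate protocol would be needed for every element type, so this is a workaround rather than an equivalent.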
1 Like

With any it would be clear and constraint wouldn't be required.

I'm very against this. As @regexident says, this syntax is important for a major missing Swift feature, generic protocols.

This feature is used extensively in Rust, where it forms the backbone of generic conversion and operator overloading, and is one of the things I miss most when moving between the languages. I understand that the implementation in Swift is difficult and the idea has been passed over before for various reasons, but I remain convinced that this is an eventual necessity for Swift.

The obvious interpretation of this syntax is as a generic protocol, so this usage of the syntax will inherently be misleading. You can't actually use it how it looks:

extension Data: Collection<UInt8> { ... }
extension Data: Collection<UInt32> { ... }

is still an error due to overlapping conformances.

Likewise, you can't make any number of overlapping conformances:

protocol RawBytes { ... }
extension Data: Collection<T> where T: RawBytes { ... }
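
The single-conformance restriction behind both snippets can be seen today with an ordinary associated type. The following is a sketch with made-up names (`ConvertibleFrom`, `Meters`, `convert`), loosely modelled on the kind of conversion protocol Rust expresses with a generic trait.

```swift
// A conversion protocol with an associated type instead of a generic parameter.
// (All names here are hypothetical.)
protocol ConvertibleFrom {
    associatedtype Source
    init(from source: Source)
}

struct Meters: ConvertibleFrom {
    // A type conforms to a protocol at most once, so there is exactly one Source.
    typealias Source = Int
    var value: Int

    init(from source: Int) { value = source }

    // An ordinary overload. It is not a second conformance, so generic code
    // constrained to ConvertibleFrom cannot reach it.
    init(from source: String) { value = Int(source) ?? 0 }
}

func convert<T: ConvertibleFrom>(_ source: T.Source, to type: T.Type) -> T {
    T(from: source)
}

let viaProtocol = convert(3, to: Meters.self)
let viaOverload = Meters(from: "7")
```

Adding `extension Meters: ConvertibleFrom` again with Source == String would be rejected as a redundant conformance, which is the same wall the Data examples hit.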

But also, it's lying about fundamental truths of Swift syntax. For better or worse, <T> means T is an input type, under the control of the programmer at that moment. With this proposal, <T> has an alternate meaning as a constraint of some invisible associated type to be equal to T.

I agree again with @regexident that an appropriate syntax for this is <.AssociatedType == T>, which

  • more or less matches Rust's syntax for the same purpose
  • is explicit about exactly which associated types are being constrained
  • doesn't require adding the concept of a "primary" associated type to the language
  • is flexible with regard to adding many constraints of many associated types (just use commas to separate the constraints)
  • is orthogonal to existing syntax rather than confusingly overlapping
  • allows us to keep the possibility of generic protocols open for the future (we're gonna want 'em)
17 Likes

I think it’s worth recalling that generic protocols have been listed under “Unlikely” in the Generics Manifesto since approximately the Palaeozoic era, and I have yet to see anything that suggests this is going to change. As such, it seems manifestly unreasonable to require that everyone maintains a carve-out for a feature that has effectively already been rejected.

That said, I agree that the pitched syntax would be misleading (and while I have no particular interest in generic protocols, I find the argument against them in the manifesto surprisingly weak).

1 Like

Yes, the argument against them is basically "you don't often need this", which may be fair, but when you do need it you really need it. I suspect that as Swift's generics get fleshed out a bit more, the absence will be all the more notable.

4 Likes

Right: the argument isn't that the pitched syntax conflicts with a feature that may well never be added, but rather that the pitched syntax misleadingly looks like that feature and yet doesn't work that way, and users are already confused as to what that feature is about.

To make a rough analogy, it isn't so much saying, "We shouldn't call this Full Self-Driving because what would we then call actual full self-driving?"—rather, it's: "We shouldn't call this Full Self-Driving because we have to explain in the same breath that users have to keep their hand on the steering wheel at all times to stop from driving into highway barriers." The rejoinder that, obviously, actual full self-driving doesn’t exist is a fair enough rebuttal against the first argument but bolsters the second argument.

13 Likes

In the example for extensions, it shows that the following would be valid:

extension MyProtocol<String, Int> { ... }

and

extension MyProtocol<String> { ... }

which is sugar for

extension MyProtocol where Element == String

what would the sugar look like for this:

extension MyProtocol where Index == Int

I suggest

extension MyProtocol<_, Int> { ... }

to retain the positional interface.

I would note that it also seems odd to me that

extension MyProtocol<String> { ... }

would be allowed, as partial specification of generic parameters is not permitted using the compact syntax today.
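
As a point of comparison, this is a sketch of how partial specification of a generic type is usually spelled today: with a generic typealias that fixes some parameters and leaves the rest open. `StringKeyed` is a name made up for the example.

```swift
// With angle brackets, every generic parameter has to be written out.
let full: Dictionary<String, Int> = ["a": 1]

// A generic typealias fixes Key and leaves Value open.
// (`StringKeyed` is a hypothetical name.)
typealias StringKeyed<Value> = Dictionary<String, Value>
let partial: StringKeyed<Int> = ["b": 2]

// Both spellings name the same concrete type.
let sameType = type(of: full) == type(of: partial)
```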

2 Likes

I like this idea. It's mentioned a couple of times in the proposal that, just like for generic types, either the whole signature has to be specified or angle brackets cannot be used, so using _ for some of the parameters could make sense.

3 Likes