[Pitch 2] Light-weight same-type requirement syntax

benrimmington · February 6, 2022, 4:36am

Slava_Pestov:

My only concern is that then it's not clear if we want to allow writing an inheritance clause on the primary associated type name itself. Eg, is this valid?
protocol Foo<T : Equatable> {
  associatedtype T
}
Or should all requirements be stated on the associated type declaration itself then?

What do you think?

I think that should be allowed, if we want <T: Equatable> and <T> where T: Equatable to be interchangeable. The following syntax is already supported:

protocol Foo where T: Equatable {
  associatedtype T
}
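
A minimal sketch of that equivalence in Swift as it stands today. The names `Bar`, `IntHolder`, `value`, `sameViaFoo`, and `sameViaBar` are made up for illustration. The requirement can be stated in a protocol-level where clause or on the associated type declaration, and generic code can rely on it either way:

```swift
// Spelling 1: requirement stated in a where clause on the protocol.
protocol Foo where T: Equatable {
    associatedtype T
    var value: T { get }
}

// Spelling 2: requirement stated on the associated type declaration.
protocol Bar {
    associatedtype T: Equatable
    var value: T { get }
}

// T is inferred as Int, which is Equatable, so both conformances hold.
struct IntHolder: Foo, Bar {
    var value: Int
}

// Generic code can use == on T through either protocol.
func sameViaFoo<F: Foo>(_ a: F, _ b: F) -> Bool { a.value == b.value }
func sameViaBar<B: Bar>(_ a: B, _ b: B) -> Bool { a.value == b.value }

let x = IntHolder(value: 1)
let y = IntHolder(value: 1)
print(sameViaFoo(x, y), sameViaBar(x, y))
```

Whether the pitched `protocol Foo<T : Equatable>` spelling should join these two is the open question above; the sketch only covers what already compiles.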

xwu · February 6, 2022, 6:06am

The use of override for protocol requirements is a Swift-internal unofficial feature, which is not publicly documented and—having never even been pitched here—does not officially exist.

It was added in the run-up to ABI stability as the counterpart to @_nonoverride (double-check the spelling against the relevant compiler source), which allowed same-name requirements in more refined protocols to have their own ABI footprint.

If I recall, there is (or was) an internal compiler flag which can be used to require every redeclaration to be explicitly annotated either overriding or non-overriding, useful for ABI checking. Otherwise, outside this dialect of Swift for internal use only, redeclarations are implicitly one or the other (not confident as to which at this hour).

Alejandro_Martinez · February 6, 2022, 10:07am

Thanks for pitching this, I'm so glad with all the improvements the type system is getting ^^

That said I'm a bit hesitant with this one, I'm in two minds.

I agree that we need a way to specify constraints on opaque return types. Like others, I would have loved to see that pitched independently of a "primary associated types" sugar.
As far as I understand, this just works for associated types privileged by the author of the protocol. This feels quite awkward. Isn't a "primary type" something that the usage side should decide? Depending on what algorithm or data structure you are writing, you may need to give the privilege of a primary type to one type or another.
If the author didn't privilege the type you need you can't use this feature at all. That seems backwards too. I think we need an unsugared form of this.
I fully agree that this introduced syntax is very appealing; it looks a lot like something Swift users are used to, and that's a win! But this also brings my main concern: how do we explain this now? For years every new Swift user has asked "how do I make a protocol generic" (even if that is not the correct name), and we had to reply with "that's not a thing in Swift, Swift has associated types in protocols that every conformance can specify...", to which you get weird looks and eventually an "ok whatever". Even last week I was on a podcast talking about Swift and its evolution, and the host asked about generic protocols. But now we have this syntax, so... how do we explain it? People will think this is what they were looking for all along, and maybe it is! But is it? I find it hard to evaluate something when there are no clear definitions, or when it is not very clear what a newcomer to Swift will expect of this syntax, especially coming from other languages. What will a user coming from Java interpret from this? Kotlin? C++? Rust? Maybe we don't care about this, but still, how do we explain this feature? Is it generic protocols? What is it if not? I've long thought that proposals should come with a section on teaching; it would also resolve most concerns people have every time something like this shows up.

Nothing is a deal breaker but I think there are some questions that are worth answering.

Tino · February 6, 2022, 12:23pm

I want to point out again that there is another huge issue which is at least connected with the lack of generic protocols (at least I think so, and so far, nobody could prove the theory wrong):
You cannot nest a protocol declaration in another type (and as there are no namespaces either, there is no way to nest protocols at all).

class Delegator {
  protocol Delegate { // sorry, does not work
    func doSomething()
  }
}

Wait, there's not even a generic declaration — why should this be related??

Well, Delegator could be generic, and nested types inherit those parameters, so this construct would be a backdoor to generic protocols. Under the current limitations, there is a conflict, and probably the easiest solution is to disallow nesting.
There are other possibilities, but I don't think there is any option that does not introduce some other kind of inconsistency, so the most elegant choice would be to lift the restriction — and if we would allow declaration of protocols in other types, disallowing generic protocols would be a quite artificial restriction.

Maybe I'm living in a completely different reality, but many topics discussed in the forum have just theoretical benefit, whereas being forced to write SomeClassDelegate instead of SomeClass.Delegate bugs me quite a lot.
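
For context, a minimal sketch of the usual workaround under the current restriction: declare the protocol at the top level and re-export it through a nested typealias. This only recovers the `Delegator.Delegate` spelling at use sites; it is not a nested declaration, and the protocol cannot see any generic parameters of the outer type. The names `DelegatorDelegate`, `Logger`, and `fire()` are hypothetical.

```swift
// Top-level protocol, named with the usual prefix convention.
protocol DelegatorDelegate: AnyObject {
    func doSomething()
}

final class Delegator {
    // Workaround: expose the top-level protocol under a nested name.
    typealias Delegate = DelegatorDelegate

    weak var delegate: Delegate?

    func fire() {
        delegate?.doSomething()
    }
}

// Conforming types can now use the nested spelling.
final class Logger: Delegator.Delegate {
    private(set) var calls = 0
    func doSomething() { calls += 1 }
}

let logger = Logger()
let delegator = Delegator()
delegator.delegate = logger
delegator.fire()
print(logger.calls)
```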

xwu · February 6, 2022, 6:38pm

By contrast, I think this is the strongest and most convincing part of the pitched feature. The claim is that Collection has, semantically, a primary associated type (i.e., Element) by which it is most usefully parameterized, just like Array does.

Protocols aren't just bags of syntax but have semantic requirements, which exist to enable particular useful generic algorithms. As such, in the same way that functions and concrete types can be usefully parameterized, so too can a protocol itself. And to your point, a conforming type can certainly be generic over another parameter separate from that of the protocols to which it conforms.

The last time that the question about protocols being nested was brought up, as I recall, there was a tangled ball of gnarly issues that needed to be settled regarding capturing types from the outer scope even for the minimal viable product. The idea that, because it would be nice to write SomeClass.Delegate instead of SomeClassDelegate, therefore we should tackle nested protocols, and therefore we need generic protocols, and therefore we should not use angle brackets for another useful improvement in the language—I don't think this is reasonable to deem in scope for this discussion. Nor, mind you, would we likely want to contemplate a design for protocol namespacing which would cause users to stumble into the complexity of captured outer type aliases and generic nested protocols simply because they want to write SomeClass.Delegate.

Tino · February 6, 2022, 7:42pm

You make the thought about nested protocols sound like a complicated and stupid argument, but I don't think that is true:
You can nest classes in other types, you can nest structs in other types, you can nest enums in other types, and (I hope — never checked that ;-) you can do the same with actors. Protocols are the one big exception, and I fail to see why this ability is less useful or more complicated than in all the other cases.

So unless there is some fundamental problem (not just "it's some work to do it") with generic protocols, which really prevents us from removing the special case in the future, we should be very careful before we give a completely new meaning to a syntax which had a strong and direct connection to a core feature of the language for years.

KeithBauerXero · February 10, 2022, 1:03am

Then you need to go and look at Rust, which has and does exactly this, and which is the "prior art" that a lot of the people in this thread (myself included) are looking at when they express distaste for the proposed syntax. I don't think you can have this discussion in good faith without understanding how Rust's generic protocols work.

Generics in Swift are something that beginners struggle a lot with, and I spend a lot of time explaining that types within angle brackets are "input types" under the control of the programmer, and associated types are "output types", under the control of the code. This syntax muddies these waters unnecessarily, conflating an associated type with a generic parameter in certain syntactic positions.

The obvious, Swift-y way to spell this seems to me to be some Collection where .Element == String, or similar. That gives lots of flexibility to write more complex where clauses, and doesn't unnecessarily constrain Swift from adopting generic protocols later. Indeed, it seems like this kind of syntax is a necessary prerequisite for the special case proposed in this pitch?

Why are we so keen on jumping two steps forward, to a place where we've irretrievably shut off a possibility for the language?

Slava_Pestov · February 10, 2022, 2:47am

I'm not sure I agree about associated types being "output types". When you're writing code inside of a generic context, there is a dependency between generic parameters and their associated types in the sense that a conformance requirement on the former induces requirements on the latter (and the same for a concrete substitution on the caller's side) but otherwise they behave identically, as abstract type parameters. You can even constrain a generic parameter to an associated type:

// these two are essentially equivalent
func foo<S : Sequence, E>(_: S) -> E where S.Element == E
func foo<S : Sequence>(_: S) -> S.Element
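
A runnable sketch of the same pair, with bodies filled in so the two can be compared. The names `firstA` and `firstB`, and the choice to return the first element as an optional, are illustrative assumptions and not part of the signatures above.

```swift
// Explicit second generic parameter E, tied to the associated type
// by a same-type requirement.
func firstA<S: Sequence, E>(_ s: S) -> E? where S.Element == E {
    var iterator = s.makeIterator()
    return iterator.next()
}

// The associated type named directly; no extra generic parameter.
func firstB<S: Sequence>(_ s: S) -> S.Element? {
    var iterator = s.makeIterator()
    return iterator.next()
}

// In both calls the result type is determined by the argument's Element.
let a: Int? = firstA([10, 20, 30])
let b: Int? = firstB([10, 20, 30])
print(a == b)
```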

To me, it seems there is a clear symmetry between Array<Int> and Sequence<Int>. I'd argue that burning the angle bracket syntax on the latter gives you something more broadly useful than multi-parameter type classes, which are an advanced feature that could easily use a different syntax.

This is itself a special case though. Imagine you want to return a tuple containing two sequences, both having the same element type. The most general way is to name the output parameters:

func foo() -> <T, U> where T : Sequence, U : Sequence, T.Element == U.Element

I'm not sure I follow.

Nothing in this pitch precludes introducing multi-parameter type classes in the future. They could use a different syntax. They could even use the same angle bracket syntax if we really wanted to, since we know at type resolution time if the identifier names a plain old protocol or a hypothetical future multi-parameter typeclass.

ksluder · February 10, 2022, 4:21am

I have been coming to this conclusion myself. It would be nice and tidy if the relationship between type parameters and associated types were isomorphic to that of function parameters and return values, but associated types seem (perhaps strictly?) more powerful.

I’m still unclear on whether all instances of generic protocols are multi-parameter typeclasses, and if so, whether they actually do suffer the fatal flaw of an ambiguous relation between the parameters:

ahti · February 10, 2022, 10:59am

I think the symmetry between Array<Int> and Sequence<Int> is appealing on the surface, but could do some damage to the learnability of generics.

One way I have repeatedly seen generics explained is this: Treat generics like there's a copy of the type/func for every possible generic parameter combination. I know this isn't how it's implemented and there are subtleties around method dispatch on values of generic type, but it's close enough to give a useful level of basic understanding.

With the proposed changes, this would no longer be quite so true. I understand yall are not trying to call this feature "generic protocols", but if both the declaration as well as the use site look exactly like generics, that distinction will be lost on most.

So now, since MyProtocol<Int> and MyProtocol<String> aren't different enough to be implemented by a type at the same time, that explanation for generics doesn't work anymore, or needs to be caveated with "you actually need to check if the thing in front of the angled brackets is a protocol in which case this isn't actually a generic generic and things behave slightly differently".

This is why I believe any feature that takes the existing generics syntax and combines it with protocols should support multiple conformances for different generic parameters, and merely using it as sugar for constraining associated types doesn't cut it.
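
A minimal sketch of the distinction, using hypothetical names (`Box`, `Container`, `Numbers`). A generic type yields a distinct type per argument, whereas a conforming type binds each associated type exactly once:

```swift
// A generic type: each generic argument yields a distinct type.
struct Box<T> {
    var value: T
}

// A protocol with an associated type: each conforming type picks
// exactly one binding for Item.
protocol Container {
    associatedtype Item
    var first: Item { get }
}

struct Numbers: Container {
    var first: Int   // Item is inferred as Int for all of Numbers
}
// Numbers cannot additionally conform to Container with Item == String;
// a type conforms to a given protocol at most once.

let intBox = Box(value: 1)          // Box<Int>
let stringBox = Box(value: "one")   // Box<String>, an unrelated type
print(type(of: intBox) == type(of: stringBox))
```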

gwendal.roue · February 10, 2022, 4:51pm

Do we know what those differences are? For example, imagine I see Foo<Int> out of context (I don't know what Foo is), and I have to use it:

What can go wrong if I think Foo is a generic type, when it actually is a protocol with a primary associated type?
What can go wrong if I think Foo is a protocol with a primary associated type, when it actually is a generic type?
What can go wrong if I do not think anything, and do not take care about distinguishing between protocols and generic types?

(These are not rhetorical questions, but genuine ones: I find your question interesting.)

Tino · February 10, 2022, 7:07pm

You'll run into trouble as soon as you want to store it:
let x: Foo<Int>, or collections of such a type are not allowed. This might even be more confusing if someone does not know of the difference at all.

benlings · February 10, 2022, 7:32pm

Aren’t these the same restrictions you would have with the bare protocol? (At least, in the future when you have to use any Foo, not the bare protocol name)

xwu · February 10, 2022, 7:32pm

Swift will require any Foo<Int>, diagnosing the issue at compile time. If a user doesn't know that Foo is a protocol instead of a concrete type, they will be informed of such. This isn't any different for a non-parameterized Bar that a user doesn't recognize as a protocol or concrete type.

@gwendal.roue's question, as I understand it, is about what can go wrong with the use of angle brackets in the manner pitched here if a user doesn't know if Foo is a generic type or parameterized protocol, and I am genuinely curious about this as well.
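
To make the comparison concrete, here is a sketch that assumes a toolchain accepting the pitched declaration syntax (Swift 5.7 and later do). `Foo`, `IntFoo`, `Box`, `read`, and `readAny` are hypothetical names.

```swift
// A protocol with a primary associated type, in the pitched syntax.
protocol Foo<Value> {
    associatedtype Value
    var value: Value { get }
}

struct IntFoo: Foo {
    var value: Int
}

// A generic type can be named directly wherever a type is expected.
struct Box<Value> {
    var value: Value
}
let box: Box<Int> = Box(value: 1)

// The parameterized protocol must be introduced with `some` or `any`.
func read(_ x: some Foo<Int>) -> Int { x.value }
func readAny(_ x: any Foo<Int>) -> Int { x.value }

print(box.value, read(IntFoo(value: 2)), readAny(IntFoo(value: 3)))
```

At the use site all three read the same way: `value` is an `Int`. The visible difference is the `some`/`any` keyword, which is the spot where the compiler would point out that `Foo` is a protocol.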

hooman · February 16, 2022, 5:15am

As stated in the previous pitch, I am in favor of this pitch and SE-0341. I think we should also keep exploring alternate ways of expressing and teaching generics to make it easier to properly teach and understand these concepts.

I believe Swift will become the language that introduces the majority of new programmers to generic programming in a few years. The argument that programmers coming to Swift are already familiar with C#/Java/C++'s take on generics will not hold true in a few years.

Although Swift has a C-like syntax to make it easier to learn, it should not be confined to it. Swift has already drifted from its initial similarity to C-like languages (it dropped C-style for loops and the ++ and -- operators, and added defer, guard, try, etc.). Nowadays, if you look at idiomatic and well-written Swift code, it clearly looks distinct and, in my opinion, much nicer than other C-like languages. It has also deeply affected other C-like languages, and they have become more like Swift. It is good for Swift to lead the way, and I believe these changes are steps in that direction.

I see these changes as more than optional syntactic sugar: I see them as the preferred way to write Swift code once they become accepted, in the same way that T? is a necessary sugar for Optional<T>. As such, I believe a deep revision of the Swift Programming Language book and other tutorials is necessary to bring these syntactic changes into the mainstream of the teaching materials for the language.

However, we should move cautiously and carefully. We no longer have the luxury of breaking changes that early Swift had. In this regard, I have a question:

Is it possible to cherry-pick these new syntactic changes into an otherwise stable branch and release the corresponding toolchains for people to try?

gwendal.roue · February 16, 2022, 12:17pm

Hello,

The proposal brilliantly explains the advantages of Collection<String> over Collection where Element == String and other syntaxes. It works very well when the primary associated type is constrained to a concrete type such as String.

The proposal does not mention it, but I expect it works as well for subclasses: Collection<Base> would match collections whose elements are Base, or a subclass of Base:

class Base { }
class Child: Base { }
func f<S: Collection<Base>>(_ elements: S) -> S { ... }
let array: [Child] = ...
f(array) // OK

Maybe this feature should be made explicit in the proposal (as expected to work, or as expected NOT to work).

Now, since we're talking about derived types, I see that the proposal does not mention protocol constraints at all. What do you think, @Slava_Pestov, of extending the light-weight same-type requirement syntax to protocols and existentials as well?

Collection<any P> // ?
Collection<some P> // ?
Collection<some Collection> // ?
Collection<some Collection<String>> // ?

I'm not asking for this to enter this proposal, but I think that the Future Directions chapter should say something about those - or clearly say that these are not intended to be supported at all. It is a very natural extension of the current proposal, and I bet users will be drooling for it sooner or later.

EDIT: I don't have any opinion about subclasses and protocols, to be honest. If the intent of the designer is to restrict the light-weight same-type requirement syntax to equality check (==), and not subtyping (:) in Collection where Element == ..., that's OK for me: the heavy syntax is still there when I want to express a complex constraint. Yet, I'm not quite happy that this is not made explicit in the proposal. Maybe such a restriction does create a problem, and an explicit sentence in a proposal would help reviewers.
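
For reference, a sketch of the two heavy spellings being contrasted, written with where clauses that compile today. `namesExact`, `namesSubclass`, and the `name` property are hypothetical. Which of the two the light-weight `Collection<Base>` form should mean is exactly the question for the proposal.

```swift
class Base {
    var name: String { "Base" }
}

final class Child: Base {
    override var name: String { "Child" }
}

// Same-type requirement (==): the element type must be exactly Base.
func namesExact<S: Collection>(_ elements: S) -> [String]
where S.Element == Base {
    elements.map { $0.name }
}

// Superclass requirement (:): the element type may be Base or a subclass.
func namesSubclass<S: Collection>(_ elements: S) -> [String]
where S.Element: Base {
    elements.map { $0.name }
}

let bases: [Base] = [Base(), Child()]
let children: [Child] = [Child(), Child()]

print(namesExact(bases))        // element type is exactly Base
print(namesSubclass(children))  // element type is a subclass of Base
```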

ktoso · February 18, 2022, 8:11am

I've been thinking about this one for quite a while especially since it really looks like what Scala always allowed one to do (which I have a lot of "prior life" experience with).

I made some time to verify if I remember things correctly and if this gives us the same / better / worse expressive power and conciseness, or if there is something to "steal" from Scala 3, which I've not worked with yet, so I needed to polish up my knowledge a little bit (3 is fairly recent, with a well proven type system calculus).

Overall:

I think this proposal is fine, +1
- I think Swift actually reads much nicer than the equivalent Scala idioms which is a nice bonus!
I would really want to lift the arbitrary "just one primary" restriction; In places where I can see myself using these it immediately is more than one type parameter. So, please, no artificial restrictions

Minor:

I don't really love the naming of "primary" associated type
- this could be one area to lean on Scala's prior art where those are just called type parameters... the same as on methods/functions. Do we need to draw a distinction here? "Primary" doesn't convey to me what this does or where it appears somehow...

Below my thoughts and double-checking if this is more, less or equally powerful and concise as Scala 2 (and 3) offer, which I consider to be a very powerful typesystem worthy to compare to and steal ideas from whenever we can

For reference, the equivalent spellings of the proposed here features in Scala (3) look like this:

editable source: Scastie - An interactive playground for Scala.

Notice that what we do in Swift with a nice where clause Scala has to dance around with defining a refinement type with the types being set like T { type A = X } that gets pretty annoying; our where clause is much nicer when the types get long.

So my primary concern to explore was if we're able to consider Require associated type names, e.g. Collection<.Element == String> after all, but after thinking more about it and comparing with Scala where the tradeoff is basically:

type parameters, have to specify them all, they are by-order
- same as the proposal's "primary associated types"
via type bindings
- same as Swift's already existing where clause; the where clause is actually much nicer already.

I ended up with the conclusion that this proposal is fine and offers enough conciseness, and when one wants the longer spelling, there's always the where clause...

Overall, +1 and I'm happy to see this -- I really hope we'll allow multiple such type parameters / primary associated types, because a single one is pretty much a show stopper for any interesting types I'd have an interest in using this feature with

Thank you for the proposal! And I hope you don't mind the bit of Scala here, but I thought it's important to compare a similar powerful typesystem and see how we compare

DevAndArtist · February 18, 2022, 1:45pm

Where? I’ve seen this argument several times in this thread but this proposal does not introduce the ability to write the where clause to achieve the same, nor is it already allowed, at least not in positions this sugar is aiming for. It’s this syntax only or nothing!

xwu · February 18, 2022, 3:52pm

As has been stated, this proposal is strongly motivated by aligning Collection<Base> with Array<Base>; it would be highly surprising if that didn’t hold with respect to classes.

Karl · February 18, 2022, 6:34pm

I think it is worth considering some of the ideas for a general syntax more closely.

The Benefits of the Alternative

These are the benefits (of the alternative syntax), as described in the proposal:

Require associated type names, e.g. Collection<.Element == String>

Explicitly writing associated type names to constrain them in angle brackets has a number of benefits:

Doesn’t require any special syntax at the protocol declaration.

Explicit associated type names allows constraining only a subset of the associated types.

The constraint syntax generalizes for all kinds of constraints e.g. <.Element: SomeProtocol>

I think this wording fails to capture precisely how big of a difference this is. To be clear, the alternative spelling including named associated types would:

Work for every protocol ever written in Swift, without any code changes needed by library authors.
Allow constraining more associated types
Allow more kinds of constraints

In other words, it would be more capable in every conceivable aspect than the syntax being proposed. The proposal itself admits it - it would scale to far more use-cases before you face "the cliff" where the new syntax can't handle your constraints and you have to rewrite everything using where clauses.

IMO, this is a really strong alternative. So I'm looking for really strong arguments against it.

But I'm not seeing them.

Argument 1: Declarations should be self-documenting

This is how the proposal argues against the alternative syntax:

There are also a number of drawbacks to this approach:

No visual clues at the protocol declaration about what associated types are useful.

Let's just be clear, here: this is talking about the declaration site; the code in the standard library containing the words public protocol Collection: Sequence { ... } or whatever. And it argues that there should be some loud, prominent notice which says "Hey! This associated type is useful -- and everything else is not!"

Essentially, this is an attempt to make protocol declarations self-documenting, and goes well beyond its remit to improve the syntax of using generics. Personally, I don't think this should be a goal; we have this amazing new documentation engine, and this is precisely what it is designed to do. It gives library authors complete freedom to curate their documentation, and decide how they present important concepts to their users. That's the place where we should communicate which associated types are "useful".

Apple has excellent, professional technical writers to handle this, so perhaps it is easy to forget: users don't learn how to use frameworks like the standard library or SwiftUI by looking at protocol declarations. They learn by reading the documentation.

I've learned from experience that for libraries with any real complexity, even smart, experienced developers will struggle to understand how a library works and how its concepts fit together unless the library's author took the time to produce well-structured documentation. It's hard (it's really hard), but there just isn't any substitute.

Argument 2: Too Verbose

The use-site may become onerous. For protocols with only one primary associated type, having to specify the name of it is unnecessarily repetitive.

Firstly, the idea that requiring a label is so terribly verbose that we should limit the expressive abilities of the language seems to be antithetical to Swift's entire design. We don't allow users to omit function argument labels, even if they are considered verbose.

This point also seems in conflict with the previous argument - why is it so important to have visual clues at the declaration site, which developers generally don't view too often, but at the same time we should remove visual clues at the usage site, which developers view all the time?

For anything other than the most selective examples of the most basic protocols, unnamed parameters are, in general, not clear. They lose that lovely feature of Swift functions that almost read like prose.
Swift does not optimise brevity over clarity.

So, "Does this proposal fit well with the feel and direction of Swift?"
No, I absolutely don't think it does.

It's hard to find examples of protocols with associated types. They're very limited today, so it seems like a lot of frameworks try to avoid them. But here is one example, from Foundation:

// Developers define new attributes by implementing AttributeKey.
@available(macOS 12, iOS 15, tvOS 15, watchOS 8, *)
public protocol AttributedStringKey {
    associatedtype Value : Hashable
    static var name : String { get }
}

This seems like a good candidate for a "primary associated type", doesn't it? Well, let's see how it looks in practice:

func tag(
  _ string: inout AttributedString,
  with: some AttributedStringKey<Int>
                                 ^^^ - huh? is this the key type?
)

func tag(
  _ string: inout AttributedString,
  with: some AttributedStringKey<.Value == Int>
                                 ^^^^^^^^^^^^^^ - Much clearer
)

The version with labels also reads much better; it accepts "some AttributedString key whose value has type Int". It has that fantastic clarity that you get from Swift functions - and I think that's far more important when we consider how easy the language is to learn, and how approachable it is for newcomers.

How about other popular protocols, like Identifiable? It also seems like a good candidate. This is what the actual declaration looks like - actually, very well documented IMO. No need for extra visual clues here (although of course, it will need to add them regardless to get the new syntax).

@available(SwiftStdlib 5.1, *)
public protocol Identifiable {

  /// A type representing the stable identity of the entity associated with
  /// an instance.
  associatedtype ID: Hashable

  /// The stable identity of the entity associated with this instance.
  var id: ID { get }
}

So how does this look if we make ID a primary associated type?

func updateItems(_ items: [some Identifiable<Int>]) { ... }
                                             ^^^
                                  Huh? Are the items Ints?
                                Are these "identifiable Int"s?

func updateItems(_ items: [some Identifiable<.ID == Int>]) { ... }
                                             ^^^^^^^^^^
                                       Oh. That is just totally clear.

And again, it reads much better. This function accepts an array of items; the type of those items conforms to Identifiable, and its ID is of type Int. They are not "identifiable Int"s, and that distinction is made clear with very little extra syntax. I would not call this "verbose".
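
For comparison, a sketch of how the same constraint is spelled with an existing where clause. `User` and `ids(of:)` are hypothetical names, and the named-constraint form above remains a proposed spelling, not something that compiles.

```swift
struct User: Identifiable {
    let id: Int
    var name: String
}

// Existing spelling of "an array of Identifiable items whose ID is Int".
func ids<T: Identifiable>(of items: [T]) -> [Int] where T.ID == Int {
    items.map { $0.id }
}

let users = [User(id: 1, name: "Ann"), User(id: 7, name: "Bob")]
print(ids(of: users))
```

The where clause names `ID` explicitly, just as the labelled form would; what differs is how far the constraint sits from the parameter it describes.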

Let's also look at some more advanced examples. What does it look like to express a 2D collection? Let's pretend we adopt the extension to allow some X to express a subtype constraint:

func test(_: some Collection<Int>) // The most basic use-case.
func test(_: some Collection<some Collection<Int>>) // Nested angle brackets.
func test(_: some Collection<some Collection<some Hashable>>) // some some you what?

The syntax quickly breaks down into a mess of nested some types and, crucially, nested angle brackets. Remember "angle bracket blindness"? Meanwhile, if we have the ability to constrain named associated types, we can avoid a lot of that:

func test(_: some Collection<.Element == Int>) // The most basic
func test(_: some Collection<.Element: Collection, .Element.Element == Int>) // No nesting. No repeat 'some's.
func test(_: some Collection<.Element: Collection, .Element.Element: Hashable>) // And it scales.

Now, this is still a complex generic signature - a collection of collections of hashable elements - but I think the second version is easier to read because I don't need to track the hierarchy in my mind to know how deep I am in the signature; there is no nesting. And the noise of all the some keywords is greatly reduced.
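
As a point of reference, the same flat, un-nested constraints can already be written with where clauses today. The sketch below uses hypothetical function names (`total`, `distinctCount`) and shows the collection-of-collections cases.

```swift
// A collection of collections of Int, with flat constraints.
func total<Rows: Collection>(_ rows: Rows) -> Int
where Rows.Element: Collection, Rows.Element.Element == Int {
    rows.reduce(0) { partial, row in partial + row.reduce(0, +) }
}

// A collection of collections of any Hashable element.
func distinctCount<Rows: Collection>(_ rows: Rows) -> Int
where Rows.Element: Collection, Rows.Element.Element: Hashable {
    var seen = Set<Rows.Element.Element>()
    for row in rows {
        for item in row {
            seen.insert(item)
        }
    }
    return seen.count
}

print(total([[1, 2], [3]]))
print(distinctCount([["a", "b"], ["b"]]))
```

The named-constraint syntax would move these same requirements next to the parameter instead of leaving them at the end of the signature.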

And if this syntax was extended to more than one associated type, I think the parameter labels become even more important. This is an example adapted from Doug's recent proposal:

func test(_: some DictionaryProtocol<some Hashable & Codable, Pair<some Codable, some Codable>>)

// vs

func test(_: some DictionaryProtocol<.Key: Hashable & Codable, .Value == Pair<some Codable, some Codable>>)
                                     ^^^^^                     ^^^^^^^^^

I think separating the Key and Value types here adds a lot of value. It is just much clearer at the point of use, like Swift function calls are. It just looks like Swift. It just fits, I think.

No, it is not the shortest, tersest possible syntax - but Swift just isn't that language. We don't optimise for brevity over clarity; we do the opposite. At least, that's how I've always understood the language.

And I think in all of the cases I've shown, those labels do actually add value. Let's not dismiss them out of hand.

Argument 3: Not a big improvement

This more verbose syntax is not as clear of an improvement over the existing syntax today, because most of the where clause is still explicitly written. This may also encourage users to specify most or all generic constraints in angle brackets at the front of a generic signature instead of in the where clause, which goes against SE-0081.

Because this is already so long, I'll refer to a previous post on this issue.

In short: there is still great value, because we bring the constraints much closer to the thing they apply to. Currently, generic signatures are chopped up with pieces at the start, middle, and end of the function signature. Consolidating them has value.

As for SE-0081? This proposal also goes against SE-0081! It's been 6 years since that proposal was accepted, and we can certainly use that experience to opt for a different direction. I don't buy the idea that we should be constrained by SE-0081.

Idea for how to proceed

IMO, we should proceed by implementing the syntax with named constraints. As we've seen, it has a lot of benefits, it is clear, and it is scalable.

We can then, as a future extension, discuss allowing unnamed parameters into that list. The way I see it, it would be like a proposal to omit function labels if the function has one parameter:

// This is the analogy:
func doSomething(with: Int) { ... }
doSomething(42)
            ^^ - removes the 'with'

func doSomething(with: some Collection<.Element == Int>)
func doSomething(with: some Collection<Int>)
                                       ^^^ - removes the '.Element =='

Personally, at this stage, I don't think I would be in favour of that, but I think it is a separate discussion we should have, with costs and benefits that we can evaluate separately.