[Pitch 2] Light-weight same-type requirement syntax

Karl · February 18, 2022, 6:34pm

I think it is worth considering some of the ideas for a general syntax more closely.

The Benefits of the Alternative

These are the benefits (of the alternative syntax), as described in the proposal:

Require associated type names, e.g. Collection<.Element == String>

Explicitly writing associated type names to constrain them in angle brackets has a number of benefits:

Doesn’t require any special syntax at the protocol declaration.

Explicit associated type names allows constraining only a subset of the associated types.

The constraint syntax generalizes for all kinds of constraints e.g. <.Element: SomeProtocol>

I think this wording fails to capture precisely how big of a difference this is. To be clear, the alternative spelling including named associated types would:

Work for every protocol ever written in Swift, without any code changes needed by library authors.
Allow constraining more associated types
Allow more kinds of constraints

In other words, it would be more capable in every conceivable aspect than the syntax being proposed. The proposal itself admits it - it would scale to far more use-cases before you face "the cliff" where the new syntax can't handle your constraints and you have to rewrite everything using where clauses.
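For reference, here is a minimal sketch of what sits on the far side of that cliff today: a named generic parameter plus a where clause. The function names `totalLength` and `totalLengthOfAny` are hypothetical and exist only for illustration. The first uses a same-type constraint, the second a conformance constraint, which is the kind the named spelling would also cover.

```swift
// Same-type constraint. In the alternative syntax this would be
// written as `some Collection<.Element == String>`.
func totalLength<C: Collection>(of strings: C) -> Int where C.Element == String {
    // Sum the character counts of every element.
    return strings.reduce(0) { $0 + $1.count }
}

// Conformance constraint. In the alternative syntax this would be
// written as `some Collection<.Element: StringProtocol>`.
func totalLengthOfAny<C: Collection>(of strings: C) -> Int where C.Element: StringProtocol {
    return strings.reduce(0) { $0 + $1.count }
}

// An array of String satisfies the same-type constraint.
let exactTotal = totalLength(of: ["a", "bc"])

// An array of Substring only satisfies the conformance constraint.
let words: [Substring] = "hello world".split(separator: " ")
let looseTotal = totalLengthOfAny(of: words)
```

Both where clauses map one-to-one onto a named constraint inside the angle brackets, so neither would force a rewrite under the alternative spelling.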

IMO, this is a really strong alternative. So I'm looking for really strong arguments against it.

But I'm not seeing them.

Argument 1: Declarations should be self-documenting

This is how the proposal argues against the alternative syntax:

There are also a number of drawbacks to this approach:

No visual clues at the protocol declaration about what associated types are useful.

Let's just be clear, here: this is talking about the declaration site; the code in the standard library containing the words public protocol Collection: Sequence { ... } or whatever. And it argues that there should be some loud, prominent notice which says "Hey! This associated type is useful -- and everything else is not!"

Essentially, this is an attempt to make protocol declarations self-documenting, and goes well beyond its remit to improve the syntax of using generics. Personally, I don't think this should be a goal; we have this amazing new documentation engine, and this is precisely what it is designed to do. It gives library authors complete freedom to curate their documentation, and decide how they present important concepts to their users. That's the place where we should communicate which associated types are "useful".

Apple has excellent, professional technical writers to handle this, so perhaps it is easy to forget: users don't learn how to use frameworks like the standard library or SwiftUI by looking at protocol declarations. They learn by reading the documentation.

I've learned from experience that for libraries with any real complexity, even smart, experienced developers will struggle to understand how a library works and how its concepts fit together unless the library's author took the time to produce well-structured documentation. It's hard (it's really hard), but there just isn't any substitute.

Argument 2: Too Verbose

The use-site may become onerous. For protocols with only one primary associated type, having to specify the name of it is unnecessarily repetitive.

Firstly, the idea that requiring a label is so terribly verbose that we should limit the expressive abilities of the language seems to be antithetical to Swift's entire design. We don't allow users to omit function argument labels, even if they are considered verbose.

This point also seems in conflict with the previous argument - why is it so important to have visual clues at the declaration site, which developers generally don't view too often, but at the same time we should remove visual clues at the usage site, which developers view all the time?

For anything other than the most selective examples of the most basic protocols, unnamed parameters are, in general, not clear. They lose that lovely feature of Swift functions that almost read like prose.
Swift does not optimise for brevity over clarity.

So, "Does this proposal fit well with the feel and direction of Swift?"
No, I absolutely don't think it does.

It's hard to find examples of protocols with associated types. They're very limited today, so it seems like a lot of frameworks try to avoid them. But here is one example, from Foundation:

// Developers define new attributes by implementing AttributeKey.
@available(macOS 12, iOS 15, tvOS 15, watchOS 8, *)
public protocol AttributedStringKey {
    associatedtype Value : Hashable
    static var name : String { get }
}

This seems like a good candidate for a "primary associated type", doesn't it? Well, let's see how it looks in practice:

func tag(
  _ string: inout AttributedString,
  with: some AttributedStringKey<Int>
                                 ^^^ - huh? is this the key type?
)

func tag(
  _ string: inout AttributedString,
  with: some AttributedStringKey<.Value == Int>
                                 ^^^^^^^^^^^^^^ - Much clearer
)

The version with labels also reads much better; it accepts "some AttributedString key whose value has type Int". It has that fantastic clarity that you get from Swift functions - and I think that's far more important when we consider how easy the language is to learn, and how approachable it is for newcomers.

How about other popular protocols, like Identifiable? It also seems like a good candidate. This is what the actual declaration looks like - actually, very well documented IMO. No need for extra visual clues here (although of course, it will need to add them regardless to get the new syntax).

@available(SwiftStdlib 5.1, *)
public protocol Identifiable {

  /// A type representing the stable identity of the entity associated with
  /// an instance.
  associatedtype ID: Hashable

  /// The stable identity of the entity associated with this instance.
  var id: ID { get }
}

So how does this look if we make ID a primary associated type?

func updateItems(_ items: [some Identifiable<Int>]) { ... }
                                             ^^^
                                  Huh? Are the items Ints?
                                Are these "identifiable Int"s?

func updateItems(_ items: [some Identifiable<.ID == Int>]) { ... }
                                             ^^^^^^^^^^
                                       Oh. That is just totally clear.

And again, it reads much better. This function accepts an array of items; the type of those items conforms to Identifiable, and its ID is of type Int. They are not "identifiable Int"s, and that distinction is made clear with very little extra syntax. I would not call this "verbose".
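To make the distinction concrete, here is a small sketch in Swift as it exists today, using a where clause. The `User` type and the `[Int]` return value are assumptions added purely for illustration; they are not part of the example above.

```swift
// Hypothetical model type: it is Identifiable, and its ID happens to be Int.
struct User: Identifiable {
    let id: Int
    let name: String
}

// Today's spelling of `[some Identifiable<.ID == Int>]`.
// The items are not Ints; only their IDs are.
func updateItems<Item: Identifiable>(_ items: [Item]) -> [Int] where Item.ID == Int {
    // Collect the IDs to show that the constraint applies to `id`.
    return items.map { $0.id }
}

let ids = updateItems([User(id: 1, name: "A"), User(id: 2, name: "B")])
```

The where clause already names `ID` explicitly; the labelled angle-bracket form keeps that name and simply moves it next to the type it constrains.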

Let's also look at some more advanced examples. What does it look like to express a 2D collection? Let's pretend we adopt the extension to allow some X to express a subtype constraint:

func test(_: some Collection<Int>) // The most basic use-case.
func test(_: some Collection<some Collection<Int>>) // Nested angle brackets.
func test(_: some Collection<some Collection<some Hashable>>) // some some you what?

The syntax quickly breaks down into a mess of nested some types and, crucially, nested angle brackets. Remember "angle bracket blindness"? Meanwhile, if we have the ability to constrain named associated types, we can avoid a lot of that:

func test(_: some Collection<.Element == Int>) // The most basic
func test(_: some Collection<.Element: Collection, .Element.Element == Int>) // No nesting. No repeat 'some's.
func test(_: some Collection<.Element: Collection, .Element.Element: Hashable>) // And it scales.

Now, this is still a complex generic signature - a collection of collections of hashable elements - but I think the second version is easier to read because I don't need to track the hierarchy in my mind to know how deep I am in the signature; there is no nesting. And the noise of all the some keywords is greatly reduced.
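The flat, named form also lines up directly with how the same constraints are written today. A minimal sketch, assuming a hypothetical function `flattenedSum` that is not part of the examples above:

```swift
// Today's spelling of
// `some Collection<.Element: Collection, .Element.Element == Int>`.
// The constraints form a flat list, with no nested angle brackets.
func flattenedSum<Outer: Collection>(_ rows: Outer) -> Int
    where Outer.Element: Collection, Outer.Element.Element == Int
{
    var total = 0
    for row in rows {
        for value in row {
            total += value
        }
    }
    return total
}

let total = flattenedSum([[1, 2], [3]])
```

Each requirement in the where clause corresponds to one entry in the named angle-bracket list, which is why that list stays flat however deep the element types go.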

And if this syntax was extended to more than one associated type, I think the parameter labels become even more important. This is an example adapted from Doug's recent proposal:

func test(_: some DictionaryProtocol<some Hashable & Codable, Pair<some Codable, some Codable>>)

// vs

func test(_: some DictionaryProtocol<.Key: Hashable & Codable, .Value == Pair<some Codable, some Codable>>)
                                     ^^^^^                     ^^^^^^^^^

I think separating the Key and Value types here adds a lot of value. It is just much clearer at the point of use, like Swift function calls are. It just looks like Swift. It just fits, I think.

No, it is not the shortest, tersest possible syntax - but Swift just isn't that language. We don't optimise for brevity over clarity; we do the opposite. At least, that's how I've always understood the language.

And I think in all of the cases I've shown, those labels do actually add value. Let's not dismiss them out of hand.

Argument 3: Not a big improvement

This more verbose syntax is not as clear of an improvement over the existing syntax today, because most of the where clause is still explicitly written. This may also encourage users to specify most or all generic constraints in angle brackets at the front of a generic signature instead of in the where clause, which goes against SE-0081.

Because this is already so long, I'll refer to a previous post on this issue.

In short: there is still great value, because we bring the constraints much closer to the thing they apply to. Currently, generic signatures are chopped up with pieces at the start, middle, and end of the function signature. Consolidating them has value.

As for SE-0081? This proposal also goes against SE-0081! It's been 6 years since that proposal was accepted, and we can certainly use that experience to opt for a different direction. I don't buy the idea that we should be constrained by SE-0081.

Idea for how to proceed

IMO, we should proceed by implementing the syntax with named constraints. As we've seen, it has a lot of benefits, it is clear, and it is scalable.

We can then, as a future extension, discuss allowing unnamed parameters into that list. The way I see it, it would be like a proposal to omit function labels if the function has one parameter:

// This is the analogy:
func doSomething(with: Int) { ... }
doSomething(42)
            ^^ - removes the 'with'

func doSomething(with: some Collection<.Element == Int>)
func doSomething(with: some Collection<Int>)
                                       ^^^ - removes the '.Element =='
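For comparison, this is how the function half of the analogy behaves in Swift today. It is a minimal sketch; `doSomethingUnlabeled` and the return values are hypothetical additions for illustration.

```swift
// The label is part of the function's name, so callers must write it.
func doSomething(with value: Int) -> Int {
    return value * 2
}

// Only the author of the function can opt out of the label, using `_`.
func doSomethingUnlabeled(_ value: Int) -> Int {
    return value * 2
}

let labelled = doSomething(with: 21)
let unlabelled = doSomethingUnlabeled(21)

// doSomething(21) does not compile: the call is missing the 'with:' label.
```

The caller never gets to drop a label on their own; it is the declaration that decides.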

Personally, at this stage, I don't think I would be in favour of that, but I think it is a separate discussion we should have, with costs and benefits that we can evaluate separately.