[Pitch 2] Light-weight same-type requirement syntax

Slava_Pestov · February 4, 2022, 8:05pm

I'm hesitant to draw this kind of inference. If folks feel that constraints on opaque result types are desirable but the current syntax is bad, they can state that and suggest an alternate approach, such as named opaque result types that one can refer to from the where clause, like this:

func returnSequenceOfString() -> <S> S where S : Sequence, S.Element == String

The underlying mechanism to define requirements on opaque result types is already there, just not officially. In fact there is an experimental flag to enable named opaque result types (but I wouldn't rely on it for anything real of course).

Douglas_Gregor · February 4, 2022, 11:52pm

Sendable has the same issue, where we'd like it to act more like an effect that's carried through from the arguments to the opaque result type. I'd rather us tackle this issue with opaque result types directly, since it's independent of the sugar here.

Doug

Douglas_Gregor · February 5, 2022, 12:52am

regexident:

Even the Swift Generics Manifesto addresses "generic protocols" in the very same dismissive tone and in the same hand-wavy manner based on the weird straw-man argument that "people misuse the term" and thus the real thing is of no use to Swift:

One of the most commonly requested features is the ability to parameterize protocols themselves. […] Implicit in this feature is the ability for a given type to conform to the protocol in two different ways. […] Most of the requests for this feature actually want a different feature.

Not only is the last part evidently wrong (i.e. none of the requests to "please consider generic protocols as future addition to the language" in this or the previous pitch discussion were "actually asking for a different feature" afaict), but it also fails to provide any proper argument against generic protocols in Swift whatsoever.

I wrote that bit of the Generics Manifesto many years ago, and that last part has held constant for many years since then: the vast majority of requests we get for "parameterized protocols" or "generic protocols" specifically ask to be able to write Collection<String> to get a collection of strings, which aligns with this pitch. Let's call that interpretation #1.

There are other interpretations we could give to this syntax as well, so let's consider them.

Interpretation #2 is multi-Self protocols, as Joe points out. The canonical example is expressing conversions, e.g., ConvertibleTo<Double, Int>:

protocol ConvertibleTo<Other> {
  func asOther() -> Other   // "Self" is the type we convert from, "Other" is the type we're converting to
}

Except in Swift we use converting initializers, so maybe that should have been ConvertibleFrom<Int, Double>?

protocol ConvertibleFrom<Other> {
  init(_: Other) 
}

We have to decide what type to privilege as Self, because all of the members of the protocol are on that Self. What if you want both the initializer (on the "to" type) and the asOther operation (on the "from" type), how would you express that? Joe's straw man syntax

protocol Convertible(from: T, to: U) {
  // ... 
}

would make it clear that both the "from" and the "to" types are on equal footing, with neither being functionally dependent on the other, which is the right semantics for this protocol.
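To make the "which type is Self" question concrete, here is a minimal sketch of a converting-initializer protocol written in current Swift with an associated type. The names ConvertibleFrom, Source, Celsius, Fahrenheit and convert are illustrative assumptions, not taken from the post. Note that the "to" type is the one privileged as Self, and that generic code can only start from that side.

```swift
// Hypothetical protocol: Self is the "to" type, Source is the "from" type.
protocol ConvertibleFrom {
    associatedtype Source
    init(converting source: Source)
}

struct Celsius {
    var degrees: Double
}

struct Fahrenheit: ConvertibleFrom {
    var degrees: Double

    // Source is inferred to be Celsius from this initializer's parameter.
    init(converting source: Celsius) {
        degrees = source.degrees * 9 / 5 + 32
    }
}

// Generic code is written against the "to" type; the "from" type is
// reached through the associated type To.Source.
func convert<To: ConvertibleFrom>(_ value: To.Source) -> To {
    To(converting: value)
}

let boiling: Fahrenheit = convert(Celsius(degrees: 100))
```

With this shape there is no natural place to hang an asOther() member on Celsius, which is the asymmetry the paragraph above describes.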

Interpretation #3 is what Michael mentions in this post, where one wants to conform the same type to the same protocol in multiple different ways. There is still a primary Self type (so it's not the multi-Self case from interpretation #2), but it can bind the generic parameter in various different ways. For example, this might mean that String conforms to Collection (of Character) and Collection (of UnicodeScalar) and maybe others. Note that we could allow this with the same syntax we have today by lifting the restriction on overlapping conformances and allowing one to somehow specify that a particular member is only visible through the protocol conformance, e.g.,

extension String: Collection {
  typealias Collection.Element = Character // only visible through this conformance to Collection
  // ...
}

extension String: Collection {
  typealias Collection.Element = UnicodeScalar // only visible through this conformance to Collection
  // ...
}

There are lots of details here with ambiguity resolution and such, because when you have a primary Self type it still makes sense to say "String is a Collection" but that statement becomes ambiguous. Call sites will be able to sort through the ambiguity if you have other information, e.g.,

func find<C: Collection>(_ value: C.Element, in collection: C) -> C.Index? { ... }

let char: Character = "a"
find(char, in: "Hello, world")  // okay, only the String conformance where Element == Character matches

Indeed, you start to want to have a convenient way to say "String is a Collection where the element type is Character". The natural syntax for that is probably Collection<Character>, which is interpretation #1 that's being pitched.
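As a rough illustration of interpretation #1, the sketch below writes the same function twice: once with a named generic parameter and a same-type requirement, and once with the Collection<Character> shorthand. It assumes a toolchain where the pitched syntax and opaque parameter types are available (Swift 5.7 or later); the function names are made up for the example.

```swift
// Longhand: a named generic parameter plus a same-type requirement.
func countLonghand<C: Collection>(of target: Character, in text: C) -> Int
    where C.Element == Character
{
    text.filter { $0 == target }.count
}

// Shorthand: "some Collection whose Element is Character".
func countShorthand(of target: Character, in text: some Collection<Character>) -> Int {
    text.filter { $0 == target }.count
}

// Both accept any collection of Character, such as String or [Character].
let inString = countLonghand(of: "l", in: "Hello, world")
let inArray = countShorthand(of: "a", in: ["a", "b", "a"] as [Character])
```

The two declarations express the same requirement; the shorthand simply avoids naming C and writing the where clause.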

Perhaps you have another interpretation of "generic protocols" in mind, or disagree with my analysis of interpretations #2 and #3, but I think both #2 and #3 are better expressed with a syntax that isn't of the form "protocol A<B>", whereas #1 (this pitch) is a highly-requested shorthand.

Doug

swhitty · February 5, 2022, 1:02am

Improving the UI of generics has huge benefits for new and experienced developers alike. So +1 — I really like the direction of this pitch and think the scope is about right.

I appreciate the comments regarding the spelling with < > but I prefer this to all alternatives I have seen thus far. For better or worse, there is precedent in the language where conflation of syntax occurs e.g. inheritance and conformance.

DevAndArtist · February 5, 2022, 3:47am

Slava_Pestov:

DevAndArtist:

(A) The language already made a partial promise on the ability to create constraints using the where clause inside a typealias declaration, but it still hasn't delivered that feature to us.

This feature would need a detailed pitch outlining the precise semantics. Typealiases can appear in many places; if we're introducing a new kind of typealias we need to spell out where they're allowed and where they're forbidden, and what the behavior is in each case.

Another consideration is that a hypothetical typealias for where clauses would also not be as ergonomic as the concise syntax I'm proposing in many instances because you'd have to name the alias.

DevAndArtist:

It completely removes the flexibility which we could have with (A) and (B). What do you want me the developer to write when for instance Collection has only a single primary associated type being Element but I still want to constrain Index as well?

The syntax is not meant to replace where clauses entirely. You can still write a where clause.

With all due respect to you, Slava, and the others involved in the development of Swift, this just shows that the current pitch is a workaround + sugar code to provide the user the ability to write some Collection<Whatever> (probably even in return positions) while not actually allowing nor tackling (A), which is kinda the core of the improvements that are forcefully pushed on us. I remember the days where everyone was screaming a huge "no" to every possible sugar code proposal. Again, don't take any offense from my message, but this proposal does not truly add any new feature except a workaround one via short sugar code just to avoid (A), regardless of how cumbersome it would look for opaque result types or at parameter level.

So how exactly am I supposed to use the where clause to write these?

github.com/slavapestov/swift/blob/0177f754736c9e0ec59b589966588ed73c7a055c/test/type/parameterized_protocol.swift#L91-L97

func returnSequenceOfInt() -> some Sequence<Int> {
  return ConcreteSequence<Int>()
}

func returnSequenceOfE() -> some Sequence<E> {
  return ConcreteSequence<E>()
}

This syntax is literally forced on me, as I don't see any other alternative that I can use over which this proposal is trying to improve. So can we have the feature for (A) before we continue talking about how to apply sugar over it?

Other points outlined in the proposal do not actually add anything new to the language except that they shorten some code we have to type. In other words, if you'd exclude the opaque result type thing from this proposal, it would really not add anything new except a bit of sugar and block certain design space of the language.

Slava_Pestov · February 5, 2022, 4:09am

Syntax sugar has value if it makes code easier to read and write. An example of sugar is the inheritance clause of a generic parameter. There is really no reason for the language to allow you to write:

func reverse<S : Sequence>(_: S)

When a fully general where clause can express the requirement also:

func reverse<S>(_: S) where S : Sequence

Arguably though, the inheritance clause syntax is preferable for the restricted set of requirements it can express, as seen by its common usage in real-world Swift code. Similarly, the syntax proposed in this pitch is also shorthand that makes code easier to read and write without really giving you anything new.
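A small sketch of that equivalence, with function names invented for the example: both declarations below place the same Sequence requirement on S, and callers cannot tell them apart.

```swift
// Requirement stated in the generic parameter's inheritance clause.
func lengthInline<S: Sequence>(_ items: S) -> Int {
    items.reduce(0) { count, _ in count + 1 }
}

// The same requirement stated in a trailing where clause.
func lengthWhere<S>(_ items: S) -> Int where S: Sequence {
    items.reduce(0) { count, _ in count + 1 }
}

let inlineCount = lengthInline([10, 20, 30])
let whereCount = lengthWhere("abc")
```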

Similarly with associated types in protocols:

protocol Sequence {
  associatedtype Iterator : IteratorProtocol
}

is exactly equivalent to

protocol Sequence {
  associatedtype Iterator where Iterator : IteratorProtocol
}

Even where clauses on associated types are just sugar for where clauses on protocols; there is really no inherent reason to write

protocol Sequence {
  associatedtype Iterator : IteratorProtocol
  associatedtype Element where Element == Iterator.Element
}

when you can say

protocol Sequence where Iterator : IteratorProtocol, Element == Iterator.Element {
  associatedtype Iterator
  associatedtype Element
}

However attaching a where clause (or inheritance clause) to the associated type makes the protocol definition clearer; even more so in protocols with many associated types, such as Collection.
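To see the two protocol spellings side by side in compilable form, here is a sketch using made-up protocol names (ContainerA, ContainerB) rather than redefining Sequence. A single type conforms to both with the same members, which is what one would expect if the forms state the same requirements. The explicit Element typealias is there only to keep the example independent of associated type inference.

```swift
// Form 1: requirements attached to the associated type declarations.
protocol ContainerA {
    associatedtype Iterator: IteratorProtocol
    associatedtype Element where Element == Iterator.Element
    func makeIterator() -> Iterator
}

// Form 2: the same requirements hoisted into a protocol-level where clause.
protocol ContainerB where Iterator: IteratorProtocol, Element == Iterator.Element {
    associatedtype Iterator
    associatedtype Element
    func makeIterator() -> Iterator
}

// One set of members satisfies both protocols.
struct Countdown: ContainerA, ContainerB {
    typealias Element = Int
    let start: Int

    func makeIterator() -> IndexingIterator<[Int]> {
        Array((0...start).reversed()).makeIterator()
    }
}

// Generic code can rely on Element == Iterator.Element under either form.
func firstA<C: ContainerA>(_ container: C) -> C.Element? {
    var iterator = container.makeIterator()
    return iterator.next()
}

func firstB<C: ContainerB>(_ container: C) -> C.Element? {
    var iterator = container.makeIterator()
    return iterator.next()
}
```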

In the case of opaque result types, the sugar does give you something new, but even if we had the ability to name opaque result types and reference them from a general where clause, I think this pitch would still have value for the same reason -- it makes code easier to read and write by allowing you to avoid the where clause syntax except in more complex declarations where it is actually necessary.

I would suggest trying out the feature implemented in this proposal to see if it can clean up some of your own code, particularly when used with opaque parameter types.

Taken to the extreme, this philosophy would rule out most if not all of the changes introduced to the language since Swift 1.0. Everything is just sugar for lambda calculus once you have closures, function application and bindings.

DevAndArtist · February 5, 2022, 4:22am

Don't get me wrong, writing less code is always a good thing, but this argument hides the main issue here. This proposal is forcing a syntax on us for a feature which is not available nor will be possible with any alternative syntax after this proposal ships. You cannot call this part of the proposal sugar code, because it isn't. In reality it sneaks in a new feature, takes away a design space of the language for future development in a potentially different direction and claims to be some sugar code. Sorry, but sugar code over what? Take away the opaque result type from the proposal for a brief moment, that's the only sugar this proposal actually adds, because it's impossible to write named or anonymous opaque result types with a where clause even after the proposal ships.

I can clearly understand that there is a lot of appetite not to write named opaque result types, especially in a framework like SwiftUI which extensively makes use of it. However this is a totally different and NEW feature. It's not sugar code anymore. Let me write a single named and anonymous opaque result type with a plain where clause first, then we can talk about how to actually apply sugar over it.

Slava_Pestov · February 5, 2022, 4:25am

There's really no inherent reason for multi-parameter typeclasses to look like "generic protocols" with angle brackets after a protocol name. I still haven't seen anyone explain why instantiating protocol types with concrete arguments on the right hand side of a conformance requirement is a desirable feature, or what semantics it would have.

If you're looking for arbitrary where clauses on named opaque result types, please feel free to write up a pitch; the feature is 80% implemented behind an experimental flag on the main branch already.

DevAndArtist · February 5, 2022, 4:33am

I'm sorry I'm not a compiler developer, I wouldn't be able to finish even 1% of the remaining 20%. I find it unpleasant to be rolled over with a "if you want something else, do it / the rest yourself" counterargument. Have a great rest of your day.

ksluder · February 5, 2022, 5:36am

Douglas_Gregor:

We have to decide what type to privilege as Self, because all of the members of the protocol are on that Self. What if you want both the initializer (on the "to" type) and the asOther operation (on the "from" type), how would you express that? Joe's straw man syntax

protocol Convertible(from: T, to: U) {
  // ... 
}

would make it clear that both the "from" and the "to" types are on equal footing, with neither being functionally dependent on the other, which is the right semantics for this protocol.

This argument feels similarly structured to the argument that OOP-style methods shouldn’t exist because they awkwardly privilege one type in the definition of equal(to:), and languages should use multimethods instead. I don’t find that form of argument to be very strong.

How did we get from a distinct interpretation of multiple-conformance back to a single conformance with an associated type constraint? Are you intending to imply that interpretation #3 is in fact equivalent to interpretation #1?

Douglas_Gregor · February 5, 2022, 6:50am

The "forcing" and the "sneaks" and the "takes away" come across as frantic and antagonistic. They are neither the tone we would like to set on these forums, nor are they helping your argument.

Fundamentally, your complaint is that this proposal is not syntactic sugar because, for opaque result types, it expresses something new: the ability to provide constraints on the associated types of the opaque result type.

As far as I can tell, you aren't disagreeing with that new feature, i.e., you agree that it is useful to be able to express a result type that is "some Sequence where the Element type is String". Your disagreement seems to come in two parts:

1. You don't like taking the syntax Sequence<String> for this purpose, because it prevents us from using that syntax for something else in the future.
2. You want the ability to express a complete where clause for the opaque type, rather than this restricted form.

Regarding (1), I don't actually think we want to use this syntax for the other things that "generic protocols" could mean. Here's my attempt at enumerating those things and why they should be spelled differently. It's not enough to say that generic protocols might need this syntax later; you actually need to make a strong case that this specific syntax is the best syntax for that future feature.

Regarding (2), the full "reverse generics" feature has an explicit type parameter list on the right-hand side of the ->. For example, let's write an "unzip" of sorts:

func unzip<C: Collection, T, U>(_ collection: C) -> <R1: Collection, R2: Collection> (R1, R2)
  where C.Element == (T, U), R1.Element == T, R2.Element == U { ... }

In other words, pass in a collection whose element type is (T, U) and get two collections back, one with the T's and one with the U's. With this proposal (and SE-0341), this can be expressed as:

func unzip<T, U>(_ collection: some Collection<(T, U)>) -> (some Collection<T>, some Collection<U>)

That is so much clearer. The reverse-generics formulation isn't just more cluttered, it's forcing you to actively reason about both generic parameter lists and detangle the where clause to understand which bits affect the generic parameters left of the -> and which affect generic parameters to the right of the ->.
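For readers who want to try the shorthand form, here is one possible body for it. This is only a sketch: it assumes a Swift 5.7 or later toolchain (for `some` in parameter position and opaque types inside a tuple result), and the choice to return arrays as the underlying types is an assumption of the example, not something the post specifies.

```swift
func unzip<T, U>(
    _ collection: some Collection<(T, U)>
) -> (some Collection<T>, some Collection<U>) {
    // The concrete types ([T] and [U]) stay hidden behind the opaque results.
    let firsts = collection.map { $0.0 }
    let seconds = collection.map { $0.1 }
    return (firsts, seconds)
}

let result = unzip([(1, "one"), (2, "two")])

// Callers know only that they received collections of Int and of String.
let numbers = Array(result.0)
let names = Array(result.1)
```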

Reverse generics are a good conceptual underpinning for opaque result types that precisely matches the implementation model. Indeed, they are implemented in the compiler behind an experimental flag so we could test out all of the complicated combinations and internally desugar this pitch to that implementation. However, it is not at all clear to me that we ever want to surface the full reverse-generics models to users: you have to go very deep into the theory for the reverse-generics desugaring of this pitch to make more sense than other more-accessible ways of understanding opaque result types. This pitch covers the most common cases in a manner that we can teach.

If we did eventually get some other way to do more arbitrary where clauses, e.g., this suggestion:

then that would likely cover the expressivity gap. But I would say that this typealias solution by itself is not good enough to replace this pitch. Would we create Of-suffixed typealias versions of all of the collection protocols in the standard library? C++ did this with their type traits, from original class templates (is_same), to value forms (is_same_v) and finally concept forms (same_as), and the result is an awful mess, bloating the API with 3 names for each idea. We should not knowingly go down the same path.

If supporting an arbitrary where clause is important to be able to express in the language, then that feature needs supporting examples. And if the argument is that the need for an arbitrary where clause is so great that we should block progress on this particular pitch... then it needs to demonstrate that this pitch is going in the wrong direction, rather than just that this pitch isn't going far enough.

Doug

Douglas_Gregor · February 5, 2022, 6:52am

In Swift, we write equal(to:) as == in part because we don't want to privilege one particular type, so I'm not sure where your argument leads.

We didn't. If you allow multiple conformances, you're going to want a convenient way to talk about a specific one of those conformances, which is what this pitch does.

Doug

ksluder · February 5, 2022, 7:26am

Spelling it == doesn’t avoid having to choose between func ==(lhs: A, rhs: B) and func ==(lhs: B, rhs: A). Swift still privileges the type of the left-hand side, so you have to write definitions of both functions. Some people consider this redundancy a damning indictment of languages without multimethods.

The argument that one “really” wants “multi-Self protocols” feels like the same argument. It’s also a straw man, because Rust doesn’t implement conversions using a trait with 2 generic parameters. It models conversions using two separate traits, From<T> and Into<T>. A single generic impl<T, U> Into<U> for T provides the Into that mirrors any user-defined impl From.

That, @Slava_Pestov, is the usefulness of generic traits—though perhaps you could call it generic impls, because the arity of the impl does not match that of the trait! But even though impl Into has two type parameters, it’s very clear which is Self, because the impl is for one of them.

Edit: Maybe conditional conformances provide the equivalent functionality to this use of generic impls?

Tino · February 5, 2022, 8:51am

I don't want to waste time and energy by repeating arguments, but just to give a data point for those fighting for a cause that seems lost already: Nothing in the whole discussion has shifted my original evaluation.
The change may add some convenience for a small group, but the increase in complexity will unavoidably confuse and irritate novices ("should that associated type be primary?", "this looks like generics, why does it behave differently?"...).

The current iteration does more than allowing some minor syntactic sugar, but it still feels like a workaround for problems which have not been considered in the design of opaque result type or the "where"-clauses (if those are really such a burden, maybe they should be replaced completely?).

ksluder · February 5, 2022, 6:49pm

Let’s not discount the readability win here. Collection<Character> is easier to understand, especially if you don’t have an understanding of how associated types differ from generics, which is a fairly advanced topic that people commenting in this thread are more likely to have a pretty good handle on!

The concerns about taking away the most obvious syntax for generic protocols warrant more investigation into whether generic protocols are actually useful and whether their relationship to associated types makes spelling them similarly a source of potential confusion. From looking over the Rust examples I think the answer might in fact revolve around another feature we’ve punted on for a while: explicit specialization. I have to follow that mental trail later.

benrimmington · February 5, 2022, 9:01pm

What happens when an associated type is inherited from another protocol?

The various collection protocols override their Element type, to support associated type inference (according to FIXME comments).

public protocol Sequence<Element> {
  associatedtype Element
}

public protocol Collection<Element>: Sequence {
  override associatedtype Element
}

public protocol BidirectionalCollection<Element>: Collection {
  override associatedtype Element
}

There's a similar example in the SIMD protocols, except there's no override redeclaration.

public protocol SIMDStorage<Scalar> {
  associatedtype Scalar: Codable, Hashable
}

public protocol SIMD<Scalar>: SIMDStorage {}

My suggestion is to keep the associatedtype declarations, and then reference them within the angle brackets (as shown above).

For source compatibility, could you also allow conditional compilation of the angle brackets?

public protocol Sequence
#if compiler(>=9999)
<Element>
#endif
{
  associatedtype Element
}

Slava_Pestov · February 5, 2022, 10:08pm

Re-stating an associated type with the same name is a no-op, except if the new associated type has additional requirements in its inheritance clause or where clause. So this is fine:

protocol Collection<Element> {}
protocol BidirectionalCollection<Element> : Collection {}

The way I'm imagining it is if the sub-protocol doesn't declare a primary associated type, then the primary associated type is not inherited; so if you instead write

protocol BidirectionalCollection : Collection {}

You wouldn't be able to say BidirectionalCollection<String>.
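A compilable sketch of the first variant, using invented protocols (Container, OrderedContainer) instead of the standard library ones and assuming a Swift 5.7 or later toolchain: the sub-protocol names the inherited associated type in its own angle brackets, so it can be used with the constrained shorthand.

```swift
protocol Container<Item> {
    associatedtype Item
    var items: [Item] { get }
}

// Lists the inherited associated type as its own primary associated type.
protocol OrderedContainer<Item>: Container {
    var first: Item? { get }
}

// The generic parameter named Item serves as the associated type witness.
struct Stack<Item>: OrderedContainer {
    var items: [Item]
    var first: Item? { items.first }
}

// Works because OrderedContainer declares <Item> itself.
func sum(_ container: some OrderedContainer<Int>) -> Int {
    container.items.reduce(0, +)
}

let total = sum(Stack(items: [1, 2, 3]))
```

Dropping `<Item>` from OrderedContainer corresponds to the second variant above, where the constrained spelling on the sub-protocol would not be available.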

Hmm... I kind of like this idea; it feels more consistent in a way, and solves this problem where primary associated type declarations have a source range outside of the body of the protocol, which avoids a new special case for tooling to deal with.

My only concern is that then it's not clear if we want to allow writing an inheritance clause on the primary associated type name itself. Eg, is this valid?

protocol Foo<T : Equatable> {
  associatedtype T
}

Or should all requirements be stated on the associated type declaration itself then?

What do you think?

Unfortunately, this will itself be a new feature so it won't enable source compatibility with older compilers. It's really unfortunate that #if is not allowed in more positions; in my opinion the C preprocessor model is too lax, but we could require that #if can wrap any syntactically-valid element of the AST (so braces must match, etc).

ksluder · February 6, 2022, 12:06am

I couldn’t help drafting up another example of how generic protocols could be useful. In this case, CollidesWith<T> is a generic protocol, and a generic extension implements weapon-collision logic for all entities that have hit points:

//MARK: Collision Detection

/// A generic protocol that describes how one kind of thing reacts to colliding with another kind of thing.
protocol CollidesWith<Other> {
    var position: Vec3 { get, private(set) }
    var velocity: Vec3 { get, private(set) }
    func handleCollision(with other: Other)
}

/// If T: CollidesWith<U>, then implicitly U: CollidesWith<T>.
extension<T, U> T: CollidesWith<U> where U: CollidesWith<T> {
    func handleCollision(with other: U) {
        // Nothing happens by default.
        // Can be refined by conformers.
    }
}

// Entry point for collision system.
extension CollidesWith<Other> {
    func collide(with other: Other) {
        handleCollision(with: other)
        other.handleCollision(with: self)
    }
}

/// All the Entities that can collide with each other.
var objects: [any CollidesWith]

func physicsTick(deltaT: Int) {
    // Integrate object velocities and test for collisions.
    let collisions = gatherIntersections(among: objects, over: deltaT)
    
    for (object1, object2) in collisions {
        // This single call will handle both aspects of the collision, even if one object doesn’t react.
        object1.collide(with: object2)
    }
    
    // Apply (potentially modified) velocity.
    for object in objects {
        object.integratePosition(over: deltaT)
    }
}

//MARK: Game Specific Logic

// The world has a ground plane.
struct Ground {
    let elevation: Float
}

// Things can’t fall past the ground.
extension CollidesWith<Ground> {
    func handleCollision(with ground: Ground) {
        velocity.z = 0
    }
}

// A Weapon is anything that can cause damage.
struct Weapon {
    var damage: Int
}

// A protocol for anything that can be damaged or healed.
protocol HasHitPoints {
    var hp: Int { get, private(set) }
}

// When a weapon strikes, it causes damage.
// This is a generic extension, implemented on all types that conform to CollidesWith<Weapon>.
extension<T: HasHitPoints> T where T: CollidesWith<Weapon> {
    func handleCollision(with weapon: Weapon) {
        hp -= weapon.damage
    }
}

// A player loses when they run out of hit points.
// Notice how struct Player _only_ implements the game-over condition; weapon handling is handled by the generic extension above, but the hp setter is kept private.
struct Player: HasHitPoints, CollidesWith<Weapon>, CollidesWith<Ground> {
    private var _hp: Int
    var hp: Int {
        get { _hp }
        private(set) {
            // No resurrecting dead players!
            if _hp > 0 {
                _hp = newValue
            }
        }
        didSet {
            if _hp <= 0 {
                gameOver()
            }
        }
    }
}

zwaldowski · February 6, 2022, 12:57am

I agree in principle that Foo<Bar> etc. is probably the most readable sugar for what people want right now, but I have a few misgivings about the effects of how we get there. There are definitely a few things that rub me the wrong way about modifying the declaration site of a protocol to support a sugar.

What if the author of an API and its consumer differ in their imagination for the use cases of an opaque return type? (Or, more realistically, an author doesn't get around to it.) For example, if we decide not to pull Collection.Index into the type parameter list, what is a developer's recourse when they actually do want to use it that way? Needing to define an order of the PAT type parameters, and being limited with adding or removing ones, saddles PATs with the same things people dislike about the current syntax for generics.

I'm becoming less confident that we will be able to avoid one of:

being able to assign a name or sigil for the opaque return type to use with traditional where clauses
a variant of a where clause that binds to opaque types

If we come up with a satisfactory solution to that problem space, it's possible there may be no additional sugar needed, and the so-called lightweight syntaxes get to stay "lightweight". (There's also the idea that a solution to that problem space needs to happen, w.r.t. "sugar for something we don't have another syntax for" problem.)

Slava_Pestov · February 6, 2022, 4:00am

I'm totally onboard with eventually introducing named opaque result types, eg

func foo() -> <T, U> (T, U) where T : Sequence, U : Sequence, T.Element == U.Element

However I think that should be a separate discussion. Even if named opaque result types are introduced I still believe that the syntax in this pitch will be the more common case by far.