[Pitch 2] Light-weight same-type requirement syntax

sveinhal · February 20, 2022, 5:03pm

Ah. Disregard my comment ^

gwendal.roue · February 20, 2022, 5:04pm

Yes, it looks like some Publisher<Int, Never> is included in the pitch. That's what I'm inferring now.

Ben_Cohen · February 20, 2022, 5:12pm

The challenge here is that this proposal is introducing two things: new language expressivity in the case of result types, and sugar for a common use case for parameter types.

They both rely on the same syntax, and as such it's appropriate to introduce them through the same proposal IMO. But it means the discussion needs to avoid crossing the streams when using arguments for or against the two parts.

In the case of the sugar, it is all about the benefit the sugar brings, as everything you can write with it you can write today. You can argue that the sugar is entirely unmerited, or you can argue that the sugar doesn't go far enough and other things should also benefit from that sugar. It comes down to judgement calls, feelings about aesthetics and how common particular use cases are. In your example, you are outlining a big leap from one form to another, and (I think) using that to say that Collection<Int> shouldn't be so well sugared, and should instead be Collection<.Element == Int> because then it's less of a leap to Collection<.Element == Int, Index: Hashable>. Is that the gist of it?
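To make the "sugar" half concrete, here is a minimal sketch, assuming a toolchain where the pitched syntax is available (Swift 5.7 or later). The function names are invented for illustration. Both declarations accept exactly the same arguments; only the spelling differs.

```swift
// Longhand: expressible today with an explicit generic signature.
func sumLonghand<C: Collection>(_ items: C) -> Int where C.Element == Int {
    items.reduce(0, +)
}

// Sugared: the pitched spelling, constraining the primary associated type.
func sumSugared(_ items: some Collection<Int>) -> Int {
    items.reduce(0, +)
}

let numbers = [1, 2, 3]
let uniqueNumbers: Set<Int> = [1, 2, 3]

let fromArray = sumLonghand(numbers)       // an Array<Int> argument
let fromSet = sumSugared(uniqueNumbers)    // a Set<Int> argument
let fromRange = sumSugared(1...3)          // a ClosedRange<Int> argument
```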

When talking about generic result types, we have a very different discussion because we are now talking not just about sugar, but about increasing expressivity of opaque result types – allowing you to return opaque types you cannot return today. So it is not just about the aesthetics and ergonomics of the sugar.

It is in this context that I am laying out my claim: opaque result types with arbitrary constraints are probably so marginally useful that it is a very low priority to add them to the language. By contrast, being able to return an opaque type constrained by a primary associated type such as Collection<Element> is a clear and immediate need for anyone producing source- or ABI-stable SDKs or packages (or even private frameworks inside an app, when many developers are working on its codebase), so should be assigned a high priority to unblock those developers.
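As an illustration of the result-type half, the sketch below assumes the pitched syntax is available (Swift 5.7 or later). `evenNumbers(upTo:)` is a hypothetical library function, not part of any SDK. The concrete return type stays hidden, yet callers can still rely on the elements being `Int`.

```swift
// Hypothetical library function: the concrete collection type is hidden,
// but the element type is part of the signature.
func evenNumbers(upTo limit: Int) -> some Collection<Int> {
    (0...limit).filter { $0.isMultiple(of: 2) }
}

let evens = evenNumbers(upTo: 10)

// These lines type-check only because Element is known to be Int.
// With a plain `some Collection` result, Element would be opaque too.
let total = evens.reduce(0, +)
let firstDoubled = evens.first.map { $0 * 2 }
```

The author of such a function could later switch to a different underlying collection without changing the declared signature, which is the flexibility being asked for here.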

As @jayton says above:

(I would be very interested in knowing if it is most or if it is actually all... the introduction of compelling use cases might shift my view of the priorities)

This does not rule out the addition of arbitrary constraints in the future. But the syntax for them is going to be tricky (both the leading dot and the named placeholder syntax have significant downsides), and I don't believe working on this should block delivering the feature that does have a clear need for many developers.

Chris_Lattner3 · February 20, 2022, 5:42pm

I'm sorry but I don't follow swift evolution and haven't been a part of the core team since middle of last year. I don't have enough context to have an informed opinion here.

-Chris

Karl · February 20, 2022, 5:57pm

I would refer to Chris' comment, as quoted by @DevAndArtist, about building a language with bricks vs mortar, which I think is a good analogy.

Expressing it in terms of "doesn't go far enough" or "not common enough" does not fully capture the argument. It's about the principle of building a language from a collection of ad-hoc special cases, and whether or not that actually achieves the stated goal of being easier to learn (I think @rauhul's comment is illustrative):

No. The argument I've made against Collection<Int> and in favour of Collection<.Element == Int> is not just that the former fails to scale to other associated types, but also that it fails to scale to most other protocols.

The post from earlier contains a couple of examples to illustrate it. After the arguments have been laid out in such detail, I can only suggest that you read them. There is no "gist".

Okay, well I disagree. The examples have been laid out multiple times, by multiple contributors:

I don't believe we should allow n 'primary' associated types but not arbitrary constraints. They are almost identical, but with one slight difference: the former requires enormous source churn for library authors, who must now predict when this syntax should be allowed to work, all for the sake of a syntax which is so terse that it quickly becomes non-intelligible. The latter is more compatible, more flexible, and clearer at the point of the use.

Ben_Cohen · February 20, 2022, 6:21pm

See above – this example (as well as others like a DictionaryProtocol for dictionary types) was addressed in the proposal as examples of protocols specifying multiple primary associated types. This was a great example of feedback on the pitch giving a clear and important use case the proposal didn't cater to. The solution of allowing multiple primary associated types fits very neatly into the proposal, which was amended to cover this case.

The dividing line here is between associated types that are about the essence of the protocol (the "primary" associated types... perhaps we can find a better name, though that name doesn't actually appear in the language in the current proposal so does not need to be set in stone), versus associated types that are just part of the implementation mechanics of a protocol.

The primary types are the "essential" types. Element in the case of Collection, the Value and Error types in the case of Publisher, the Key and Value types for DictionaryProtocol, the Scalar value for SIMD. One way to spot these is that they almost always match the generic placeholders for the concrete implementations. So Array<Element>: Collection, Dictionary<Key,Value>: DictionaryProtocol, SIMD3<Scalar>: SIMD.
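A rough sketch of how that lines up, assuming the declaration syntax from this iteration of the pitch (available in Swift 5.7 or later). `DictionaryProtocol` is hypothetical and does not exist in the standard library, and `lookup(_:in:)` is an invented helper.

```swift
// Hypothetical protocol; Key and Value are named as primary associated types.
protocol DictionaryProtocol<Key, Value> {
    associatedtype Key: Hashable
    associatedtype Value
    subscript(key: Key) -> Value? { get }
}

// Dictionary's generic placeholders match the primary associated types,
// so the conformance needs no extra code.
extension Dictionary: DictionaryProtocol {}

// The use site mirrors the concrete type's spelling, Dictionary<String, Int>.
func lookup(_ key: String, in table: some DictionaryProtocol<String, Int>) -> Int {
    table[key] ?? 0
}

let table: [String: Int] = ["a": 1, "b": 2]
let found = lookup("b", in: table)
let missing = lookup("z", in: table)
```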

Then there are "supporting" associated types. Types that need to vary by implementation, and need to be used in the implementation of methods. Collection.Index is the most commonly encountered one. SIMD.MaskStorage would be another. There must be an associated type to link together the type used by startIndex, endIndex, subscript and so on. But for most use cases, it can remain opaque.

My contention is that constraints on these "secondary" types, which are less common but do come up when used on parameters, are going to be extremely uncommon, maybe even to the point of almost entirely unused, on opaque result types. It is counterexamples to this that I am looking for.

benlings · February 20, 2022, 8:26pm

I know that applying this change to standard library types is in ‘future directions’, but I have a question about whether a particular direction of evolution is feasible.

One particular usability problem I have with generics is this: collection methods often have to return a specialised type (eg various Iterator structs, Publishers.FlatMap etc in Combine, LazyPrefixWhileSequence and other lazy views). This means that if I want to understand what I can do with the return value, I need to then go to that type and look at what protocols it adopts, then also wonder if there is any additional API it offers on top of these protocols. It looks like this proposal would allow eg flatMap to return some Publisher<P, Failure>, which would be a lot clearer. Is this correct? If it is, this then leads to 2 questions:

1. Will this work from a performance point of view? My understanding is that the compiler is able to better optimise these cases because it knows the types. I’m guessing this isn’t an issue, because SwiftUI uses opaque return types, but would like to check.
2. Is it possible to migrate from the current situation with nominal return types to opaque return types? I realise that this would be source breaking so would need to wait for Swift 6 - but is it even possible?

Could the same notion of primary (or positional/indexed) types also be used for existentials? Eg have any Collection<Int> instead of (or even replacing) AnyCollection<Int>. I think this also would be helpful in replacing a bunch of nominal types where the names actually work to obscure what is important.

Ben_Cohen · February 20, 2022, 9:45pm

Yes, that's correct.

Mostly yes, it's fine from a performance point of view. It's a little complicated because it's mixed in with other topics, such as whether the type has a resilient ABI (in a framework "built for library evolution") and other things like inlinability, cross-module optimization.

Basically, if the compiler has visibility into the function returning the type, then it knows what it is even if you don't, and it can optimize accordingly. So for example if you use an opaque result type in your own source code, then the compiler can see the function implementation and so knows the real returned type. If you use one returned from an ABI-stable library (like the standard library) then it depends on whether the function has been marked inlinable, allowing the caller's compiler to see what type it is. This is the fairly standard burden on ABI-stable library authors to decide how much flexibility to trade off. If the function is inlinable, this means the author cannot, later, return a different type. If it isn't inlinable, then the type is entirely opaque and has to be manipulated via the witness tables. But this is standard for ABI-stable libraries and goes along with other things like whether the type is @frozen, whether other methods are inlinable etc.

For the most part, the standard library tends to mark most stuff frozen and inlinable ("fragile") because performance is critical. But higher-level frameworks tend to skew more in favor of resilience.
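The trade-off can be sketched as two declarations in a hypothetical library (both function names are invented, and the sketch assumes Swift 5.7 or later for the `some Collection<Int>` spelling). The signatures are identical; the only difference is whether the body is exposed to clients.

```swift
// Resilient choice: clients in other modules see only `some Collection<Int>`,
// so the author stays free to return a different concrete type later.
public func resilientSquares(upTo n: Int) -> some Collection<Int> {
    (1...n).map { $0 * $0 }
}

// Fragile choice: the body is visible to clients, so their compiler can see
// the concrete type and optimize for it, at the cost of that flexibility.
@inlinable
public func fragileSquares(upTo n: Int) -> some Collection<Int> {
    (1...n).map { $0 * $0 }
}

let resilient = Array(resilientSquares(upTo: 3))
let fragile = Array(fragileSquares(upTo: 3))
```

Within a single module the two behave the same; the distinction only matters across a module boundary built for library evolution.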

It is definitely possible when aligned with a Swift 6 language mode. The ABI consequences can be worked around with various techniques. Whether the source break is worth it should be discussed as part of a future pitch.

I think so, yes! A pitch of this very idea is being worked on now. Such a concept when combined with opening of existentials should indeed render AnyCollection<Int> unnecessary (though since it's ABI, we'll never actually get to delete the 2,500 lines found in ExistentialCollections.swift alas).
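For comparison, a small sketch of the two spellings, assuming a toolchain that supports constrained existentials (Swift 5.7 or later). The function names are invented for illustration.

```swift
// Today's approach: the standard library's hand-written type-erasing wrapper.
func totalErased(_ items: AnyCollection<Int>) -> Int {
    items.reduce(0, +)
}

// The existential spelling discussed above: no wrapper type required.
func totalExistential(_ items: any Collection<Int>) -> Int {
    items.reduce(0, +)
}

let numbers = [1, 2, 3]
let uniqueNumbers: Set<Int> = [4, 5]

let erasedTotal = totalErased(AnyCollection(numbers))  // explicit wrapping at the call site
let arrayTotal = totalExistential(numbers)             // values convert implicitly
let setTotal = totalExistential(uniqueNumbers)
```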

benlings · February 20, 2022, 10:44pm

Thanks for the detailed reply - it sounds very positive!

I think the parallel between some Collection<Int> and Array<Int> is a useful one. It was previously the case that if there were angle brackets after a type name, it was a generic type. This is now muddied a little (but still visible with the presence of some). However, I think the benefits to conciseness and not needing to jump back and forth when mentally parsing a declaration are probably worth it - particularly if it’s also possible to move away from unnecessary nominal return types.

ExFalsoQuodlibet · February 21, 2022, 8:34am

I can't argue with that.

I'm honestly getting a little confused about opaque types in general. I might be wrong but I see the use cases of opaque types (parameters, result, structural et cetera) as a sugary lightweight alternative to generic parameters. And I support the pitch also because I find the excellent example, very well put by @Karl

Karl:

func doSomething(_ items: some Collection<Int>)

// Oh no! We need Index to conform to Hashable!
// It's just a simple matter of... oh...

func doSomething<C>(_ items: C) where C: Collection, C.Element == Int, C.Index: Hashable

to not be particularly problematic: if I ever found myself in the position to teach this, I'd do it in 2 steps, first desugar the opaque parameter, then add the additional constraint.
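Those two steps might look like the sketch below, assuming Swift 5.7 or later for the sugared form. The function names and bodies are invented for illustration; only the signatures follow @Karl's example.

```swift
// Starting point: the sugared opaque parameter.
func countEvens(_ items: some Collection<Int>) -> Int {
    items.filter { $0.isMultiple(of: 2) }.count
}

// Step 1: desugar to an explicit generic signature with the same meaning.
func countEvensDesugared<C>(_ items: C) -> Int where C: Collection, C.Element == Int {
    items.filter { $0.isMultiple(of: 2) }.count
}

// Step 2: add the extra constraint, which is what allows Set<C.Index>.
func evenIndices<C>(_ items: C) -> Set<C.Index>
    where C: Collection, C.Element == Int, C.Index: Hashable {
    Set(items.indices.filter { items[$0].isMultiple(of: 2) })
}

let numbers = [1, 2, 3, 4]
let sugaredCount = countEvens(numbers)
let desugaredCount = countEvensDesugared(numbers)
let positions = evenIndices(numbers)
```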

So, if opaque types are a lightweight sugar form of the more complete and powerful generics signature, I'm not concerned about the expressivity limitation related to the usage of the new syntax (introduced by this pitch, together with an expressivity addition) in the return position of a function, because the actual missing feature is reverse generics, and this pitch doesn't preclude future work on that, while being very useful in itself.

John_McCall · February 21, 2022, 11:28am

A post was merged into an existing topic: Core team to form language workgroup

John_McCall · February 21, 2022, 11:22am

Holly has covered this to a significant degree, but let me try to lay out the grand vision for where this is going.

We have an opportunity for a synthesis across several different language features:

- We'd like generics to have stronger language connections to other things so that picking up generic programming feels more familiar, with better progressive disclosure of complexity.
- We'd like to be able to express more advanced constraints on existential types (protocol and protocol composition types) than what you can do with just &.
- We'd like to be able to express more advanced constraints on opaque result types than what you can do with just &.

The synthesis is quite simple. The existential and opaque result type cases require us to add a syntax to constrain the associated types of a protocol or protocol composition. This should, of course, be the same syntax for these two cases, other than the leading any or some keyword. Meanwhile, SE-0341 has introduced opaque parameter types, also written with some P, allowing the generic signature to be completely elided for some simple generic functions. By adopting the same syntax for the opaque-parameter some P as for the opaque-result some P, we can now write fairly advanced generic functions without an explicit generic signature. We only need a signature when there has to be a relationship between different components of the function (like if two collections need to have the same generic element type).
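A sketch of where that line falls, assuming the pitched syntax (Swift 5.7 or later); both function names are invented for illustration.

```swift
// Each constraint is self-contained, so no generic signature is needed.
func totalCount(_ numbers: some Collection<Int>, _ names: some Collection<String>) -> Int {
    numbers.count + names.count
}

// The two collections must share an element type. That relationship needs
// a name, so a generic parameter T appears in the signature.
func startAlike<T: Equatable>(_ a: some Collection<T>, _ b: some Collection<T>) -> Bool {
    a.first == b.first
}

let numbers = [1, 2, 3]
let names = ["a", "b"]

let combined = totalCount(numbers, names)
let sameStart = startAlike(numbers, 1...5)       // Array<Int> and ClosedRange<Int>
let differentStart = startAlike(numbers, 2...5)
```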

It's fair to ask: why does eliding the generic signature help to achieve the goal of building stronger connections to other parts of the language? Well, Swift has three ways to generalize over different types of values. One of them is subclassing, and that's inherently a limited form of generalization: it only works when you've got classes with a common superclass. The other two are generics and existential types. SE-0341 lets you express simple generics with almost the same syntax as existential types, just varying between any and some. The vision here is to generalize that to any sort of self-contained constraint, so you can take that and use it uniformly throughout the language, either as any or some. That is a very strong connection. And building that connection to existential types also makes generics much more familiar to programmers coming to Swift from one of our many peer languages with weak (if any) generic systems, where they're used to working with protocol types because that's the primary tool for generalization.

So that's the vision. It's an ambitious vision, in two main ways.

First, applying this same syntax to each of these features poses different implementation complexity.

- For generics, this is pure "sugar" (it can almost be handled in the parser), and so there is no special implementation complexity.
- For opaque result types, I believe it's not quite so simple, but it's still fairly straightforward.
- For existential types, there's quite a bit of plumbing and generalization that will be required in both the compiler and the runtime.

So the syntax will need to be implemented for each of these cases at different times, unless we're going to hold the whole thing up for a few releases until we've solved all the problems around generalizing existential types.

Second, generality often comes at a cost. Generic signatures are very general, but allowing for that generality makes generic declarations and constraints syntactically very different from everything else.

For example, the way that you constrain the element type of an opaque Collection type (C.Element == Int) is completely different from the way that you constrain the element type of a concrete collection type (Array<Int>). That generality means that code can also constrain the other associated types of Collection just as easily as they can Element, which is obviously necessary. However, it also creates a significant and unfamiliar gap that programmers have to overcome before they can write code that's generic over collections. So if the syntax we introduce for this looks like the contents of a where clause, we'll have achieved generality, but we'll also be missing a major opportunity to build a stronger connection to other things in the language and make generics feel more like a generalization of what programmers are already familiar with. (This is particularly true for collections, since many programmers coming from other languages are familiar with being able to write e.g. Collection<MyType>, and it seems very strange to them that Swift uses exactly this syntax but for some reason not for Collection.)

To make that concrete, it would be great for progressive complexity if you could simply take some existing function that works on a concrete generic type like Array and substitute Array out for some Collection:

func collect(widgets: Array<Item>) {
  for widget in widgets { ... }
}

func collect(widgets: some Collection<Item>) {
  for widget in widgets { ... }
}

So I think the concrete achievement of this vision has to include both:

1. a fully general syntax that can express any constraint that a generic signature with a single type parameter could, and
2. a syntax that specifically lets you constrain associated types by equality with a given type.

This pitch only addresses (2). Procedurally, I think it's okay for a proposal to carve out a narrow case like that, in the interests of making incremental progress, as long as it doesn't prevent the more general case from being addressed. I don't think that's happening here. I don't know that developing the general syntax is hard in the way that it's been described a few times in this thread, but if both syntaxes are indeed necessary, it's fine to start with the narrower one that has greater immediate impact on the standard library.

I was more worried about this narrower syntax being inadequate in the short term even for common cases until @Joe_Groff helped me realize just how expressive nested some types could be. For example, some Collection<some Comparable> is a perfectly fine way of expressing what otherwise would have been <C: Collection> where C.Element: Comparable.
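A small sketch of that equivalence, assuming nested `some` types are accepted in parameter position (Swift 5.7 or later). The function names and bodies are invented for illustration.

```swift
// Nested opaque types: "some collection of some comparable element".
func isSortedSugared(_ items: some Collection<some Comparable>) -> Bool {
    // Compare each element with its successor.
    zip(items, items.dropFirst()).allSatisfy { pair in pair.0 <= pair.1 }
}

// The same constraint written with an explicit generic signature.
func isSortedLonghand<C: Collection>(_ items: C) -> Bool where C.Element: Comparable {
    zip(items, items.dropFirst()).allSatisfy { pair in pair.0 <= pair.1 }
}

let ascending = [1, 2, 3]
let unordered = ["b", "a"]

let sugaredAscending = isSortedSugared(ascending)
let sugaredUnordered = isSortedSugared(unordered)
let longhandAscending = isSortedLonghand(ascending)
```

The sugared form cannot name the element type, so it suits functions like this one whose result does not mention it.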

John_McCall · February 21, 2022, 11:33am

Moderator note: I've moved Chris's post to the language workgroups thread, which I believe is where he wanted to post it, but couldn't because it was locked. Please take any discussion of that over there.

DevAndArtist · February 21, 2022, 12:02pm

An attempt at a potential middle ground. I strongly remain in the position that the <...> should be preserved for parameterized protocols as described in the generics manifesto. That is the reason why I think the golden middle should use a 'marker' on the associated types.

// today

// the order of those associated types becomes important regardless
// if we use a marker or shift everything into angle brackets
// during the protocol declaration
protocol P {
  @marker
  associatedtype A

  @marker
  associatedtype B

  associatedtype C
}

// no primary associated types constrained
some P
// both marked associated types are moved up into the angle brackets
some P<ConcreteA, ConcreteB>
// only A is constrained 
some P<ConcreteA, _>
// only B is constrained 
some P<_, ConcreteB>

// `C` is not constrainable with this syntax and would require a different syntax
// e.g.
func foo() -> <T: P> T where T.C == ConcreteC

In the future when we have the chance to revisit 'actual generic protocols' we can still combine those features into the same generic type parameter list like so:

// in the future

protocol Q<A> {
  @marker
  associatedtype B

  @marker
  associatedtype C

  associatedtype D = A
}

// the actual generic type parameter comes first, followed by
// marked associated types
some Q<ConcreteA, ConcreteB>

// primary associated type unconstrained
some Q<ConcreteA, _, _>
// or if no primary associated types need to be constrained,
// we'd simply slice off tail of that list and only set the necessary
// generic type parameters
some Q<ConcreteA>

func bar() -> <T: Q<ConcreteA>> T where T.D == ConcreteD

That way on the declaration side we won't mix the generic type parameter with the primary associated types.
Tradeoffs we have to take:
- declaration order of primary associated types becomes important
- we will have to use placeholder types in cases where we don't want to constrain a primary associated type
- the primary associated type cannot be visibly exposed without the generalized feature
- the ability to specify a conformance of an associated type might require the generalized feature in some cases
I do believe that it would be impossible to extend protocols in such manner if the current proposal would move primary associated types into the angle brackets on the declaration side of things. It would require some disambiguation between true generic type parameter and primary associated types and I think we could agree that R</* explicitly */ generic A, /* implicitly primary */ B> or R<A, /* explicitly */ primary B> markers inside the angle brackets wouldn't be ideal especially as we may want to introduce type labels in the future.

Would that be feasible?

bzamayo · February 21, 2022, 1:17pm

IMO this is more confusing than overloading the meaning of <> because you lose the symmetry in the source between declaration and use-site.

xwu · February 21, 2022, 1:21pm

I was very concerned with the use of <…> in the first version of this pitch because I was of the same opinion. However, as I said before, I am much happier with this iteration precisely because it doesn’t go half way.

Either the angle brackets should be reserved for “generic protocols” (which the Manifesto has stated are unlikely ever to be a part of Swift) and the angle brackets should not be used at either the declaration or use site for anything else; or they should not be reserved for such, and they should be declared similarly to how generic parameters are declared since they are used similarly as well.

I would be sad to see this design revert to the earlier inconsistency. Put another way, I would be concerned that a proposal to use angle brackets for protocols didn’t either adopt them for generic protocols or rule out their use for generic protocols; the idea that they might mean one or the other at some later point is to me the least ideal state of affairs.

DevAndArtist · February 21, 2022, 1:24pm

I would argue that symmetry shouldn't be the primary argument for burning a possible future feature. The main motivation for the proposed syntax in my opinion remains on the use-site and I just tried to present a possible middle ground which in my non-compiler engineer eyes would be a fair trade-off for both parties, those who want generic protocols and those who want to use primary associated types inside the angle brackets without explicit associated type names.

That said, I probably can somewhat accept the asymmetric feature design:

protocol P {
  primary associatedtype A
}

some P // equals `some P<_>`?
some P<ConcreteA>

bzamayo · February 21, 2022, 1:33pm

If you want to keep room for generic protocols with the standard brackets syntax, I would rather have it so you need to write something like Collection<associatedtype Element> to get the light-weight form for the feature in this pitch. How does that look to you?

DevAndArtist · February 21, 2022, 1:42pm

@bzamayo I do not follow. Honest question: How is this an improvement or any kind of disambiguation?

In my mind I could view multiple levels of sugar code:

// no sugar code here at all, we only introduce a single 'primary' marker 
protocol P<A> {
          ^~~ generic type parameter list, not primary assoc

  primary associatedtype B

  // non-primary
  associatedtype C
}

We now have to decide whether or not primary associated types must be always specified or not:

// less light weight (not proposed)
some P<ConcreteA, .B == ConcreteB>

// light-weight: could be sugar for above example
some P<ConcreteA, ConcreteB>

// * is this `some P<ConcreteA, _>` ???
// * or if the first example was a thing: `some P<ConcreteA, .B == _>` 
some P<ConcreteA> 
      ^~~~~~~~~~~ required generic type parameter

bzamayo · February 21, 2022, 1:45pm

Because in the future, when generic protocols exist, you wouldn't use the 'associatedtype' prefix to declare a generic parameter.