A new idea about generics

jeremyabannister · June 3, 2021, 11:21pm

I posted this on the "Improving the UI of generics" thread, but I wanted to also post it separately to hear some independent discussion on the matter without either bumping or clogging that thread:

I have an idea that I quite like so far but I could be convinced otherwise - I would love to hear what people think about it. I know that the idea is inspired by many ideas of others that I've read here on the forum, but I don't remember seeing exactly this as I'm proposing it. However, it is also possible that without knowing it I'm proudly presenting someone else's idea as if it were my own. Apologies if so.

The Idea

What if we expand the use of the generic <T: Constraint> syntax to be usable in every (or maybe almost every) situation where a normal type name can be used? The syntax would have the same meaning in the new context as it does in its current usage, namely that the type in question will be chosen by the caller (subject to certain constraints).

Simplest example:

// Current syntax
func doNothing <Value> (with value: Value)

// New syntax
func doNothing (with value: <Value>)

The Value type is introduced at the same time as being used in the type expression.

Details

The placeholder type names that are introduced in this way are accessible in the whole function signature and within the body of the function just like with the current generic syntax:

// Current syntax
func first <C: Collection> (of collection: C) -> C.Element

// New syntax
func first (of collection: <C: Collection>) -> C.Element

Any type names wrapped in angle brackets must be unique within the scope. This, for example, is an error:

// Wrong
func assign (_ newValue: <Value>, to destination: inout <Value>) // Error - invalid redeclaration of `Value`

Exactly one usage of Value must be wrapped in angle brackets, and everywhere else it is referenced by name like any other type. Generic constraints can be applied either within the angle brackets or by way of a where clause.

Thus, the correct ways to write that function are:

func assign (_ newValue: <Value>, to destination: inout Value)
func assign (_ newValue: Value, to destination: inout <Value>)

func assign <Value> (_ newValue: Value, to destination: inout Value)

If an angle bracket type declaration appears in the return type position that does not mean that it is a reverse generic. It is still a regular generic type, in the sense that the caller chooses the return type.

All of these signatures are equivalent:

// Current syntax
func echo <Value> (_ value: Value) -> Value

// New syntax
func echo (_ value: Value) -> <Value>
func echo (_ value: <Value>) -> Value

The order in which the types are declared within the function signature doesn't matter, in the sense that the declared types can be referenced in earlier parameters:

// Old syntax
func feed <Recipient: Eater> (_ food: Recipient.Food, to recipient: Recipient) -> Recipient.FormOfThanks

// New syntax
func feed (_ food: Recipient.Food, to recipient: <Recipient: Eater>) -> Recipient.FormOfThanks

I find the reduction of angle-bracket-blindness in the latter relative to the former fairly significant.
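For comparison, here is a minimal runnable sketch of the "old syntax" version in current Swift. The thread never defines `Eater`, so the protocol's shape, its `eat(_:)` requirement, and the `Cat` type are all assumptions made up for illustration.

```swift
// Hypothetical protocol: the thread uses `Eater` without defining it.
protocol Eater {
    associatedtype Food
    associatedtype FormOfThanks
    func eat(_ food: Food) -> FormOfThanks
}

// Hypothetical conforming type, for illustration only.
struct Cat: Eater {
    func eat(_ food: String) -> String {
        return "purr for \(food)"
    }
}

// Current syntax: `Recipient` is declared up front, then referenced three times.
func feed<Recipient: Eater>(_ food: Recipient.Food, to recipient: Recipient) -> Recipient.FormOfThanks {
    return recipient.eat(food)
}

// The caller picks `Recipient` (here `Cat`) simply by passing a value.
let thanks = feed("tuna", to: Cat())
```

Under the proposal only the signature of `feed` would change; the body and the call site would stay as they are.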

It seems reasonable to me to allow this syntax to be nested in a type expression:

// Old syntax
func dropLatterHalf <T> (of array: [T]) -> [T]

// New syntax
func dropLatterHalf (of array: [<T>]) -> [T]
func dropLatterHalf (of array: [T]) -> [<T>]

Properties

Given that <T> means a type that will be chosen by the caller, how do we interpret this?:

let foo: <T> = 7

This is effectively the same as this:

typealias T = Int
let foo = 7

in the sense that after using <T> as the type of foo we can then reference T for the rest of the scope:

let foo: <T> = 7
let maximumInteger = T.max // This is `Int.max`

(I can't quite put my finger on it at the moment but I have a feeling that there's something about this use-case that could prove extremely useful for writing and especially for maintaining unit tests).

If there is a constraint included in the type declaration then it is enforced at compile time as always:

let a: <T: Numeric> = 1.4 // Ok
let b: <T: Numeric> = "string" // Error

let c: <T: Numeric>
switch something {
case .oneThing: c = 1.2
case .anotherThing: c = 1.9 // Ok - both are `Double`
}

let d: <T: Numeric>
switch something {
case .oneThing: d = 1.5
case .anotherThing: d = Int(7) // Error: mismatched types
}

This would allow computed properties to have generic return types:

var anyKindOfSevenYouWant: <T: ExpressibleByIntegerLiteral> {
    .init(integerLiteral: 7)
}
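Current Swift does not allow generic computed properties, but a generic function with no parameters gets close. This is only a sketch of the nearest existing equivalent, not the proposed feature itself; the constant names at the call site are illustrative.

```swift
// Nearest current-Swift equivalent: a generic function instead of a property.
func anyKindOfSevenYouWant<T: ExpressibleByIntegerLiteral>() -> T {
    // The literal is typed as `T`, whatever the caller asked for.
    return 7
}

// The caller chooses the return type through a type annotation.
let intSeven: Int = anyKindOfSevenYouWant()
let doubleSeven: Double = anyKindOfSevenYouWant()
let byteSeven: UInt8 = anyKindOfSevenYouWant()
```

The proposed property form would drop the parentheses at the call site but otherwise rely on the same caller-side inference.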

Existentials

This syntax would naturally allow us to unwrap existentials. For example:

let boxedUpValue: some Equatable = ...
let value: <Value> = boxedUpValue
if let dynamicallyCasted = someOtherValue as? Value {
    print(value == dynamicallyCasted)
}

I'm thinking where clauses would be allowed on any declaration that contains a type placeholder declaration:

let existential: some Equatable = ...
let value: <T> = existential where T: Equatable

I suppose that in many cases the generic constraint on the type declaration can be implicit:

let existential: some Equatable = ...
let value: <T> = existential // T is known to conform to `Equatable`
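For reference, a sketch of how the same unwrapping can be approximated without the new syntax. It assumes Swift 5.7 or later, where an `any` value passed to a generic parameter is opened implicitly (SE-0352); the helper name `isEqual(_:to:)` is made up for this example.

```swift
// Hypothetical helper: `Value` is bound to the existential's underlying type
// when an `any Equatable` is passed in (requires Swift 5.7 or later).
func isEqual<Value: Equatable>(_ value: Value, to other: Any) -> Bool {
    // Dynamic cast to the opened type, then a normal `==` comparison.
    guard let other = other as? Value else { return false }
    return value == other
}

let existential: any Equatable = 42
let sameValue = isEqual(existential, to: 42)        // underlying types match
let differentType = isEqual(existential, to: "42")  // cast to Int fails
```

The proposed `let value: <T> = existential` would give the opened type a name in the current scope instead of requiring a separate generic function.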

Alternative Generic Type Syntax? (Controversial and not to be taken too seriously)

Here's another thought (and this one's a little bit out there) - could using one of these within the type expression of a stored property of a type be interpreted as a new generic parameter of the enclosing type?

struct Queue {

    private(set) var elements: [<Element>]
}

would be equal to:

struct Queue <Element> {

    private(set) var elements: [Element]
}

it could also be done like this:

struct Queue {

    private var _privateDictBecauseWhoKnowsWhy: [Int: <Element>]

    var elements: [Element] {
        Array(_privateDictBecauseWhoKnowsWhy.values)
    }
}

Either way, the Queue type would be usable as a normal generic type (e.g., Queue<Int>).

The Result type for example could then be declared like this:

enum Result {
    case success (<Success>)
    case failure (<Failure: Error>)
}

I suppose the proper order of generic type parameters for a type could be determined simply by the order in which they appear in the type declaration.
This has the order A then B:

struct Foo {
    var a: <A>
    var b: <B: Collection>
}

let _: Foo<Int, Array<Bool>> // Ok
let _: Foo<Array<Bool>, Int> // Error, the Collection must come second

Extensions On Any

Lastly, perhaps this would also be the right syntax for extending any type (if that's actually a good idea in the first place):

extension <T> {
    func mutated (by mutation: (inout Self)->()) -> Self {
        var copy = self
        mutation(&copy)
        return copy
    }
}

One obvious question here is regarding the usage of the letter T, when we could equivalently have written:

extension <AnythingWeWantToWrite> {
    func mutated (by mutation: (inout Self)->()) -> Self {
        var copy = self
        mutation(&copy)
        return copy
    }
}

and achieved the same result. I suppose that the name chosen is nothing more than a typealias for Self which is declared in the scope of that extension, so the name is chosen by the programmer the same way a more descriptive local typealias is chosen by the programmer, as it won't affect anything but the way his or her own code reads. Writing this paragraph then gave me another idea: when we don't feel the need for a new typealias T = Self, could we write it like this?:

extension <_> {
    func mutated (by mutation: (inout Self)->()) -> Self {
        var copy = self
        mutation(&copy)
        return copy
    }
}
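Since current Swift cannot extend every type, the closest working approximation is a free generic function. A minimal sketch, with `Value` as an arbitrarily chosen parameter name:

```swift
// Free-function stand-in for the proposed `extension <T>` method.
func mutated<Value>(_ value: Value, by mutation: (inout Value) -> Void) -> Value {
    var copy = value
    mutation(&copy)
    return copy
}

let original = [1, 2, 3]
// `original` is left untouched; only the copy is mutated.
let extended = mutated(original) { $0.append(4) }
```

The extension form would let this read as `original.mutated { $0.append(4) }`, which is the ergonomic gain being discussed.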

ensan-hcl · June 4, 2021, 12:19am

I don't like having type parameters scattered all over the function. If we want named type parameters that can be used anywhere, it's better to declare all of them up front, and the shorthand should work only when there is no need for a name, so that it doesn't cause any additional complexity.

I disagree. What happens in this case? Why isn't T used as a type parameter of Queue? For the same reason as with function type parameters, they should be gathered in one place; otherwise it's hard to read (though easy to write).

struct Queue {
    func someFunction(_ value: <T>) {}
}

jeremyabannister:

Given that <T> means a type that will be chosen by the caller, how do we interpret this?:
let foo: <T> = 7
This is effectively the same as this:
typealias T = Int
let foo = 7

I cannot understand what changed here. In the generics manifesto, syntax similar to the former is treated as a 'generic constant', and it works the same as the latter. I think we should treat these two things as the same feature.

And although there is no type name, a feature similar to what you expect from the former is already available.

let foo: some Numeric = 7

Therefore, it is a feature that reverse generics should handle. I think it should look like this.

let foo: <some T> = 7

jeremyabannister:

This syntax would naturally allow us to unwrap existentials. For example:

let existential: any Equatable = ...
let otherExistential: any Equatable = ...
let value: <T: Equatable> = existential
if let otherValue = otherExistential as? T {
    if value == otherValue {
        // Do something
    }
}

I agree that opening existentials is a feature worth adding. But for the same reason, I don't think the syntax let value: <T: Equatable> = existential is appropriate for it.

jeremyabannister:

Lastly, perhaps this would also be the right syntax for extending any type (if that's actually a good idea in the first place):
extension <T> {
    func mutated (by mutation: (inout Self)->()) -> Self {
        var copy = self
        mutation(&copy)
        return copy
    }
}

This is mentioned in the generics manifesto and was previously discussed as parameterized extensions here.

Paul_Cantrell · June 4, 2021, 1:25am

The initial idea (being able to introduce type variables in situ instead of all at the start of the function declaration) is certainly an interesting one. The func f<T> syntax is a confusing one for first-timers. I would be curious to see, in a little informal user study of people new to generics, whether func f(x: <T>) makes any more sense to them.

My guess is it would only help a little, and it would take something more fluent like, say…

func f(x: T) forAnyType T

…to really help. But the value of speculation here is limited; actually showing the syntax alternatives to people and observing their reaction would be illuminating.

Alejandro_Martinez · June 4, 2021, 7:59am

My first impression may just be because it is a new syntax, but let me express it anyway :)

This is a nice idea and property to have. It works well in simple examples but I think it muddies the waters for anything more complex.

The first case where the improvements break for me is here:

jeremyabannister:

// Wrong
func assign (_ newValue: <Value>, to destination: inout <Value>) // Error - invalid redeclaration of `Value`
Exactly one usage of Value must be wrapped in angle brackets, and everywhere else it is referenced by name like any other type.

This makes total sense. But I feel that the distinction between Type and <Type> is diluted to the point that the error may seem surprising.

jeremyabannister:

The order in which the types are declared within the function signature doesn't matter, in the sense that the declared types can be referenced in earlier parameters:
// Old syntax
func feed <Recipient: Eater> (_ food: Recipient.Food, to recipient: Recipient) -> Recipient.FormOfThanks

// New syntax
func feed (_ food: Recipient.Food, to recipient: <Recipient: Eater>) -> Recipient.FormOfThanks
I find the reduction of angle-bracket-blindness in the first relative to the second fairly significant.

And here is where I feel we lose any benefit. The nice property that "the Value type is introduced at the same time as being used in the type expression" is lost here. It feels very weird to me to start reading a declaration and see a type that is not declared anywhere yet. Maybe it's just a matter of getting used to it, but at this point it feels like we are not gaining much.

As a general note I would also like to point out that having so many different ways of doing the same thing is not always desirable.

As a side note, this is something I would love Swift to handle sooner rather than later. I wish there was no distinction between nominal and non-nominal types. It gets very frustrating in a lot of cases.

Overall I think it is a nice exploration of how we could improve type system syntax. I'm just not sure it is an appropriate take on it. But as I said, first impressions!

jeremyabannister · June 4, 2021, 12:38pm

Do you see it as any more problematic than this, which we already deal with?:

struct Moon {
    var phase: Phase

    enum Phase {
        case new, crescent, full
    }
}

or even:

struct Moon {
    var phase: Phase
}

extension Moon {
    enum Phase {
        case new, crescent, full
    }
}

jeremyabannister · June 4, 2021, 12:51pm

ensan-hcl:

What happens in this case? Why isn't T used as a type parameter of Queue? For the same reason as with function type parameters, they should be gathered in one place; otherwise it's hard to read (though easy to write).
struct Queue {
    func someFunction(_ value: <T>) {}
}

As I indicated, I'm not at all sold on the idea of using this syntax to implicitly declare type parameters. I could imagine it being very problematic...

However, the answer to your specific question is that in your example T would not be treated as a type parameter of Queue because it does not appear in the type of a stored property of Queue.

The reason for this is that while the return type of a computed property or function can be chosen by the caller in the moment of calling (as in the var anyKindOfSevenYouWant example that I gave above), and of course the inputs to functions and subscripts can be chosen by the caller in the moment of calling, in the case of a stored property it only makes sense for the caller to choose the return type at the moment of creating the enclosing type, not in the moment of calling the property. Choosing the return types of some properties of an enclosing type at the moment of creation of the enclosing instance is what a type parameter is.

ensan-hcl · June 4, 2021, 12:58pm

jeremyabannister:

This is effectively the same as this:
typealias T = Int
let foo = 7
in the sense that after using <T> as the type of foo we can then reference T for the rest of the scope:
let foo: <T> = 7
let maximumInteger = T.max // This is `Int.max`

So now <T> appears in a stored property. Is it treated as a type parameter of Queue? Or does it implicitly become Int? How about <U>?

struct Queue {
    let foo: <T> = 7
    let bar: <U>
}

jeremyabannister · June 4, 2021, 12:59pm

This:

struct Queue {
    let foo: <T> = 7
    let bar: <U>
}

desugars to this:

struct Queue <U> {
    typealias T = Int
    let foo: T = 7
    let bar: U
}

ensan-hcl · June 4, 2021, 1:23pm

I see. It's consistent in desugaring. How about this point?

ensan-hcl:

And although there is no type name, a feature similar to what you expect from the former is already available.
let foo: some Numeric = 7
Therefore, it is a feature that reverse generics should handle. I think it should look like this.
let foo: <some T> = 7

If we don't use reverse generics for this, there would be two really similar ways to do the same thing.

(EDIT)

What I want to say is, it is far more readable to use these things like this:

// these two things are equal
let π: <T: ExpressibleByFloatLiteral> = 3.14
var π: <T: ExpressibleByFloatLiteral> {
    return 3.14
}

// these three things are equal
let π: <some T: ExpressibleByFloatLiteral> = 3.14
var π: <some T: ExpressibleByFloatLiteral> {
    return 3.14
}
var π: some ExpressibleByFloatLiteral {
    return 3.14
}

jeremyabannister · June 4, 2021, 3:45pm

Another slightly more indirect way that this change might make generics more accessible to newcomers is that generic parameters will likely have more expressive names when they don't have to be repeated:

func receive (_ input: <Input>)

vs.

func receive <T> (_ input: T)

I remember that for at least the first year of my Swift journey my fleeting interactions with the concept of Generics in Swift left me with the notion that T was some built-in special type name that did... I didn't know. The moment that I understood that T was an arbitrarily chosen demo name and that the generic parameter name was mine to choose I began to understand more technically what generics were and how they worked. In the end I found it quite simple.

How absurd! One main barrier to entry for generics that kept me at arm's length for a time was simply the confusion that was caused for me by the ubiquity of the name T in the mainstream demonstration of generic code.

Paul_Cantrell · June 4, 2021, 3:55pm

Yes, T is really confusing. There’s nothing that stops us from using better names now! In fact, IIRC, getting people to use meaningful names for type parameters has been a personal mission of someone on the Swift core team (Joe Groff, I think maybe?).

jeremyabannister · June 4, 2021, 3:56pm

I've thought a bit more about it, and I have some new understandings, but I still don't have it clear and I don't have time to get it totally clear at the moment.

One main piece of the answer I'm coming to though is that <Name> is not a tool for erasure. I'll give a quick example from current Swift:

struct A <T: Numeric> { }
func foo <T: AdditiveArithmetic> (a: A<T>) {
    // Inside the body of this function T is known to be Numeric
}

What this example demonstrates is that the generic signature of the function does force T to at least conform to AdditiveArithmetic, but it does not "erase" T down to the level of AdditiveArithmetic - T has the maximum level of detail that can be known about it.
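To make that concrete, here is a small variant of the example that compiles in current Swift. The stored property `value` and the function name `squared` are additions for illustration and are not part of the original snippet.

```swift
// `value` is added only so there is something to compute with.
struct A<T: Numeric> {
    var value: T
}

// The signature only asks for AdditiveArithmetic...
func squared<T: AdditiveArithmetic>(_ a: A<T>) -> T {
    // ...yet `*` (a Numeric requirement) is available, because the compiler
    // infers `T: Numeric` from the use of `A<T>` in the signature.
    return a.value * a.value
}

let nine = squared(A(value: 3))
```

The explicit constraint sets a lower bound on what is known about `T`; it does not cap it.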

Therefore, these two lines do different things:

let foo: <T: Numeric> = 7
let bar: some Numeric = 7

foo is known to be of type Int and has not been erased in any way, while bar, by way of the some keyword, has been meaningfully erased:

func takesInt (_ int: Int) { }
takesInt(foo) // 👍
takesInt(bar) // Error
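The difference can be reproduced in current Swift. In this sketch a plain typealias stands in for the proposed `<T: Numeric>` spelling, following the desugaring described earlier in the thread, and `makeBar` is a hypothetical helper that produces the opaque value.

```swift
// Stand-in for `let foo: <T: Numeric> = 7`, using the desugaring from the thread.
typealias T = Int
let foo: T = 7

// Hypothetical helper returning an opaque `some Numeric`.
func makeBar() -> some Numeric {
    let value: Int = 7
    return value
}
let bar = makeBar()

func takesInt(_ int: Int) -> Int {
    return int
}

// `foo` is still statically an Int, so this compiles.
let accepted = takesInt(foo)

// `bar` hides its underlying type, so the next line would be rejected:
// takesInt(bar)
```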

Note:

In this case the Numeric constraint doesn't "do" anything:

let foo: <T: Numeric> = 7

but it is permitted the same way an explicit type is permitted:

let bar: Double = 1.5

Sometimes it is useful to be able to put an explicit type for the sake of guaranteeing a future compilation error if certain conditions change that you don't expect to change. For the same reason, we might want to add the Numeric constraint just to ensure that in the future we can freely change:

let foo: <T: Numeric> = 7

to:

let foo: <T: Numeric> = 7.1

but not:

let foo: <T: Numeric> = "will cause error"

jeremyabannister · June 4, 2021, 4:03pm

I personally lean toward meaningful type names despite heavier angle-bracket clutter in my own code. Nonetheless, the cost of adding additional characters to our generic function parameter names is higher when the name has to be forward declared:

func feed <Recipient: Eater> (_ recipient: Recipient)

vs:

func feed (_ recipient: <Recipient: Eater>)

The former might sadly tempt someone to write this instead:

func feed <Rcpt: Eater> (_ recipient: Rcpt)

ensan-hcl · June 4, 2021, 4:40pm

Yeah, but this is one of the initial motivations for opaque result types (reverse generics), isn't it?

clients of EightPointedStar could end up relying on its exact return type, making it harder if the author of EightPointedStar wants to change how they implement its shape, such as if a future version of the library provides a generic NPointedStar primitive
(from: SE-0244 Opaque Result Types)

Also, you said takesInt(foo) // 👍, but that makes it impossible to change the type freely. You can easily come to depend on the exact type, so a change of the exact type can break existing code.

jeremyabannister:

Therefore, these two lines do different things:
let foo: <T: Numeric> = 7
let bar: some Numeric = 7
foo is known to be of type Int and has not been erased in any way, while bar , by way of the some keyword, has been meaningfully erased:

Yes, I agree with this. I'm sorry for my exaggeration of 'two really similar ways to do the same thing'.

But I haven't found any really useful application of this. Is there any reason to have a shorthand for typealias using the glorious <T>? Using it for normal generics makes much more sense and is less confusing.

Syre · June 4, 2021, 7:11pm

jeremyabannister:

This:

struct Queue {
    let foo: <T> = 7
    let bar: <U>
}

desugars to this:

struct Queue <U> {
    typealias T = Int
    let foo: T = 7
    let bar: U
}

I think I would be strongly opposed to this, as it would remove the symmetry that we have right now between the actual type and the generic declaration.

If you inspect a variable and see that its type is Queue<Int>, I think it makes perfect sense that the declaration be struct Queue<T>.

That being said, the same symmetry concern doesn't apply to generic functions afaict, so I might be okay with that. However, I feel that there is a usability benefit both in function and type declarations in having a defined and single place to put the generic type declarations.

struct Queue {
    // ... stuff ...

    // Many, many lines into the declaration of Queue I now decide that Queue is generic?
    var elements: [<Element>]
}

This feels unintuitive to me. If I didn't write Queue, and was just examining it with fresh eyes, I think I would want to know up front that Queue is actually Queue<Element>.

bdriscoll · June 4, 2021, 8:12pm

Personally, I am not a fan of this proposal. Among other things:

If you can use a named generic parameter in a signature before introducing it, you lose the intuitive property of being able to understand your code and its lexical scoping rules top-to-bottom and left-to-right.
As brought up earlier in the thread, things get particularly confusing w.r.t. desugaring generic parameters on struct declarations.
It adds a lot of additional syntactic complexity in a way that diverges significantly from other popular imperative languages that feature generics (more for a beginner to learn) for what feels like relatively little convenience.

justinmilo · June 4, 2021, 9:19pm

I like this idea a lot.

I always get angle bracket blindness at the start of the generic signature. This really helps with that.

It reminds me of moving from a language where all variables had to be listed at the start of the scope to one where variables could be declared anywhere within the body.
+1

jayton · June 5, 2021, 6:56am

Generics are analogous to functions at the type level; this sounds like a function where arguments can be declared in the body.

func add5() -> Int {
    return <x> + 5 // x is an argument whose type is inferred. Yay?
}

sveinhal · June 5, 2021, 8:51am

While I think this idea might have merit, I am strongly against it for the simple reason that we already have a generics syntax, and I do not want proliferation of new orthogonal syntax for existing features.

Finagolfin · June 5, 2021, 5:01pm

Thanks for raising the issue, I've always felt Swift made a mistake by adopting the hackneyed angle bracket syntax (does it have a way out of the numerous issues caused by this in C++? I haven't looked). In fact, the Zig approach of just making generics a subset of compile-time meta-programming (skip down to the section on generic data structures to see how it works) seems a much cleaner approach, unburdened by the past (I'm not a compiler engineer and don't know if that approach would limit generics in some way, would think not).

Of course, Swift has no compile-time meta-programming, even though a limited version has been mooted. I can understand why it has been delayed, as understandability and tooling of such metaprogramming in every other language is a mess, but at some point you have to buckle down and add it, as even C has a botched approach with its preprocessor.

I recommend this 2019 cross-language overview to anyone interested in the topic.