[Discussion] Eliding `some` in Swift 6

tem · June 12, 2022, 9:43am

That would be equivalent to <T : P>(T) -> () and that's not possible (generic closures).
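For anyone following along, a minimal sketch of the limitation, assuming Swift 5.7 (the names `P`, `A`, `describe`, and `describeAny` are made up for illustration). A named function can be generic over `T: P`, but a closure value has one concrete function type, so the nearest available spelling takes an existential:

```swift
protocol P { var name: String { get } }
struct A: P { let name = "A" }

// A named function can declare its own generic parameter.
func describe<T: P>(_ value: T) -> String {
    "generic: \(value.name)"
}

// A closure cannot declare a generic parameter of its own,
// so the closest equivalent takes an existential argument instead.
let describeAny: (any P) -> String = { value in
    "existential: \(value.name)"
}
```

Both `describe(A())` and `describeAny(A())` are called the same way; only the function version keeps the concrete type inside its body.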

GreatApe · June 12, 2022, 11:15am

Ah yes, I remembered that after I posted. But that could be made to work I suppose? Never understood why it doesn’t.

anon9791410 · June 12, 2022, 1:30pm

As much as I don't like the removal of that feature, the decision, mirrored with capture lists, demonstrates that the language designers understood what people want nearly all of the time (let).

Implicit some demonstrates the same. But nobody is suggesting an annoying verbose workaround when choosing any like we do when choosing var.

final class C {
  var property = 0
  
  func function() -> Int {
    let renamed = property,
        property = property,
        betterNameThanFive = 5

    return property + betterNameThanFive
  }

  var closure: () -> Int { {
    // The braces can only impose an implicit `let`,
    // matching the function above.
    // `var` is not an option.
    // `property` instead of `property = property` is appreciated though. 🙂
    [ renamed = property,
      property,
      betterNameThanFive = 5
    ] in

    property + betterNameThanFive
  } }
}

toph42 · June 12, 2022, 3:00pm

I’m not sure I understand what’s going on here (and I didn’t know you could do this in capture lists—that’s cool) but wouldn’t the compiler suggest your renamed = property be changed to _ = property because renamed is never used?

ksluder · June 12, 2022, 3:12pm

“Baffling” seems like a strong word. It will be unfamiliar, and different from every other language with interfaces I can think of, but the fixit will probably be very clear: change the return type to any Collection. That’s going to be a required change in Swift 6 regardless of this discussion.

GreatApe · June 12, 2022, 3:56pm

Well that is sort of my point. It's not clear that they want any Collection, it could also be that they do want some Collection, but didn't realize there was this limitation. I think we all experienced that in the first days of SwiftUI.

But I have to ask again, what was the point of introducing some only to immediately elide it?

David_Ungar2 · June 12, 2022, 4:17pm

I would bring the tooling into the discussion. For instance, could Xcode statically analyze for the use-cases you mention, and on-demand supply a recommendation with an explanation? Maybe as a completion suggestion? How would such a facility affect the tradeoffs for the various language proposals?

Ben_Cohen · June 12, 2022, 5:08pm

I'm sorry, I used an overly understated word. I meant they are inappropriate analogies – I don't think they demonstrate the point you are looking to make.

These keywords are there to mark the potential impact to control flow of the function being called – either by potentially exiting in the case of try, or suspending execution and allowing re-entry in the case of await. They appear at each point because that is important information at each point (though there is a compromise that this is only needed once on a single line when multiple places in a statement might need marking). The repetition does not have some educative benefit.
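As a small sketch of that single-line compromise (the `ParseError`, `parse`, and `sum` names are hypothetical), one `try` at the front of an expression covers every throwing call to its right:

```swift
struct ParseError: Error {}

// Throws when the text is not a valid integer.
func parse(_ text: String) throws -> Int {
    guard let value = Int(text) else { throw ParseError() }
    return value
}

// A single `try` marks the whole expression, even though
// both calls to `parse` can throw.
func sum(_ a: String, _ b: String) throws -> Int {
    try parse(a) + parse(b)
}
```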

It is true that sometimes they mark points where this impact to control flow cannot matter. I think it would be very interesting to explore being able to elide them in single-statement function bodies, for example (a discussion for another thread :). But in the general case, they are needed every time because it is impractical to only require them when they matter.

This is very different from this discussion, where some does not have an equivalent situation – some is not marking "beware, your choice here has significant consequences". By taking a generic argument, you are not closing off any important avenues of expressivity for either the caller or the implementer. Whereas by taking any, you are.

The arguments being made here are mainly that requiring some teaches the user of the distinction. That is not what try and await are for, and neither should it be what some is for.

Joe discusses this when he talks about the limitations of value-level abstraction in the Improving the UI of generics post.

Receiving or returning a generic argument preserves type information. If you take some Collection, you have access to the types representing things like the Index and Iterator that any Collection cannot provide. Over time, we may thin away at this advantage through more opening even in open-coded use, but no amount of compiler cleverness will eliminate the difference completely.
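A rough sketch of what preserving type information buys the implementer, assuming Swift 5.7 and a hypothetical helper named `middleElement`. Inside the body, the collection's `Index` type is known, so an index computed from `startIndex` can be handed straight back to the subscript:

```swift
// With `some Collection<Int>`, the concrete Index type is available in the body.
func middleElement(of items: some Collection<Int>) -> Int? {
    guard !items.isEmpty else { return nil }
    // `middle` has the collection's own Index type, so it can be used to subscript.
    let middle = items.index(items.startIndex, offsetBy: items.count / 2)
    return items[middle]
}

// With `any Collection<Int>`, the index type is erased, so (as far as I know,
// as of Swift 5.7) an index obtained this way cannot be passed back
// to the subscript without first opening the existential.
```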

Taking a generic argument is near-indistinguishable* to the caller from taking an existential argument. But within the implementation of the function, it provides many benefits. This is why, for function arguments at the very least, it's the clear favorite to be the default, and not just a peer.
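To illustrate the caller's side, a minimal sketch with made-up names (`Shape`, `Square`, `areaGeneric`, `areaExistential`), assuming Swift 5.7. The two call sites look the same; the difference is only in what the function body gets to know about the argument:

```swift
protocol Shape { func area() -> Double }

struct Square: Shape {
    let side: Double
    func area() -> Double { side * side }
}

// Generic argument: the body sees one specific conforming type.
func areaGeneric(of shape: some Shape) -> Double { shape.area() }

// Existential argument: the body sees a box that may hold any conforming type.
func areaExistential(of shape: any Shape) -> Double { shape.area() }

// The call sites are written identically.
let a = areaGeneric(of: Square(side: 2))
let b = areaExistential(of: Square(side: 2))
```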

The benefits are more mixed for function return types and storage declarations. Still, I think the benefits favor some being the default, and any being the one that needs explicit marking. My hope would be we agree at least for function arguments, and then move on to talk about those other cases.

* I know you can write code where the distinction is observable; but those examples are not important enough to be material to the discussion IMO.

Ben_Cohen · June 12, 2022, 5:18pm

I responded to a similar point above. The function author will need to learn the distinction between some P and any P either way. The great benefit of learning it at the point when they try to return two different types is that they now have some context for why the distinction is important. When defining a function they will not have this context. They will need to be taught the distinction between returning any and some before it matters, and the distinction will likely seem footling to them. Just more nags from the Swift compiler.

Once they are actually returning two types, the compiler can give a much more clear description of the problem, as well as offer options for how to resolve it, either by changing the function signature, or by returning the same type at both places. This is both better from an education point of view and, in the vastly more common cases where the distinction is not important, from a ceremony-reduction point of view.
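A minimal sketch of that moment, with hypothetical names (`Vehicle`, `Car`, `Bike`) and assuming Swift 5.7. An opaque result compiles as long as every return produces the same underlying type; once two different types are returned, the signature has to change to an existential (or the body has to return a single type):

```swift
protocol Vehicle { var wheels: Int { get } }
struct Car: Vehicle { let wheels = 4 }
struct Bike: Vehicle { let wheels = 2 }

// Opaque result: fine, because there is only one underlying type.
func defaultVehicle() -> some Vehicle { Car() }

// Two different underlying types: spelling this result as `some Vehicle`
// is rejected by the compiler, so the result is an existential instead.
func vehicle(motorized: Bool) -> any Vehicle {
    if motorized { return Car() } else { return Bike() }
}
```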

That said, even without the benefit of some elision being better for learning, we should be very wary of baking learning goals into the language by way of increased ceremony. Optimizing for the first time, but not the thousands of subsequent times, a feature is used has significant negative consequences for most Swift developers.

The proposal for introducing existential any describes why:

The new any syntax will be staged in over several major Swift releases. In the release where any is introduced, the compiler will not emit any warnings for the lack of any on existential types. After any is introduced, warnings will be added to guide programmers toward the new syntax. Finally, these warnings can become errors, or plain protocol names can be repurposed, in Swift 6.

Since in earlier versions of Swift, plain protocol names already have a meaning, this must be a multi-step process. Hence we need to discuss two things: what would the ideal meaning of bare protocols be, and if the answer is some P, how do we get there (with a side order of is getting there worth it if getting there breaks source).

GreatApe · June 12, 2022, 5:36pm

I don't actually think this makes much sense. On the one hand, you speak about there being some and any and people need to learn the distinction, which by the way I think is a lot easier if that distinction is explicit. If some is not spelled out it's pretty hard to learn, it won't be reified in the learner's mind, and they can't even google it.

I mean, new learners will not think of it as "elided some", for them it will just be "using protocols", and for some obscure reason the compiler sometimes forces them to add the word some. I submit that it's far less confusing and easier to understand if that distinction is made explicit and ever present. Just like the difference between existential P and conforming to P confused me for years.

And then on the other you're saying that mostly the distinction is not important:

I can't think of any cases at all where it isn't, at least conceptually. A box that contains a type erased instance is very different from a specific instance that conforms to a protocol. And we're in the middle of mandating the any spelling because it was considered so important. How can that importance have faded so quickly, before the transition to mandatory any is even complete?

On the other hand, if you mean that the distinction is not important because it will compile either way, just with a different meaning, then the arguments about "implicitly doing the right thing" fall flat, since you're then saying that it doesn't matter if people use any or some.

Ben_Cohen · June 12, 2022, 6:01pm

I am not saying the distinction is unimportant. I am saying that in almost all cases, there is a distinction where one is clearly preferable, and should therefore be the default.

You're making a claim about how people learn that I don't think holds up. People learn by doing. They do not learn the meaning of syntax by being told by the compiler to put it in. They learn by the impact of putting in that syntax. In the case of being required to specify any or some every time, I think they will add these keywords to appease the compiler without gaining any knowledge of their meaning. Whereas if they are required to insert any only in the relatively rare cases when it matters to their code, they will have some basis for understanding the distinction.

This is assuming we want to preserve ceremony in code purely as an (ineffective) teaching device. Even if the device were effective, it would be a bad move to do so.

Jumhyn · June 12, 2022, 6:14pm

The version of the argument I find most compelling goes something like this:

1. Being able to treat protocols just like any other type would be very nice: users, particularly novice users, will likely instinctively write bare P as a first attempt at using a protocol as a type, and it would be nice if this worked out of the box.
2. When bare P meant any P, the language was regrettably leading users down a design path (due to (1)) that had non-obvious consequences, and it was difficult to go back and change the decision to use existentials after they had been baked into many APIs.
   - Even with features like auto-opening, the language will not be able to just paper over all the distinctions between existentials and concrete types, as Ben mentions, and these distinctions matter.
   - This is the motivation for getting rid of the bare P <-> any P equivalence.
3. The difference between some P and any other 'normal' concrete type is much smaller than the difference between any P and any other type.
   - IOW, if bare P means some P, users who use the default are less likely to run into issues down the line than they would be with any P, and when they do, those issues will matter less and be more easily fixable.
4. If we force users to specify some P or any P explicitly up front, we will be introducing a fair amount of friction in cases where the user could be blissfully ignorant of the fact that they're 'really' writing generic code.
   - Further, it will be difficult to explain to the user the distinction between some P and any P at the point of definition, before they have run into a concrete case where the difference matters.

So, because of (1) and (4), we'd like the 'bare' protocol syntax to have some meaning. (2) helped us clear up the syntax space for something other than existentials, and because of (3) we are not concerned that using the bare syntax for some P will have the same issues as any P.

Now, there's various assumptions here that would be good to validate empirically as we explore this direction, and I'm not totally satisfied that this argument sufficiently addresses concerns on the reading side of things, i.e., is it important when reading an API to understand up front that you're dealing with generic code as opposed to a concrete function? I'm not sure. But I think it's broadly true that some P is closer to 'just a normal type' than any P.

Moximillian · June 12, 2022, 6:21pm

It’s very convenient for you to conflate the topic into a duality, either some or any, when in fact there are three different things: bare protocol (as constraint), some and any.

Sooner or later developers will have to deal with all those three things, and the syntax hiding that by having different things appearing as the same is the source of all the confusion in the pre-any world (including 5.7 which does not explicitly require any). We can get rid of that confusion by having both explicit any and explicit some.

Jumhyn · June 12, 2022, 6:30pm

Currently, there's no type that can represent "a generic function with type parameter T". I do wonder if we could extend implicit existential opening for function types though, where something like:

func f(_: some P) {}
func g(_ takesAnyP: (any P) -> Void) {}

g(f)

would generate a thunk:

g({ anyP in
  f(anyP) // implicitly open the existential
})

and allow things to 'just work'.

I think what's being missed here is that the distance from 'bare P as constraint' to some P (if it exists at all) is much smaller than the distance to any P. After all, some P just expresses 'some type constrained to P'.

GreatApe · June 12, 2022, 6:38pm

This is a fairly compelling argument for why it's better to let bare P mean some P than any P, but I haven't really seen any argument for why it has to mean either.

Sort of, but there are also counterpoints. An existential is a specific concrete type, albeit a boxed, type-erased type, but some can be different types over time.

This whole thing seems to me like an attempt to make Swift look simpler, without actually being simpler. func foo(x: some P) -> some Q is conceptually quite different from func foo(x: Int) -> Bool so I think it's simply a bad idea to use the same syntax. The only argument so far seems to be that beginners will try to write func foo(x: P) -> Q so we should make that work, and "do the right thing". I really don't see why. It's a different thing, it behaves differently from actual concrete types, so why make it look the same?

I mean how many users are advanced enough to want to write generic code, but not sophisticated enough to know that they are doing that? Seems like a small intersection in the Venn diagram, and definitely not one we should optimize for.

Moximillian · June 12, 2022, 6:41pm

I see that differently. If we look at a codebase with a protocol P defined, used in various functions, used in associated types, extending P,… There are lots of places where it’s 1) used as a type, or 2) a constraint, not a type.

Now, maybe you could get away with thinking that P is always a type, but then you still would get weird situations like bare P not being able to store anything in itself, or extension of P not working or functioning as you thought. Also where clauses are something that are easier to understand when constraints and types are clearly distinct.

If syntax clearly separates constraints and ”as types” from each other, then there’s no confusion on how to use them.

GreatApe · June 12, 2022, 6:44pm

The relative similarity is one thing, but speaking in absolute terms, P is a protocol and is used in constructions such as X: P, "X conforms to P". But with this proposal it would also take on the meaning of "some specific, fixed but unnamed type that conforms to P", which clearly is quite a different thing.

Ben_Cohen · June 12, 2022, 7:02pm

Indeed. Especially considering that you can now use some with primary associated types to write some Collection<some BinaryInteger> to mean T: Collection where T.Element: BinaryInteger. With some elision, this would become the far more natural Collection<BinaryInteger>.

And this further demonstrates that any is the exception, not a peer of some. Collection<any BinaryInteger> is a deeply uninteresting and inadvisable type.
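A sketch of that equivalence, assuming Swift 5.7 and the hypothetical function names `total` and `totalGeneric`. The two declarations below are intended to accept the same arguments; the first uses the `some` shorthand with a primary associated type, the second spells out the generic parameter and `where` clause:

```swift
// Shorthand: some collection whose elements are some binary integer type.
func total(of values: some Collection<some BinaryInteger>) -> Int {
    values.reduce(0) { $0 + Int($1) }
}

// The same constraint written with an explicit generic parameter.
func totalGeneric<T: Collection>(of values: T) -> Int
where T.Element: BinaryInteger {
    values.reduce(0) { $0 + Int($1) }
}
```

Note that `Int(_:)` traps if an element does not fit in `Int`, which is acceptable for a sketch but worth handling in real code.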

xwu · June 12, 2022, 7:13pm

I've been an advocate in the past for adopting the "How do we teach this?" question in Rust RFCs as part of Swift Evolution proposals; if we'd had this section incorporated into our template, we'd have had this discussion already in one of the preceding proposals. But, instead of speaking in hypotheticals, let's consider the question concretely here:

Here's the existing opening paragraph in TSPL on protocols—which, as a reminder, is taught one chapter before generics:

A protocol defines a blueprint of methods, properties, and other requirements that suit a particular task or piece of functionality. The protocol can then be adopted by a class, structure, or enumeration to provide an actual implementation of those requirements. Any type that satisfies the requirements of a protocol is said to conform to that protocol.

...and here's the opening section about existentials:

Protocols as Types

Protocols don’t actually implement any functionality themselves. Nonetheless, you can use protocols as a fully fledged types in your code.

Here's some text (which I freely donate to a future version of TSPL or any other didactic material) that can fit right in:

Every value of protocol type has an underlying concrete type—a class, structure, or enumeration that conforms to the protocol. Therefore, when you use a protocol as the type of a variable, you also specify whether that variable will always store values of some fixed underlying type or whether it can store values of any underlying type:
protocol Animal {
  static var species: String { get }
}
struct Cow: Animal { static var species: String = "Bos taurus" }
struct Dog: Animal { static var species: String = "Canis familiaris" }

var animal1: any Animal = Cow()
animal1 = Dog()

var animal2: some Animal = Cow()
animal2 = Dog() // error: cannot assign value of type 'Dog' to type 'some Animal'
You can use protocols as types in many places where concrete types are allowed, including as the type of items in an array, dictionary or other container. For instance, you can declare a value of type [some Animal], where each element of the array is of the same underlying type, or a value of type [any Animal], where each element of the array can have a different underlying type.

An any type is sometimes called an existential type, which comes from the phrase “there exists a type T such that T conforms to the protocol”. Unlike a some type, an existential type doesn't always provide all the same methods and properties that are guaranteed by the corresponding protocol.
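One way that last point could be illustrated, as a minimal sketch assuming Swift 5.7 (the function name `allEqual` is made up). With `[some Equatable]` every element has the same underlying type, so the protocol's `==` requirement is usable directly:

```swift
// Every element shares one underlying type, so `==` can compare them.
func allEqual(_ values: [some Equatable]) -> Bool {
    guard let first = values.first else { return true }
    return values.allSatisfy { $0 == first }
}

// With `[any Equatable]`, `$0 == first` would not compile: `==` needs both
// operands to have the same concrete type, and the existential
// does not guarantee that.
```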

[Subsection: discuss uses and limitations of existential types]
[Subsection: discuss automatic conversions between some and any types]

It seems to me that this presentation of the distinction between some and any isn't very hard to grasp, even while limiting our discussion by never mentioning generic parameters.

Moximillian · June 12, 2022, 8:39pm

Completely agree that explicit some and explicit any are very straightforward to teach, just by saying "some" allows you to store/use only the same kind of stuff and "any" allows you to store/use different kinds of stuff. And then go deeper into where and when you'd want to use one or the other. When you have explicit syntax visible, it's easy to explain it by referring to that syntax.

When you have invisible syntax you need to start explaining why a protocol isn't just a protocol in some places, and that really hinders learning the fundamental concepts that Swift is built upon. When a developer lacks this understanding of the fundamentals, programming just becomes a game of whack-a-mole: reacting to errors, trying various words from the syntax to "appease" the compiler, all while not understanding why or how any of that is solving the issues.