[Discussion] Eliding `some` in Swift 6

GreatApe · June 12, 2022, 8:41pm

This seems like a non-sequitur to me. That particular combination might be uninteresting, but not all any types are. If they were, why do we have them, and why have they been spelled with the bare protocol for 8 years?

And even if some is better than any, which you seem to be saying, that is not really an argument for using the conformance syntax "bare P" for it.

There might be good reasons for the change proposed here, but it's very far from the no-brainer you seem to suggest.

For example it really isn't obvious that [Collection<Int>] means an array of a specific, unnamed type of collection of Int, rather than a heterogeneous array of possibly different kinds of collections. You might argue that it's better if it means that, but it's far from a slam dunk.
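A minimal sketch of the two readings, assuming Swift 5.7 or later (on Apple platforms, constrained existentials such as any Collection<Int> also need a deployment target with runtime support for them). The helper total is hypothetical and not from the thread.

```swift
let array: [Int] = [1, 2, 3]
let set: Set<Int> = [4, 5]

// Heterogeneous reading: every element may wrap a different
// underlying collection type.
let mixed: [any Collection<Int>] = [array, set]
let counts = mixed.map { $0.count }

// Homogeneous reading: one unnamed collection type is fixed
// for the whole array (hypothetical helper).
func total(_ collections: [some Collection<Int>]) -> Int {
    var sum = 0
    for collection in collections {
        for value in collection { sum += value }
    }
    return sum
}

// Every element here is an [Int], so one type fits them all.
let sum = total([array, [4, 5]])

// total(mixed) is rejected by the compiler: 'any Collection<Int>'
// is not itself a single type conforming to Collection.
```

With the explicit keywords the two meanings cannot be mixed up; the open question in the thread is which one a bare [Collection<Int>] should stand for.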

GreatApe · June 12, 2022, 8:44pm

Exactly, that is precisely the situation we just escaped with the help of any. For almost a decade learners and intermediate users alike have been confused by protocols that "don't conform to themselves", and now we want to bring back exactly the same problem. I haven't even seen an argument for why we'd want to even consider this, other than "because we can, since bare P just became vacant".
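For readers who have not run into the "don't conform to themselves" problem, here is a small sketch of it, assuming Swift 5.7 or later. Shape, Circle, Square, describe and describeAll are hypothetical names used only for illustration.

```swift
protocol Shape { var name: String { get } }
struct Circle: Shape { let name = "circle" }
struct Square: Shape { let name = "square" }

// Generic over one concrete conforming type.
func describe<T: Shape>(_ shape: T) -> String { "a \(shape.name)" }

// Generic over an array of one concrete conforming type.
func describeAll<T: Shape>(_ shapes: [T]) -> [String] {
    shapes.map { describe($0) }
}

let shapes: [any Shape] = [Circle(), Square()]

// describeAll(shapes)
// Does not compile: the existential 'any Shape' does not
// conform to 'Shape', so it cannot be substituted for T.

// A single existential passed directly to a generic parameter
// can be opened implicitly (Swift 5.7), so this works:
var descriptions: [String] = []
for shape in shapes {
    descriptions.append(describe(shape))
}

// A homogeneous array is fine because T is simply Circle.
let circles = describeAll([Circle(), Circle()])
```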

hborla · June 12, 2022, 8:58pm

It is perfectly okay to not agree with the arguments provided, but I certainly provided much more justification than "because we can" in my original post.

hborla:

Writing any explicitly is important for understanding the type-erasing semantics and the limitations of using this type, including the ability to change the underlying type, the inability to access Self and associated type requirements, etc. These limitations don’t exist for some types, which provide full access to the protocol requirements and extension methods. In fact, some already is the default for protocol extensions, e.g.
// Extends all concrete types conforming to 'Collection'.
// The 'Self' type parameter serves as a placeholder for the
// concrete type conforming to 'Collection'.
extension Collection { ... }
In a protocol extension, you write generic code that operates on a Self type parameter that conforms to the protocol. In this case, the bare protocol name is already sugar for declaring a type parameter conforming to the protocol. Protocol extensions have proven to be a very natural way to write generic code. The same principle could be applied to plain protocol names in Swift 6 to enable programmers to write generic functions naturally without having to fully internalize explicit generic signatures with where clauses.

I added emphasis to the last sentence because it's at the root of why I'm interested in exploring this direction.
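As a rough illustration of the quoted point, the sketch below writes the same operation three ways: as a protocol extension, as a function with an explicit generic signature, and with some in parameter position (Swift 5.7 or later). The names isLonger, isLongerGeneric and isLongerOpaque are made up for this example.

```swift
// 1. Protocol extension: 'Self' is the concrete conforming type
//    (Array<Int>, Set<String>, String, ...).
extension Collection {
    func isLonger(than limit: Int) -> Bool { count > limit }
}

// 2. Explicit generic signature with a named type parameter.
func isLongerGeneric<C: Collection>(_ c: C, than limit: Int) -> Bool {
    c.count > limit
}

// 3. 'some' in parameter position: an unnamed type parameter.
func isLongerOpaque(_ c: some Collection, than limit: Int) -> Bool {
    c.count > limit
}

let numbers = [1, 2, 3]
let viaExtension = numbers.isLonger(than: 2)
let viaGeneric = isLongerGeneric(numbers, than: 2)
let viaOpaque = isLongerOpaque(numbers, than: 2)
```

All three are generic code over a concrete conforming type; none of them involves an existential.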

@Jumhyn also summarized the arguments very nicely in this post:

Claiming that no arguments have been provided is not helpful to the discussion.

Ben_Cohen · June 12, 2022, 9:08pm

Unfortunately, it's much more complicated than the text above makes out. This treatment suggests that let animal2: some Animal = Cow() somehow fixes the type of animal2 such that you can reassign another Cow to it later. In our current implementation, however, that isn't true. Opaque generic types in storage are, in fact, dramatically more complicated than that. And so the claim that it's easy to teach, and that the dual some/any keywords help with that, is off the mark in this case. The some here explains almost nothing to the user. The text could be rephrased to use "by default" rather than some, and it would still be similarly explanatory and have the same flaws.

In practice, explicit protocols like some Animal should only be used in public interfaces, probably in most cases with a computed get rather than a let. In non-public interfaces, you are better off using type inference and not using the protocol explicitly at all.
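One possible reading of this advice, as a sketch only: the public interface exposes some Animal through a computed property, while non-public code relies on type inference. Farm, Cow, resident and sound are hypothetical names, not taken from the thread.

```swift
public protocol Animal { var sound: String { get } }

// The concrete type stays internal; clients never see it.
struct Cow: Animal { let sound = "moo" }

public struct Farm {
    public init() {}

    // Public interface: a computed getter with an opaque type.
    // The underlying type can change later without breaking clients.
    public var resident: some Animal { Cow() }
}

// Non-public code: let inference pick the concrete type and
// leave the protocol out of the declaration.
let cow = Cow()
let farm = Farm()
let residentSound = farm.resident.sound
```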

Which maybe leads to someone thinking "Ah, so that means maybe any is the right default, if some isn't so useful in this case". Storage declarations are certainly the main use of any, but it should rarely (maybe never) be used as API (the places the standard library uses it are very regrettable but baked into the ABI now). If you are writing public API, then some (or something else entirely like an associated type or even just a concrete currency type) is still the better choice. In non-public storage declarations, any should still be opt-in, because trading away type information in exchange for dynamic behavior has consequences the user should be aware of.
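A short sketch of the storage case, with hypothetical types Pen, Cow and Dog: the any gives the property dynamic behavior, at the cost of the static type information.

```swift
protocol Animal { var sound: String { get } }
struct Cow: Animal { let sound = "moo" }
struct Dog: Animal {
    let sound = "woof"
    func fetch() -> String { "fetched" }
}

struct Pen {
    // Opting in to 'any': the underlying type may change over time.
    var occupant: any Animal
}

var pen = Pen(occupant: Cow())
pen.occupant = Dog()   // allowed: a different underlying type

// The cost: the static type is only 'any Animal', so Dog-specific
// members need a dynamic cast.
// pen.occupant.fetch()   // does not compile
let fetched = (pen.occupant as? Dog)?.fetch()
```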

All of this is way more detail than is necessary for the kind of level TSPL is aiming at. You rightly point out it is awkward today that protocols are introduced separately to generics, and a rethink here is probably needed. Most importantly, introducing some for the first time in the context of variable declarations, and not generic functions, would be the wrong thing to do whether or not some can be elided.

GreatApe · June 12, 2022, 9:15pm

I read your pitch, including these sections, but really couldn't see what the argument was. The first quote still doesn't seem like an argument either way, it sounds to me like it's talking about the old forbidden existentials, but maybe I'm missing something that's implicit?

I see myself as a pretty pragmatic person, but this is definitely not how I would describe the situation. It's not about "adding any when needed". If you're serious about writing Swift you will want to know what you're doing, not just sprinkle modifiers until it works. We're dealing with very different situations here, I really don't think we should optimize for people who don't care about the difference between a generic type and an existential. So in my world people either want to work with a generic, or they don't.

Yes it "could", but this doesn't provide any argument for why we'd want to go this particular route. Using some already achieves exactly what you describe, people can write generic functions without where clauses.

What I want to understand is why they have to be able to do this using the bare protocol. Reading these quotes here, it still sounds like the only argument is that it will make some people write working code by accident.

For the record, I don't agree that protocol extensions are natural. In fact, while easy to work with, they are probably the part of generics/protocols that has been the biggest source of confusion to me, precisely because they are overly elided. It looks like you are extending the protocol, which I would have assumed means the existential, or something general like that, but in fact you are extending all specific types that conform to the protocol. That took me a long while to understand; I actively used it for years without fully grasping what it really meant.

Writing protocol extension with some would make it a whole lot clearer.

Jumhyn · June 12, 2022, 9:23pm

This feels to me very close to an argument for some elision, rather than against. How much of this confusion might have been alleviated if the existential had always been spelled any P and bare P had always been a synonym for some P?

Moximillian · June 12, 2022, 9:33pm

While I do agree that "some" is probably a better default than "any", and that Swift syntax has been particularly bad exactly because bare P defaults to any, I do NOT agree that it's good or desirable that (lack of) syntax causes developers to be ignorant of what the code they've written actually does.

For someone who knows the inner workings of the Swift implementation, it's naturally desirable to remove as much syntax as possible. But for someone who wants to learn and become a better developer, lack of syntax is a brick wall.

I agree with @GreatApe that explicit some (or any) in protocol extensions would be much better than the inscrutability of the elided extension syntax.

GreatApe · June 12, 2022, 9:38pm

Well, if you want a situation where people accidentally write code that compiles, without understanding the concepts behind it, then sure.

That is not what I want though, I would have vastly preferred if it was written extension some Collection or extension T: Collection or something like that. The latter would have been self explanatory and unambiguous, the former would have required me to understand some, which I would have to understand anyway, and it's not hard to learn.

I always had a vague idea of what "existential" meant, but the any syntax made it very clear, I think it was a big win.

hborla · June 12, 2022, 9:47pm

I appreciate you taking the time to elaborate! Describing what you don't understand about the given arguments is infinitely more helpful to me, and will facilitate deeper discussion on how programmers internalize code.

I'm not suggesting that programmers sprinkle modifiers until their code works. some and any provide two different kinds of polymorphism. I agree with you that more experienced programmers will typically know what kind of polymorphism they need before writing the code, and my suggestion to "write some by default" definitely isn't meant to be a blanket rule in all situations. However, Swift does generally encourage parametric polymorphism because of its emphasis on value types and static type safety, and writing code in a generic context is the best way to preserve more type information about your code to enable more mistakes to be caught at compile time. Enabling more expressivity and more mistakes to be caught at compile time is exactly the argument for encouraging programmers to generally prefer some in the cases where both some and any could be made to work.
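To make "preserving type information" concrete, here is a hedged sketch comparing a generic function with its type-erased counterpart. Shape, Circle, larger and largerErased are hypothetical names chosen for this illustration.

```swift
protocol Shape { var area: Double { get } }
struct Circle: Shape {
    var radius: Double
    var area: Double { Double.pi * radius * radius }
}

// Generic: the result has the same concrete type as the inputs.
func larger<S: Shape>(_ a: S, _ b: S) -> S {
    a.area >= b.area ? a : b
}

// Existential: the concrete type is erased on the way in and out.
func largerErased(_ a: any Shape, _ b: any Shape) -> any Shape {
    a.area >= b.area ? a : b
}

let kept = larger(Circle(radius: 1), Circle(radius: 2))
let keptRadius = kept.radius   // 'kept' is statically a Circle

let erased = largerErased(Circle(radius: 1), Circle(radius: 2))
// erased.radius   // does not compile: static type is 'any Shape'
let recoveredRadius = (erased as? Circle)?.radius   // runtime cast
```

The generic version also rejects mixing two different shape types at compile time, whereas the erased version accepts any pair.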

Because of Swift's core principle of building a language model that is "safe by default" (e.g. type safety, memory safety, and eventually, safe from data races by default), the compiler needs to know a lot more information about source code than other languages. This "safe by default" programming model greatly increases programmers' confidence in the correctness of their code at compile-time, and saves a lot of developer time chasing down bugs at runtime by defining these issues out of the language.

On the other hand, I think one of Swift's biggest criticisms is the compiler "nagging" programmers about things they got wrong in their code that are actually harmless, and having to make minor changes to code "just to appease the compiler". These are barriers to productivity and can take a lot of joy away from writing code in Swift. Even minor amounts of boilerplate ceremony, when piled up, can take time away from implementing more interesting bits of functionality in a program.

If it is indeed true that in Swift, parametric polymorphism is more often the better tool for the job than subtype polymorphism, then some might become just another one of those annoying things you need to write in your code to appease the compiler. I believe that programmers will often forget to write a keyword in front of a protocol name, even in Swift 5.7 and beyond, which is part of the reason why existential types have been so widely adopted today. In the future with explicit some and any required, the workflow is you write a plain protocol name, get an error message (potentially a live issue before you're finished writing that bit of code), then have to take a few seconds to read the error and apply the fix-it to insert some. Now, as a compiler engineer who invests a lot of time in improving diagnostics, I'm actually a big fan of a diagnostic-guided development workflow, but I admit that getting this error several times per day might just be annoying.

All of these reasons are why I'm asking these sorts of questions (which I edited the original post to include after I posed them upthread):

And again, this is just a discussion! I'm not saying that this is 100% the right thing to do, and I want to hear about how programmers experience some and any in their own code. If the some keyword truly is meaningful and clarifying to programmers or eliding it would introduce serious footguns into code, then maybe this isn't the right direction, but that is not in line with the feedback that I have gotten so far since the introduction of some and its increased adoption throughout Swift code.

Moximillian · June 12, 2022, 9:55pm

For what it's worth, to get proper answers to these questions, there first needs to be a Swift version in use where "any" is required. In 5.7 it's just optional, so it's almost as bad as all the versions before it.
A direct jump from "no any required" to "elided some" would result in never getting balanced answers to those questions.

tem · June 12, 2022, 9:59pm

Side note: that seems like a bug to me.

When you have a function that returns some Animal:

func f() -> some Animal { Dog() }

var animal: some Animal = f()

animal = f()
// Cannot assign value of type 'some Animal' (result
// of 'f()') to type 'some Animal' (type of 'animal')

Delete : some Animal and the error goes away, even though the inferred type of animal remains some Animal.

Edit: it's not a bug! D'oh!

If I understand correctly, when the type annotation is omitted, the type of animal is #opaqueReturnType(of: f) which Xcode just shows as some Animal. But when you explicitly annotate the variable as some Animal, you're reserving the ability to change the underlying type of animal to something else (independently of any changes to f) without breaking users' code. Thanks to @Jumhyn and @Ben_Cohen for explaining!

xwu · June 12, 2022, 10:02pm

Yes, that's rather misleading due to a think-o on my part. As you know, there are certain reassignments which are possible, and an explanation as to why would go nicely into demonstrating the "opaqueness" of opaque types:

var x: some BinaryInteger = 42
x = 21

protocol Animal { init() }
struct Cow: Animal { }

func f<T: Animal>() -> T { T() }
var a: some Animal = Cow()
a = f()

There are, obviously, a great many things that I didn't get to outlining in a preliminary sketch of how to expose the distinction, and which may not even be appropriate at the level of TSPL at all. However, a user doesn't need to grasp 100% of the advanced use cases of a feature to understand the rationale for why it exists. A visible sigil that sets the feature (opaque types) apart from another (protocols) has its benefits in this respect.

Yes, from a pedagogical standpoint, it may be impossible to construct a satisfactory text that proceeds in the current order: protocols (including existential types) → generics → opaque types. Totally agree that introducing some in the context of variable declarations is not ideal because it misses an opportunity to demonstrate what it is most useful for. It is illustrated in such a way in my post here because, in the current flow of TSPL, any other example would presume knowledge not yet introduced.

Jumhyn · June 12, 2022, 10:15pm

Mm yeah, this is confusing. We don’t even really have the vocabulary to describe why this fails from ‘within’ the language. It might be more understandable if we had the #opaqueReturnType(of: f) available to more directly explain the difference between some P attached to two different declarations.

ETA: though even the straw-syntax #opaqueReturnType(of:) has been problematized by the introduction of structural opaque result types…

GreatApe · June 12, 2022, 10:38pm

Yes, I often hear this criticism, probably mostly from old web developers and Objective-C veterans, but I never understood it. I never did JS, and never liked Objective-C much; for me Swift's strictness is an advantage 99% of the time. I want the compiler to guide me, but I also want to know what is going on, where it matters. It speeds up the process and minimizes errors. It's never "nagging me about harmless things"; it seems to me to always just be preventing me from making mistakes, like passing the wrong type. It's only really around the edges, perhaps with tuples that don't want to splat/unsplat, or with UInts vs Ints, that I ever feel that it's overzealous.

Anyway, all this to say that I don't personally see it as a disadvantage if the compiler tells me that a function can't simply return P, it must either return a type that conforms to P, or an "existential P", which by the way I think needs a new name, because it's completely unintuitive.

If we let people write func foo(p: P) without understanding the difference between some P and any P, then they will not be able to explain exactly what that function expects. This is a certainty, so the question is how big of a problem it is.

This is not quite how I see it. For me it's not about marking a particular type of polymorphism, the reason it makes sense to have a keyword in front of P is simply that the function doesn't actually return P. It returns something that is related to P, but not "a P", because that doesn't make sense.

So as I see it, I would like to hear an argument for why it makes sense to reuse the conformance syntax. Or, from another point of view, you could say we are abusing the type syntax itself:

func foo(p: some P) is fundamentally different from func foo(x: X)

and

func foo() -> some P is fundamentally different from func foo() -> X

On the right hand side, we accept or return a specific, fixed type. On the left, the type is not fixed, it is chosen by the caller in the first case, and the callee in the second case. (I argued earlier that even the two cases on the left are different, but at least they are closer).
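The caller-chooses versus callee-chooses distinction can be sketched as follows. The names accept, produce, acceptFixed, A, B and label are hypothetical; P and X follow the spelling used in the post.

```swift
protocol P { var label: String { get } }
struct A: P { let label = "A" }
struct B: P { let label = "B" }
struct X { let label = "X" }

// Parameter position: the CALLER picks the concrete type.
func accept(p: some P) -> String { "got \(p.label)" }

// Result position: the CALLEE picks the concrete type, and the
// caller only learns that it conforms to P.
func produce() -> some P { A() }

// A specific, fixed type: nothing is chosen by either side.
func acceptFixed(x: X) -> String { "got \(x.label)" }

let fromA = accept(p: A())
let fromB = accept(p: B())
let produced = produce().label
let fixed = acceptFixed(x: X())
```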

So what I would like to see is really an argument for why the reduced clarity, in terms of using the same syntax for different things, is worth the sacrifice. That is what I think has been missing, I got the feeling that this loss wasn't even acknowledged.

I don't think it would lead to code that does something unexpected, no. My concerns are more long term, it leads to people not really understanding what they are doing, where a tiny bit of nudging would clarify things. At the same time, if keeping some just leads to a situation where people apply a fix-it then nothing would be gained. For me it just seems so natural to use a different syntax for some P than X: P that I would never forget to use it, but maybe that is unrepresentative.

So basically it comes down to this. I don't envision a future where people write generic code without knowing what generic code is. For me, some is not a way to achieve that, it's just sugar for people that actually know what generics are. Well at least if we are talking about writing functions that use some, not using others'.

So is the idea that people should be able to start off writing simple generic functions without really realising it, and only have to learn about it later when they need to introduce complex constraints etc?

If that is your vision, then I definitely see why you propose this change, then it makes a lot of sense. But I think that needs to be part of the motivation.

It might be a good idea, I have a hard time judging how realistic it is. And also how much it would help, I mean how often do beginners really have to write generic functions? I think a lot of app developers never do.

Ben_Cohen · June 12, 2022, 10:42pm

This is not really the description that explains that it's a bug, because you could say the same about this code:

func f() -> some BinaryInteger { 1 }
func g() -> some BinaryInteger { 1 }

var x = f()
x = g() // Error: Cannot assign value of type 'some BinaryInteger' (result of 'g()') to type 'some BinaryInteger' (result of 'f()')

Both f and g return some BinaryInteger. But the key is in the parts in parenthesis after the type. They come from different origins and so are not known to be the same type. some BinaryInteger doesn't tell the whole story.

In the case of the same function, but with an explicit opaque type, you get a variation of this:

func f() -> some BinaryInteger { 1 }

var x: some BinaryInteger = f()
x = f() // Cannot assign value of type 'some BinaryInteger' (result of 'f()') to type 'some BinaryInteger' (type of 'x')

It all comes down to the point at which you believe that barrier of opacity should be created. Consider this code:

struct S {
    static func f() -> some BinaryInteger { 1 }
    var x: some BinaryInteger = f()
}

var s = S()
s.x = 0    // fine, just using general properties of BinaryInteger
s.x = S.f()   // not fine, users of S shouldn't know type(of: x) is the return type of f()

GreatApe · June 12, 2022, 10:44pm

I mean, yes perhaps. It's a fine line, it really depends on how much you insist on people understanding exactly what is going on. I tend to want to know what I'm doing, so protocol extensions made me slightly uneasy for a long time, but if we want to be a bit more pragmatic, then sure, it might make sense.

masters3d · June 13, 2022, 5:48am

There is a balance here between writing and reading the code. I am just not sure if defaulting to some is the right thing yet. some was introduced for SwiftUI and it has been extended for other usages recently. Tomorrow’s new Apple framework might introduce something better and now we are stuck with some.

Perhaps the fix-it should just write in some for me so I don’t have to even click on the fix-it, but that might be too heavy-handed.

I think tooling like inlay hints for Swift could help in the future for reading code if we choose to apply the default, but this is not part of the source code so might not help much.

Moximillian · June 13, 2022, 6:52am

Ben_Cohen:

func f() -> some BinaryInteger { 1 }
func g() -> some BinaryInteger { 1 }

var x = f()
x = g() // Error: Cannot assign value of type 'some BinaryInteger' (result of 'g()') to type 'some BinaryInteger' (result of 'f()')
Both f and g return some BinaryInteger. But the key is in the parts in parenthesis after the type. They come from different origins and so are not known to be the same type. some BinaryInteger doesn't tell the whole story.

This is the kind of example where a developer would need to understand how Swift actually works, in order to make sense of the error message and find another way to implement what they want.

With elided some this would be a disaster; developers would be forced to be ignorant, and would assume BinaryInteger or any other protocol is just like a type, and expect it to behave the same as a regular type. Also, without the vocabulary of "some", it's back to cryptic error messages that make no sense. Developers without knowledge of what is actually happening will feel like this is a dead end.

On the other hand, "some BinaryInteger" will at least give a hint that this is not a regular type and thus cannot always be expected to behave like a regular type. And having "some" in the vocabulary makes it much easier for the error message to explain it, and also makes it easier to google what other people have done to get their things done.

Now, you could also create all kinds of rules where in some places "some" is elided, and in some places it's still required, but from the perspective of a novice developer those kinds of rules are arbitrary, and would again result in a whack-a-mole, because that elision effectively prevents forming a coherent understanding of what is actually happening.

benlings · June 13, 2022, 7:23am

I think this would need generic local variables (which I think were mentioned in the 'Improving the UI of Generics' post):

let <T: P> f: (T) -> () = takesBareP

If it's possible to handle these sorts of cases where the fact that a function is implicitly generic doesn't matter, I think eliding some would be a lot easier to justify.

This was very illuminating for me. I think if it were possible to write

extension some Collection { ... }

and have extension Collection { ... } be a 'default elision', this would have made the behaviour of protocol extensions more understandable to me - including why trying to do extension ProtocolA : ProtocolB isn't doing what it looks like on the surface.
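As a sketch of why extension ProtocolA : ProtocolB is not what it looks like: a protocol extension supplies members to every concrete conforming type, but conformance itself still has to be declared per concrete type. Describable, briefSummary and summarize are hypothetical names for illustration.

```swift
protocol Describable { var briefSummary: String { get } }

// extension Collection: Describable {}
// Rejected by the compiler: a protocol extension cannot declare
// a conformance on behalf of all conforming types.

// What the extension CAN do: give every concrete Collection type
// a member (here, a default implementation).
extension Collection {
    var briefSummary: String { "\(count) element(s)" }
}

// Conformance is then declared per concrete type, reusing the
// member supplied by the protocol extension.
extension Array: Describable {}
extension Set: Describable {}

func summarize(_ value: some Describable) -> String {
    value.briefSummary
}

let numbers: [Int] = [1, 2, 3]
let letters: Set<String> = ["a"]
let arraySummary = summarize(numbers)
let setSummary = summarize(letters)
```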

In some ways protocols seem like they are quantum 'cloud like' types. If you try and narrow them down to a single type (position) using any, you lose some of their behaviour (speed). If you want to retain all their behaviour (speed) using some, you can't know their exact type (position).

ExFalsoQuodlibet · June 13, 2022, 7:50am

I think this is a very compelling example of why we should not elide some: it's useful, it gives extra information, it's a "hint" that the developer must take into account some characteristic of the language that might be overlooked if we only spell the plain protocol.