SE-0257: Eliding commas from multiline expression lists

Paul_Cantrell · April 11, 2019, 5:20pm

I like this idea a lot. Actually working with the feature instead of just imagining it definitely shifted my opinion on this one.

We could all use a little help remembering to be less reactionary in our reviews, and come to new ideas with a more exploratory spirit — even the ones that turn out to be bad ideas in the end.

allevato · April 11, 2019, 5:20pm

Which I think points to the problem with this proposal. The example given by @anthonylatsis is an extremely trivial case (an array with two distinct elements) where someone who wants to use comma elision throughout their code would want to use it but cannot. This by definition leads to inconsistent code in the same code base, because some array literals can elide commas and some cannot, for odd technical syntactic reasons. Loading the compiler with diagnostics that say "sorry, you can't use comma elision here, but here's a fix" is extremely unsatisfying. How much would we be increasing the mental burden not only on newcomers to the language, but also on more experienced users, who have to remember and apply all of these subtle details?

To take this further, I'm going to put on my "syntax tool author" hat for a moment. Right now, if I want to construct a comma-delimited list in SwiftSyntax, I construct the nodes and insert the comma tokens in the appropriate slots. I add those commas consistently, and it's easy. If this proposal is accepted, what does that mean for people who want to transform or build syntax trees that involve comma-delimited lists? The proposal does not address this at all, and none of the outcomes I see are great:

1. I construct the comma-delimited list and insert commas everywhere, just to be safe. But that annoys people who want to live in a comma-elided world.
2. I construct the list without inserting any commas. But depending on the expressions in that list, that might generate code that doesn't compile. That's totally unacceptable.
3. I encode a complex set of conditionals to determine when I should insert commas and when I shouldn't. As a tool author, that should not be my responsibility.
4. SwiftSyntax encodes a complex set of conditionals to determine when commas should be inserted or not. As far as I can tell that's the best outcome, but it's still lacking; this logic would need to be kept in sync with any future changes made to the compiler. But there's not really precedent in SwiftSyntax for constructing nodes where something is present "only when necessary"; it's a barebones syntax tree representation where tokens are either there, or they aren't.
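
To make the third and fourth outcomes concrete, here is a minimal sketch of the kind of conditional logic involved. It works on plain strings rather than syntax nodes, and `mayElideComma(before:)` and `renderList(_:)` are hypothetical names, not SwiftSyntax API. It assumes the only hazard is a following element whose first character could let the parser read it as a continuation of the previous line; the real rules would be whatever the compiler implements.

```swift
/// Hypothetical helper: returns true if the comma before `next` could be
/// dropped without `next` being joined onto the previous element.
/// Assumption: a line starting with one of these characters may continue
/// the previous expression (member access, call, subscript, closure, operator).
func mayElideComma(before next: String) -> Bool {
    let continuationStarts: Set<Character> = [
        ".", "(", "[", "{", "+", "-", "*", "/", "?", "&", "|", "<", ">", "=",
    ]
    guard let first = next.first(where: { !$0.isWhitespace }) else { return false }
    return !continuationStarts.contains(first)
}

/// Renders elements one per line, keeping a comma only where eliding it
/// would be unsafe under the assumption above.
func renderList(_ elements: [String]) -> String {
    var lines: [String] = []
    for (index, element) in elements.enumerated() {
        let isLast = index == elements.count - 1
        let keepComma = !isLast && !mayElideComma(before: elements[index + 1])
        lines.append("  " + element + (keepComma ? "," : ""))
    }
    return "[\n" + lines.joined(separator: "\n") + "\n]"
}

print(renderList(["someLongVariableName", "someOtherLongVariableName"]))
print(renderList([".someLongEnumName", ".someOtherLongEnumName"]))
```

The first call emits no commas; the second keeps its comma because the next element starts with a dot. Even this toy version has to know about the parser's continuation rules, which is the maintenance burden described above.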

All of this to me says that if comma elision can't work universally, we shouldn't have it. There's simply too much muddiness in the exceptions, and if I did decide that I wanted to elide commas in my own codebases, or if I was working in another code base that did, I know I would be annoyed every single time the compiler tells me "sorry, you forgot your mandatory comma here". And because of the syntax tool problems mentioned above, it is non-trivial for an automated formatter to do the right thing in every case.

So, if consistency and learnability are what we're aiming for, this proposal simply doesn't achieve that. If the Core Team feels that consistency is a critical motivating factor, then let's just revisit the original SE-0084 and add trailing commas. That is the only way to make comma-delimited lists in the language completely consistent, and it also resolves all of the issues above with the syntax tool use cases.

Preston_Sumner · April 11, 2019, 5:30pm

Does it? It certainly wouldn't bother me, especially in generated code.

allevato · April 11, 2019, 5:33pm

Not just generated code—transformed code you've already written. If you prefer comma elision wherever it can be applied and you run a formatter that needs to wrap a comma-delimited list onto multiple lines, are you saying you'd be ok with having it preserve those commas? Wouldn't you want it to produce the code that you prefer to see?

If so, can you say that for everyone?

Preston_Sumner · April 11, 2019, 5:49pm

I'd be fine with my transformed code having commas, but I can't speak for everyone. It's a good question.

anandabits · April 11, 2019, 6:06pm

I wouldn’t. I would expect a formatter to preserve comma-delimiting.

I think the best way to address the formatter topic is that a formatter should follow the rule suggested by @John_McCall that each list should either have commas or not have commas. They shouldn’t be mixed. With this rule in place, a formatter would preserve the choice of the author. If the author wrote a mixed-comma list I think the formatter should insert commas everywhere to be conservative. If the author does not want commas they can be manually removed.
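
A rough sketch of that rule, assuming the formatter already has the list as one trimmed element per line. The names `CommaStyle`, `commaStyle(of:)`, and `normalize(_:)` are made up for illustration and do not come from any existing formatter.

```swift
enum CommaStyle { case delimited, elided, mixed }

/// Classifies a multiline list by looking at every line except the last,
/// since the last element needs no separator either way.
/// Assumption: each string is one element with trailing whitespace removed.
func commaStyle(of lines: [String]) -> CommaStyle {
    let hasSeparator = lines.dropLast().map { $0.hasSuffix(",") }
    if hasSeparator.allSatisfy({ $0 }) { return .delimited }
    if hasSeparator.allSatisfy({ !$0 }) { return .elided }
    return .mixed
}

/// Preserves the author's choice for consistent lists; for a mixed list,
/// conservatively inserts the missing separator commas.
func normalize(_ lines: [String]) -> [String] {
    guard commaStyle(of: lines) == .mixed else { return lines }
    return lines.enumerated().map { pair in
        let isLast = pair.offset == lines.count - 1
        return (isLast || pair.element.hasSuffix(",")) ? pair.element : pair.element + ","
    }
}

print(normalize(["a,", "b", "c"]))  // mixed: commas restored
print(normalize(["a", "b", "c"]))   // elided: left alone
```

Note that the sketch never removes a comma; it only restores them when the author's intent is unclear.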

As I have said before, I think comma elision is wonderful sugar when used judiciously. I don’t think it should be applied everywhere it is possible. Deciding when to use it requires both technical and aesthetic judgment. I expect the most common cases where it makes sense will be EDSLs and literals made up of raw data.

stephencelis · April 11, 2019, 6:10pm

Paul_Cantrell:

The real clincher for me is what the proposal does to SwiftPM package config:
let package = Package(
    name: "Paper"
    products: [
        .executable(name: "tool", targets: ["tool"])
        .library(name: "Paper", targets: ["Paper"])

Does this actually compile? Shouldn't it treat .library(…) as a method call against the previous line?

Preston_Sumner · April 11, 2019, 6:15pm

To be fair, I don't think that represents the spirit of the proposal as it is written. I think comma elision would become a default convention and that the very introduction of the feature would imply that it should be such.

allevato · April 11, 2019, 6:16pm

I can imagine a user having the opinion that "I prefer comma elision wherever possible for multi-line lists". In that case, if a line needs to be wrapped, what the user typed on a single line may not match the "choice" of the author:

// Before wrapping
let someArrayWithALongName1: [SomeType] = [someLongVariableName, someOtherLongVariableName]
let someArrayWithALongName2: [SomeEnum] = [.someLongEnumName, .someOtherLongEnumName]

// After wrapping
let someArrayWithALongName1: [SomeType] = [
  someLongVariableName,
  someOtherLongVariableName
]
let someArrayWithALongName2: [SomeEnum] = [
  .someLongEnumName,
  .someOtherLongEnumName
]

If the formatter wants to honor the user's preference, it must decide whether it's possible to remove those commas, and a human can see that the answer differs between the two lists. Asking users to remove them manually would be viewed as a deficiency of the tool, which is why I point out how difficult it is for the tool to make that decision, given the limitations and exceptions of the proposed syntax.

anandabits · April 11, 2019, 6:27pm

Personally, I don’t think this would be an appropriate option for a formatter to offer. If it needs to wrap a non-wrapped line it should leave the commas in place. If the user wants a multi-line list without commas the user should explicitly make the choice to omit commas.

This can only be viewed as a deficiency by users with unreasonable expectations of the tool and the language. IMO the fact that a formatter can’t automatically decide when to elide commas is not an argument against allowing that. The moment a formatter and not the language starts telling me what constitutes allowable syntax is when the formatter has gone too far. I want my code to have the best style possible.

A formatter should focus exclusively on helping me get there as much as it can while always embracing “first, do no harm”. That principle would be violated if we decide to reject syntax simply because a formatter can’t make an automatic choice about when to use it.

Paul_Cantrell · April 11, 2019, 6:34pm

Indeed, you're quite correct! That is a bummer, and certainly puts a crimp in the proposal.

anandabits · April 11, 2019, 6:40pm

I want to point out that the package manager DSL was designed without this proposal in mind. It may be possible to design it differently if supporting comma elision were a goal. If Enhanced Variadic Parameters moves forward, labeled variadic parameters may also be useful.

For example, it might be possible to re-skin it so:

package = Package(
    name: "Paper"
    products: [
        .executable(name: "tool", targets: ["tool"])
        .library(name: "Paper", targets: ["Paper"])

becomes:

package = Package(
    name: "Paper"
    products: [
        tool: .executable(targets: ["tool"])
        Paper: .library(targets: ["Paper"])

allevato · April 11, 2019, 6:54pm

There are formatters that exist today for various languages (I'll use Prettier and TypeScript as an example) that add or remove the trailing comma in a list depending on whether the list is laid out horizontally or vertically. The hypothetical preference that I mentioned seems like it would be a reasonable extension of that.
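
As a minimal sketch of that behavior (not Prettier's actual algorithm, and `layOut(_:maxWidth:)` is a hypothetical name), a formatter can pick the layout from the available width and let the trailing comma follow from the layout:

```swift
/// Lays out an array literal on one line if it fits, otherwise one element
/// per line. The trailing comma is present only in the vertical layout.
func layOut(_ elements: [String], maxWidth: Int) -> String {
    let oneLine = "[" + elements.joined(separator: ", ") + "]"
    if oneLine.count <= maxWidth {
        return oneLine  // horizontal: no trailing comma
    }
    let body = elements.map { "  " + $0 + "," }.joined(separator: "\n")
    return "[\n" + body + "\n]"  // vertical: every element ends with a comma
}

print(layOut(["someLongVariableName", "someOtherLongVariableName"], maxWidth: 80))
print(layOut(["someLongVariableName", "someOtherLongVariableName"], maxWidth: 40))
```

The decision here needs nothing but the text width, which is what makes the trailing-comma case easy for a tool to automate.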

I'm having difficulty squaring your position here with the one that you expressed in the thread about the style guide/formatter (please correct me if I'm misinterpreting your intent):

Isn't comma elision precisely the kind of idiom that you're suggesting that some programmers would want to adopt for DSL-like use cases? If so, and if hypothetically a highly configurable formatter were to exist, then it's entirely reasonable that a user adopting that idiom would want that formatter to automatically handle it for them and not be told that they have to remove the commas themselves (i.e., "fighting the formatter").

If the language were to support comma elision, then transforming code to use it in cases where possible if the user wishes it doesn't strike me as harmful—rather the opposite. Why is it the formatter's fault, rather than the proposed syntax's fault, if the tool is unable to get the user farther than it can due to inherent inconsistencies in the syntax?

"[...] decide to reject syntax simply because a formatter can't make an automatic choice" isn't the argument I'm making—I mentioned it to illustrate one of many implementation difficulties that arise from the inherent inconsistencies being proposed, and a proposal that suggests such a fundamental change in syntax needs to consider these things. But it's not even formatters that have this problem—clearly human authors are also having difficulty reasoning about the proposed syntax, based on the responses in this thread trying to determine which situations work and which wouldn't.

The fact that SwiftPM—the very kind of DSL that would hope to benefit from this—would need to be completely redesigned to do so just because it uses collections of enum cases is a hard pill to swallow. Using enums in this manner is a fundamental pattern in Swift, and if the syntax can't support it, then that's a very significant limitation.

breathe · April 11, 2019, 7:14pm

anandabits:

A formatter should focus exclusively on helping me get there as much as it can while always embracing “first, do no harm”. That principle would be violated if we decide to reject syntax simply because a formatter can’t make an automatic choice about when to use it.

I disagree thoroughly with this philosophy. A formatter should free the author from having to think about the formatting of code — and give the reader the most predictable looking and easiest to read output.

I think a really nice way to honor the work and desire of this proposal would be to incorporate comma-elision logic into a formatting tool capable of rewriting code with elided commas into code that has commas.

All the benefits are there — ease of copy/paste, easier to write as you don’t have to think about commas when writing the code, and unambiguous to parse/read. This is the approach the JavaScript formatter prettier takes with semicolons (it adds them everywhere but you don’t have to write them yourself) and it works great.

With such an approach it would also be possible to support rewriting comma-elided expressions that begin with . by way of a whitespace convention, in a source-compatible way. The rule would be: treat indented lines as multiline expressions, and otherwise treat lines as comma-elided parameter lists. The formatter could stick a comment in the document the first time it’s run and automatically rewrite multiline expressions into the whitespace-sensitive form on that first run to preserve semantics. On subsequent runs it could use the whitespace rule to unambiguously rewrite comma-elided parameter lists with an initial . character into comma-delimited form...

anandabits · April 11, 2019, 7:28pm

Are these formatters targeting languages without potential for the kind of issues and ambiguities that have been discussed abundantly in the threads about this topic? My logic wouldn't apply in those languages, but in Swift this is something that needs to be an explicit choice by a programmer. It isn't something that can be safely applied by a formatter. Even with a bunch of logic it would be difficult for a formatter to identify a consistent set of contexts in which this style is a good choice. IMO, that is ok and is not a reason to reject the feature.

My argument in the other thread is that the user shouldn't have to fight the formatter. A formatter that chooses to line wrap without choosing to elide commas does not cause the user to fight with it. What I was referring to is a scenario where the formatter exhibits undesirable behavior that removes or ignores an explicit choice the user has made. Restoring commas to a multiline list where they had been omitted is an example of a formatter causing the user to fight with it.

I don't think it's reasonable to expect a tool to be able to automatically apply every style that can be applied and used well by humans. I think you would agree with that, wouldn't you? IMO, this just falls into the category of a style that can be used and applied well by humans given our contextual knowledge, but cannot be universally applied by a tool. I just don't see it as a problem.

Human authors can also have a lot of trouble reasoning about complex expressions involving operators that omit parentheses. Should we require parentheses everywhere? I don't think anyone would support that. Everyone knows that it's important to know when to use parentheses. I don't see why commas should be treated any differently. The biggest difference that I can see is that comma elision is unfamiliar to most of us and that is stirring up a lot of FUD. I believe we will learn how to use this feature effectively just as we have learned to use (and omit) parentheses effectively.

The one concern that has been raised that I think is significant is the claim that this proposal would inhibit future evolution of the language. That still seems to be an unresolved dispute between @Douglas_Gregor and @Chris_Lattner3. If it would indeed inhibit future evolution that impact should be weighed very carefully. Otherwise, I think the concerns have been overstated and this is something people would get used to, just as they have with semicolon and parentheses elision.

If there were a viable alternative I would agree with you. But there isn't. The choice is comma elision or no comma elision. We don't usually criticize proposals for existing libraries with existing designs not being able to take full advantage of the proposal immediately. I don't understand why it is fair to criticize this proposal on that basis. Again, if there were a viable alternative this would be a fair criticism, but there isn't.

anandabits · April 11, 2019, 7:31pm

IMO these goals are at odds with each other. The principle I adopt is that code is read far more often than it is written. Formatting choices should therefore be made for the benefit of the reader. A formatter is a tool that can help ensure consistency, especially in the areas of microformatting decisions. But it is no substitute for care by the author. There are a lot of formatting choices that can benefit a reader and cannot be automated. An important role of the author is to make these choices. The role of the formatter in this context is to allow the author to make these choices, while helping them as much as possible with consistency in smaller details.

anandabits · April 11, 2019, 7:38pm

I'm sure I could find a lot of Lisp programmers who would say the same thing about parentheses. People can get used to all kinds of syntax. Just because we're all used to requiring commas right now does not mean that they make the language better.

allevato · April 11, 2019, 7:41pm

But that's precisely the point I'm making and why I come down on the side of "no comma elision"—it doesn't and can't fully and consistently solve the problem that it wishes to.

For what it's worth, I could be convinced of "yes, comma elision" if it could be applied successfully to all scenarios where a comma-delimited list was laid out vertically and have the commas removed. While I still don't think the visual weight of commas is as severe as the proposal claims, I would be more supportive (and might even adopt it!) if it could be done universally. But that's not possible, because something as simple as this could have two different possible interpretations:

let x: [SomeEnum] = [
  .y
  .z
]
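
The two readings can be seen in Swift as it exists today. This is a sketch with a hypothetical type, `SomeValue`, given an instance member that shares a name with a static one; it assumes Swift 5.4 or later, where implicit member expressions can be chained.

```swift
// Hypothetical type for illustration; not from the proposal.
struct SomeValue: Equatable {
    let name: String
    static let y = SomeValue(name: "y")
    static let z = SomeValue(name: "z")
    // An instance member that happens to share its name with a static one.
    var z: SomeValue { SomeValue(name: name + ".z") }
}

// Today a leading dot on the next line continues the previous expression,
// so this literal holds a single element, `.y.z`.
let chained: [SomeValue] = [
  .y
  .z
]

// With the comma, `.z` is the static member and the literal has two elements.
let separate: [SomeValue] = [
  .y,
  .z
]

print(chained.map(\.name))   // ["y.z"]
print(separate.map(\.name))  // ["y", "z"]
```

Under comma elision the first literal would have to keep meaning `.y.z` for source compatibility, so the second reading is only available with an explicit comma.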

Based on that, I can appreciate that your opinion differs on this and that the language should support it where it can, but I simply don't think the wins are significant enough to introduce more inconsistency in the language to make it possible only in some cases.

breathe · April 11, 2019, 7:42pm

anandabits:

The principle I adopt is that code is read far more often than it is written. Formatting choices should therefore be made for the benefit of the reader. A formatter is a tool that can help ensure consistency, especially in the areas of microformatting decisions. But it is no substitute for care by the author.

This feels like words from someone who has not spent much time using a (really exceptionally good) automatic code formatter (edit: apologies, I didn't mean that to be disparaging; I meant that comment based on my experience with code formatters of lower quality vs., say, JavaScript's Prettier). JavaScript’s Prettier in my experience never transforms hand-crafted, artisanally formatted code into a less readable version. I would also argue that time spent reading code is often most fruitfully accompanied by editing that code. The speed with which a language learner can get feedback about unfamiliar language constructs by interacting with code in the presence of a good formatting tool shouldn’t be ignored: you can come across some syntax you aren’t 100% familiar with, add some lines or characters, rerun the formatter, and observe how the formatter thinks the code should look, learning something about the intended semantics of a construct without having to read documentation.

Embracing opinionated automatic formatting as a language feature would be to me an acknowledgment that editor-level tooling can and should help reduce the grammarian burdens of the code authoring experience. In this case, I think baking heuristics into the formatter to be able to understand comma-elided parameter lists so that it can rewrite them with commas would deliver all the benefits of this proposal while also preserving precise unambiguous readability ...

anandabits · April 11, 2019, 7:46pm

I'll repeat the line of reasoning based on parentheses elision. We don't demand that it be possible to omit parentheses in 100% of cases in order to achieve the desired behavior. Why should we demand this of commas? It might be reasonable to argue that parentheses elision is much more important to the ergonomics of the language than comma elision, but that is a subjective determination. My point is that these are not categorically different and (obviously) reasonable people are making different determinations.