SE-0257: Eliding commas from multiline expression lists

Does this actually compile? Shouldn't it treat .library(…) as a method call against the previous line?

To be fair, I don't think that represents the spirit of the proposal as it is written. I think comma elision would become a default convention and that the very introduction of the feature would imply that it should be such.

I can imagine a user having the opinion that "I prefer comma elision wherever possible for multi-line lists". In that case, if a line needs to be wrapped, what the user typed on a single line may not match the "choice" of the author:

// Before wrapping
let someArrayWithALongName1: [SomeType] = [someLongVariableName, someOtherLongVariableName]
let someArrayWithALongName2: [SomeEnum] = [.someLongEnumName, .someOtherLongEnumName]

// After wrapping
let someArrayWithALongName1: [SomeType] = [
  someLongVariableName,
  someOtherLongVariableName
]
let someArrayWithALongName2: [SomeEnum] = [
  .someLongEnumName,
  .someOtherLongEnumName
]

If the formatter wants to honor the user's preference, it must decide whether it's possible to remove those commas, and a human can see that the answer differs between the two arrays: the elements of the second array begin with a leading dot, so without the comma the second line could be parsed as a member access on the first. Asking users to remove the commas manually would be viewed as a deficiency of the tool, which is why I point out how difficult it is for the tool to make that decision, given the limitations and exceptions of the syntax being proposed.
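
To make that decision concrete, here is a minimal sketch of the kind of check a formatter might need. The function name `canElideCommas(betweenElements:)` is hypothetical, and the rule it applies (only a leading `.` blocks elision) is a simplifying assumption; a real tool would need a parser and would have to consider other continuation tokens as well.

```swift
/// Hypothetical helper: reports whether the commas between vertically laid-out
/// elements could be dropped under the simplified "no leading dot" rule.
func canElideCommas(betweenElements elements: [String]) -> Bool {
    // The first element can never continue a previous element, so skip it.
    for element in elements.dropFirst() {
        // Find the first character that is not indentation.
        let firstVisible = element.first(where: { $0 != " " && $0 != "\t" })
        // A leading dot would be read as a member access on the previous line.
        if firstVisible == "." { return false }
    }
    return true
}

let plainNames = ["someLongVariableName", "someOtherLongVariableName"]
let enumCases = [".someLongEnumName", ".someOtherLongEnumName"]

let plainNamesAreSafe = canElideCommas(betweenElements: plainNames)
let enumCasesAreSafe = canElideCommas(betweenElements: enumCases)
```

Under this rule the first array is reported as safe and the second is not, which matches what a human reader sees in the wrapped example.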

Personally, I don’t think this would be an appropriate option for a formatter to offer. If it needs to wrap a non-wrapped line it should leave the commas in place. If the user wants a multi-line list without commas the user should explicitly make the choice to omit commas.

This can only be viewed as a deficiency by users with unreasonable expectations of the tool and the language. IMO the fact that a formatter can’t automatically decide when to elide commas is not an argument against allowing that. The moment a formatter and not the language starts telling me what constitutes allowable syntax is when the formatter has gone too far. I want my code to have the best style possible.

A formatter should focus exclusively on helping me get there as much as it can while always embracing “first, do no harm”. That principle would be violated if we decide to reject syntax simply because a formatter can’t make an automatic choice about when to use it.

Indeed, you're quite correct! That is a bummer, and certainly puts a crimp in the proposal.

I want to point out that the package manager DSL was designed without this proposal in mind. It may be possible to design it differently if supporting comma elision were a goal. If Enhanced Variadic Parameters moves forward, labeled variadic parameters may also be useful.

For example, it might be possible to re-skin it so:

package = Package(
    name: "Paper"
    products: [
        .executable(name: "tool", targets: ["tool"])
        .library(name: "Paper", targets: ["Paper"])

becomes:

package = Package(
    name: "Paper"
    products: [
        tool: .executable(targets: ["tool"])
        Paper: .library(targets: ["Paper"])

There are formatters that exist today for various languages (I'll use prettier and TypeScript as an example) that add or remove the trailing comma in a list depending on whether the list is laid out horizontally or vertically. The hypothetical preference that I mentioned seems like it would be a reasonable extension of that.

I'm having difficulty squaring your position here with the one that you expressed in the thread about the style guide/formatter (please correct me if I'm misinterpreting your intent):

Isn't comma elision precisely the kind of idiom that you're suggesting that some programmers would want to adopt for DSL-like use cases? If so, and if hypothetically a highly configurable formatter were to exist, then it's entirely reasonable that a user adopting that idiom would want that formatter to automatically handle it for them and not be told that they have to remove the commas themselves (i.e., "fighting the formatter").

If the language were to support comma elision, then transforming code to use it in cases where possible if the user wishes it doesn't strike me as harmful—rather the opposite. Why is it the formatter's fault, rather than the proposed syntax's fault, if the tool is unable to get the user farther than it can due to inherent inconsistencies in the syntax?

"[...] decide to reject syntax simply because a formatter can't make an automatic choice" isn't the argument I'm making—I mentioned it to illustrate one of many implementation difficulties that arise from the inherent inconsistencies being proposed, and a proposal that suggests such a fundamental change in syntax needs to consider these things. But it's not even formatters that have this problem—clearly human authors are also having difficulty reasoning about the proposed syntax, based on the responses in this thread trying to determine which situations work and which wouldn't.

The fact that SwiftPM—the very kind of DSL that would hope to benefit from this—would need to be completely redesigned to do so just because it uses collections of enum cases is a hard pill to swallow. Using enums in this manner is a fundamental pattern in Swift, and if the syntax can't support it, then that's a very significant limitation.

A formatter should focus exclusively on helping me get there as much as it can while always embracing “first, do no harm”. That principle would be violated if we decide to reject syntax simply because a formatter can’t make an automatic choice about when to use it.

I disagree thoroughly with this philosophy. A formatter should free the author from having to think about the formatting of code — and give the reader the most predictable looking and easiest to read output.

I think a really nice way to honor the work and desire of this proposal would be to incorporate comma-elision logic into a formatting tool capable of rewriting code with elided commas into code that has commas.

All the benefits are there — ease of copy/paste, easier to write as you don’t have to think about commas when writing the code, and unambiguous to parse/read. This is the approach the JavaScript formatter prettier takes with semicolons (it adds them everywhere but you don’t have to write them yourself) and it works great.
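
As a rough illustration of that idea, here is a sketch of a pass that restores commas to a list written without them. The function `restoreCommas(inElementLines:)` is hypothetical and works on raw lines rather than a syntax tree; it assumes that a line starting with `.` continues the previous element, so it cannot tell a continuation apart from an elided enum case.

```swift
/// Hypothetical formatter pass: appends a comma to every element line that
/// needs one, leaving continuation lines and the final element alone.
func restoreCommas(inElementLines lines: [String]) -> [String] {
    // Returns the first character of a line that is not indentation.
    func firstNonBlank(_ line: String) -> Character? {
        line.first(where: { $0 != " " && $0 != "\t" })
    }

    var result: [String] = []
    for (index, line) in lines.enumerated() {
        let isLast = index == lines.count - 1
        // Assumption: a following line with a leading dot continues this element.
        let nextContinues = !isLast && firstNonBlank(lines[index + 1]) == "."
        let alreadyHasComma = line.last == ","
        if isLast || nextContinues || alreadyHasComma {
            result.append(line)
        } else {
            result.append(line + ",")
        }
    }
    return result
}

let written = [
    "  makeTool()",
    "      .named(\"tool\")",
    "  makeLibrary()",
]
let formatted = restoreCommas(inElementLines: written)
```

Here the comma lands after the `.named("tool")` line rather than after `makeTool()`, because the leading dot is taken as part of the first element.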

It would also be possible with such an approach to support rewriting comma-elided expressions beginning with . via a whitespace convention, in a source-compatible way. A rule that treats indented lines as multi-line expressions, and otherwise treats them as comma-elided parameter lists, could be supported by the formatter: the formatter could stick a comment in the document the first time it’s run and automatically rewrite multi-line expressions into the whitespace-sensitive form on that first run to preserve semantics. On subsequent runs it could use the whitespace rule to unambiguously rewrite comma-elided parameter lists with an initial . character into comma-delimited form...

Are these formatters targeting languages without potential for the kind of issues and ambiguities that have been discussed abundantly in the threads about this topic? My logic wouldn't apply in those languages, but in Swift this is something that needs to be an explicit choice by a programmer. It isn't something that can be safely applied by a formatter. Even with a bunch of logic it would be difficult for a formatter to identify a consistent set of contexts in which this style is a good choice. IMO, that is ok and is not a reason to reject the feature.

My argument in the other thread is that the user shouldn't have to fight the formatter. A formatter that chooses to line wrap without choosing to elide commas does not cause the user to fight with it. What I was referring to is a scenario where the formatter exhibits undesirable behavior that removes or ignores an explicit choice the user has made. Restoring commas to a multiline list where they had been omitted is an example of a formatter causing the user to fight with it.

I don't think it's reasonable to expect a tool to be able to automatically apply every style that can be applied and used well by humans. I think you would agree with that, wouldn't you? IMO, this just falls into the category of a style that can be used and applied well by humans given our contextual knowledge, but cannot be universally applied by a tool. I just don't see it as a problem.

Human authors can also have a lot of trouble reasoning about complex expressions involving operators that omit parentheses. Should we require parentheses everywhere? I don't think anyone would support that. Everyone knows that it's important to know when to use parentheses. I don't see why commas should be treated any differently. The biggest difference that I can see is that comma elision is unfamiliar to most of us and that is stirring up a lot of FUD. I believe we will learn how to use this feature effectively just as we have learned to use (and omit) parentheses effectively.

The one concern that has been raised that I think is significant is the claim that this proposal would inhibit future evolution of the language. That still seems to be an unresolved dispute between @Douglas_Gregor and @Chris_Lattner3. If it would indeed inhibit future evolution that impact should be weighed very carefully. Otherwise, I think the concerns have been overstated and this is something people would get used to, just as they have with semicolon and parentheses elision.

If there were a viable alternative I would agree with you. But there isn't. The choice is comma elision or no comma elision. We don't usually criticize proposals because existing libraries with existing designs can't take full advantage of them immediately. I don't understand why it is fair to criticize this proposal on that basis. Again, if there were a viable alternative this would be a fair criticism, but there isn't.

IMO these goals are at odds with each other. The principle I adopt is that code is read far more often than it is written. Formatting choices should therefore be made for the benefit of the reader. A formatter is a tool that can help ensure consistency, especially in the areas of microformatting decisions. But it is no substitute for care by the author. There are a lot of formatting choices that can benefit a reader and cannot be automated. An important role of the author is to make these choices. The role of the formatter in this context is to allow the author to make these choices, while helping them as much as possible with consistency in smaller details.

I'm sure I could find a lot of Lisp programmers who would say the same thing about parentheses. People can get used to all kinds of syntax. Just because we're all used to requiring commas right now does not mean that they make the language better.

But that's precisely the point I'm making and why I come down on the side of "no comma elision"—it doesn't and can't fully and consistently solve the problem that it wishes to.

For what it's worth, I could be convinced of "yes, comma elision" if it could be applied successfully to all scenarios where a comma-delimited list was laid out vertically and have the commas removed. While I still don't think the visual weight of commas is as severe as the proposal claims, I would be more supportive (and might even adopt it!) if it could be done universally. But that's not possible, because something as simple as this could have two different possible interpretations:

let x: [SomeEnum] = [
  .y
  .z
]

Based on that, I can appreciate that your opinion differs on this and that the language should support it where it can, but I simply don't think the wins are significant enough to introduce more inconsistency in the language to make it possible only in some cases.
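
For what it's worth, the two readings can be shown with code that compiles today. This is a sketch assuming Swift 5.4 or later (where implicit member chains are accepted); the `Token` type and its members are made up for illustration, with an instance property `z` deliberately sharing its name with a static member.

```swift
struct Token: Equatable {
    let name: String

    static let y = Token(name: "y")
    static let z = Token(name: "z")

    // Instance member with the same name as the static member above.
    var z: Token { Token(name: name + ".z") }
}

// With commas: two elements, the static members y and z.
let withCommas: [Token] = [
    .y,
    .z
]

// Without commas: the leading dot continues the previous line,
// so this is the single element `Token.y.z`.
let withoutCommas: [Token] = [
    .y
    .z
]
```

The same two lines produce a two-element array in one case and a one-element array in the other, and nothing in the text other than the comma tells the reader which one was meant.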

The principle I adopt is that code is read far more often than it is written. Formatting choices should therefore be made for the benefit of the reader. A formatter is a tool that can help ensure consistency, especially in the areas of microformatting decisions. But it is no substitute for care by the author.

This feels like words from someone who has not spent much time using a (really exceptionally good) automatic code formatter. (Edit: apologies, I didn't mean that to be disparaging; I meant that comment based on my experience with code formatters of lower quality versus, say, JavaScript's prettier.) JavaScript’s prettier in my experience never transforms hand-crafted, artisanally formatted code into a less readable version. I would also argue that time spent reading code is often most fruitfully accompanied by editing that code. The speed with which a language learner can get feedback about unfamiliar language constructs by interacting with code in the presence of a good formatting tool shouldn’t be ignored: you can come across some syntax you aren’t 100% familiar with, add some lines or characters, rerun the formatter, and observe how the formatter thinks the code should look, learning something about the intended semantics of a construct without having to read documentation.

Embracing opinionated automatic formatting as a language feature would be, to me, an acknowledgment that editor-level tooling can and should help reduce the grammarian burdens of the code authoring experience. In this case, I think baking heuristics into the formatter so that it can understand comma-elided parameter lists and rewrite them with commas would deliver all the benefits of this proposal while also preserving precise, unambiguous readability ...

I'll repeat the line of reasoning based on parentheses elision. We don't demand that it be possible to omit parentheses in 100% of cases in order to achieve the desired behavior. Why should we demand this of commas? It might be reasonable to argue that parentheses elision is much more important to the ergonomics of the language than comma elision, but that is a subjective determination. My point is that these are not categorically different and (obviously) reasonable people are making different determinations.

Elision in one place doesn’t mean we should allow elision everywhere, because not all kinds of elision are the same. This may be a subjective statement, but IMO the rules around parenthesis elision (simple vs complete signatures) are far more consistent and understandable than the proposed ones around comma elision, which are based on whether the thing on the next line could potentially be parsed as part of the same expression and which precludes very obvious use cases like arrays of enum cases or other implicit static member references.

• What is your evaluation of the proposal?

-1

I understand the analogy between optional semicolons and optional commas, but it seems to me that semicolons are optional because statements are normally written on separate lines. It's convenient to use semicolons occasionally, when grouping short statements on the same line (e.g. in a one-line function definition), but that's an occasional usage.
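
A small sketch of what I mean, using made-up names (`total`, `bump`):

```swift
// One statement per line: no semicolons needed.
var total = 0
total += 1
total += 2

// Several short statements grouped on one line: semicolons separate them.
func bump(_ value: inout Int) { value += 1; value *= 2 }

bump(&total)
```

The semicolon only shows up in the one-line function body, which is the occasional case.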

With commas, the situation is reversed. Multiple expressions are normally written on the same line, and commas are needed as separators. Having a rule that says they can be omitted if the expressions are written on separate lines creates code that I find harder to read. I have to mentally add commas where they're not written.

I don’t think it’s helped by the fact that I also have to learn a set of exceptions where I shouldn’t (mentally) add commas.

I'd much rather have trailing commas!

• How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Quick reading of the proposal. I’ve also been following the discussion.

I've moved from reluctantly +1 to reluctantly -1, because of the implications of the following:

  • I have not seen or managed to come up with a single example where a semicolon is necessary at the end of a line. (good)

  • I have seen and can come up with lots of examples where a comma will be necessary at the end of a line. (not so good)

Or put differently:

  • A semicolon at the end of a line is never needed.

  • A comma at the end of a line will sometimes be needed.


The above makes comma elision (as proposed) much more complex and confusing than semicolon elision (as can be seen from much of the discussion here).

However, note that comma elision would be exactly as straightforward as semicolon elision if we restricted it so that it was "only" allowed for:

  • dictionary literals
  • fully-labeled argument lists
  • parameter lists (unlabeled params still have eg _ arg: so they're ok)
  • fully-labeled tuples

That is, comma elision would only be allowed in contexts where commas are truly redundant (the labels/keys and colons make them so), and (afaics) this way all commas could be elided without ambiguities, not only the ones at the end of a line.
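
To show why the labels make the commas redundant, here is a rough sketch that recovers the items of a fully-labeled list from a flat token stream with no commas. The function `splitLabeledItems` and the pre-split token array are hypothetical, and the sketch ignores nesting and other uses of the colon (such as the ternary operator).

```swift
/// Hypothetical splitter: starts a new item at every token that is
/// immediately followed by ":" (that is, at every label or key).
func splitLabeledItems(_ tokens: [String]) -> [[String]] {
    var items: [[String]] = []
    for (index, token) in tokens.enumerated() {
        let startsItem = index + 1 < tokens.count && tokens[index + 1] == ":"
        if startsItem || items.isEmpty {
            items.append([token])
        } else {
            items[items.count - 1].append(token)
        }
    }
    return items
}

// Tokens for: name: "Paper" targets: ["Paper"]
let labeledTokens = ["name", ":", "\"Paper\"", "targets", ":", "[\"Paper\"]"]
let labeledItems = splitLabeledItems(labeledTokens)
```

The item boundaries fall out of the `label:` positions alone, whether or not the items sit on separate lines.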

Since the label/key and colon is effectively, well not strictly a separator, but rather a "bullet point" or introduction to each item, perhaps array literals and other un- or partially labeled constructs could be allowed provided they too used an item-introducing colon even for the unlabeled items:

let a: [SomeEnum] = [
  _: .foo
         .bar
  _: .bar
  _: .foo.bar.foo
         .bar.foo.bar
  _: foo _: foo.bar _: foo
]

print(
    _: "Hello"
    _: "world"
    separator: ", " // ;-)
    terminator: "!\n"
)

func dividingClosed(
    _ other: Polycurve
    straightIntersections: Bool = false
    threshold: Float = 1/64
) -> (
    inside: [Polycurve]
    outside: [Polycurve]
    intersections: [Float]
)
{
    ...
}

let package = Package(
    name: "Paper"
    products: [
        _: .executable(name: "tool" targets: ["tool"])
        _: .library(name: "Paper" targets: ["Paper"])
    ...

(No, the _: is not pretty, I agree, it's just a first attempt to come up with something that makes my envisioned more consistent comma elision work also for unlabeled constructs.)

Thanks for calling this out here. The proposed change does only cover the elision of commas from expression lists, not all comma separated lists.

Speaking personally, that sounds like a reasonable modification for the core team to consider.

I agree with this as well. It would be my preference to see all comma-separated lists treated consistently throughout the language.

The problem is that not all comma-separated lists throughout the language can be treated consistently.

As mentioned above, the only ones that can be treated unambiguously are those whose items are redundantly comma-separated, i.e. the ones in which each item is preceded by a (label/key and) colon.