Pitch: Multi-statement if/switch/do expressions

Ben_Cohen · November 14, 2023, 1:07am

A follow-on to SE-0380. Keen to hear everyone's thoughts on the alternatives outlined at the end. Available as a PR here.

Multi-statement expressions using `then`

Authors: Ben Cohen, Hamish Knight
Review Manager: TBD
Status: Awaiting Implementation
Implementation: available on main via -enable-experimental-feature ThenStatements and -enable-experimental-feature DoExpressions

Introduction

This proposal introduces a then keyword, for the purpose of determining the value of an if or switch expression that contains multiple statements in a single branch. It also introduces do expressions.

Motivation

SE-0380 introduced the ability to use if and switch statements as expressions. As that proposal lays out, this allows much-improved syntax, for example when initializing variables:

let width = switch scalar.value {
    case 0..<0x80: 1
    case 0x80..<0x0800: 2
    case 0x0800..<0x1_0000: 3
    default: 4
}

where otherwise techniques such as an immediately-executed closure, or explicitly-typed definite initialization, would be needed.

However, the proposal left as a future direction the ability to have a branch of the switch contain multiple statements:

let width = switch scalar.value {
    case 0..<0x80: 1
    case 0x80..<0x0800: 2
    case 0x0800..<0x1_0000: 3
    default: 
      log("this is unexpected, investigate this")
      4  // error: Non-expression branch of 'switch' expression may only end with a 'throw'
}

When such branches are necessary, currently users must fall back to the old techniques.
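For illustration, the two "old techniques" for the logging case above might look like this in current Swift. This is a sketch rather than part of the pitch: `log` is a hypothetical stand-in that just prints, and the wrapper function names are invented for the example.

```swift
// Hypothetical stand-in for the `log` call used in the examples above.
func log(_ message: String) { print(message) }

// Technique 1: immediately-executed closure.
// Every branch now needs an explicit `return`.
func widthUsingClosure(_ value: UInt32) -> Int {
    let width: Int = {
        switch value {
        case 0..<0x80: return 1
        case 0x80..<0x0800: return 2
        case 0x0800..<0x1_0000: return 3
        default:
            log("this is unexpected, investigate this")
            return 4
        }
    }()
    return width
}

// Technique 2: explicitly-typed definite initialization.
// The variable is declared without a value and assigned in every branch.
func widthUsingDefiniteInit(_ value: UInt32) -> Int {
    let width: Int
    switch value {
    case 0..<0x80: width = 1
    case 0x80..<0x0800: width = 2
    case 0x0800..<0x1_0000: width = 3
    default:
        log("this is unexpected, investigate this")
        width = 4
    }
    return width
}
```

Both forms give up the concise single-expression branches as soon as one branch needs a second statement.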

This proposal introduces a new contextual keyword, then, which allows a switch to remain an expression:

let width = switch scalar.value {
    case 0..<0x80: 1
    case 0x80..<0x0800: 2
    case 0x0800..<0x1_0000: 3
    default: 
      log("this is unexpected, investigate this")
      then 4
}

then can similarly be used to allow multi-statement branches in if expressions.

The introduction of this keyword also makes stand-alone do expressions more viable. These have two use cases:

To produce a value from both the success and failure paths of a do/catch block:
```
let foo: String = do {
    try bar()
} catch {
    "Error \(error)"
}
```

The ability to initialize a variable when this cannot easily be done with a single expression:

let icon: IconImage = do {
    let image = NSImage(
                    systemSymbolName: "something", 
                    accessibilityDescription: nil)!
    let preferredColor = NSColor(named: "AccentColor")!
    then IconImage(
            image, 
            isSymbol: true, 
            isBackgroundSupressed: true, 
            preferredColor: preferredColor.cgColor)
}

While the above can be composed as a single expression, declaring separate variables and then using them is much clearer.

In other cases, this cannot be done because an API is structured to require you first create a value, then mutate part of it:

let motionManager: CMMotionManager = {
    let manager = CMMotionManager()
    manager.deviceMotionUpdateInterval = 0.05
    return manager
}()

This immediately-executed closure pattern is commonly seen in Swift code. So much so that in some cases, users assume that even single expressions must be surrounded in a closure. do expressions would provide a clearer idiom for grouping these.

Detailed Design

A new contextual keyword then will be introduced. if and switch expressions will no longer be limited to a single expression per branch. Instead, they can execute multiple statements, and then end with a then expression, which becomes the value of that branch of the expression.

Additionally, do statements will become expressions, with rules matching those of if and switch expressions from SE-0380:

They can be used to return values from functions, to assign values to variables, and to declare variables.
They will not be usable more generally as sub-expressions, arguments to functions, etc.
Both the do branch, and each catch branch if present, must either be a single expression, or yield a value using then.
Further if, switch, and do expressions may be nested inside the do or catch branches, and do expressions can be nested inside if and switch expressions.
The do and any catch branches must all produce the same type, when type checked independently (see SE-0380 for justification of this).
If a block either explicitly throws, or terminates the program (e.g. with fatalError), it does not need to produce a value and can have multiple statements before terminating.
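The last rule mirrors what SE-0380 already permits for if and switch expressions. As do expressions are not available in a released compiler, the sketch below shows the analogous behavior with a switch expression in current Swift. `UnexpectedScalar` and `width(of:)` are hypothetical names made up for this example.

```swift
// Hypothetical error type for the example.
struct UnexpectedScalar: Error {}

func width(of value: UInt32) throws -> Int {
    let result = switch value {
    case 0..<0x80: 1
    case 0x80..<0x0800: 2
    case 0x0800..<0x1_0000: 3
    case 0x1_0000..<0x11_0000: 4
    // This branch throws, so it does not need to produce a value.
    default: throw UnexpectedScalar()
    }
    return result
}
```

A caller sees either an `Int` or a thrown error, so `try? width(of: 0x11_0000)` evaluates to `nil`.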

Nested use of `then`

If needed, a then must be the last statement in a branch. Allowing it in other positions, with all paths checked as producing a value using Swift's definite initialization feature, would lead to complexities similar to those that caused control flow such as break, continue, and return to be ruled out during SE-0380.

A then keyword only applies to the innermost if, switch, or do - it cannot apply to an outer expression even if e.g. the inner if is not an expression. For example, the following code will not compile:

let x = if .random() {
  print("hello")
  if .random() {
    then 1 // this `then` is intended to apply to the outer `if`
  } else {
    then 2
  }
} else {
  3
}

and should be rewritten as:

let x = if .random() {
  print("hello")
  then if .random() {
    1
  } else {
    2
  }
} else {
  3
}

If the inner branches above also needed a then, this could still be used:

let x = if .random() {
  print("hello")
  then if .random() {
    print("world")
    then 1 // this then applies to the inner if expression
  } else {
    2  // then not needed here, though it would be allowed
  }
} else {
  3
}

A then cannot be nested inside the else of a guard even though this might be considered the "last statement":

let x = if .random() {
  guard .random() else {
    then 0
  }
  then 1
} else {
  0
}

as this implies that guard is also an expression (a future direction of SE-0380 that could still be explored further) and that you could replace the above guard with an if, which would not be valid.

Parsing Ambiguities with `then`

then will be introduced as a contextual keyword, with some heuristics to preserve source compatibility in all but rare cases. Similar rules were applied to await when it became a new contextual keyword.

To ensure existing uses of then as a variable name continue to work, a heuristic will be added to avoid parsing it as a keyword when followed by an infix or postfix operator:

// without heuristic, this would produce
// error: 'then' may only appear as the last statement in an 'if' or 'switch' expression
then = DispatchTime.now()

Prefix operators would be permitted, allowing then -1 to parse correctly. then - 1 would parse as an expression with then as a variable. This follows similar existing rules around whitespace and disambiguation of operators.

Similarly:

then( is a function call, then ( is a then statement.
then[ is a subscript, then [ is a then statement.
then{ & then { are always trailing closures. If you want a then statement you have to do then ({...})

This does mean that then /^ x/ would parse /^ as an infix operator. This is not a problem with the similar case of return /^ x/ because return is not a contextual keyword (you can't write e.g. func return or let return). then #/^ x/# would parse as a regular expression.

then.foo is a member access, then .foo is a then statement, as is:

then
  .member

If member access was still desired, back ticks could be used:

`then`
  .member

This is a potential (albeit unlikely) source break, but the back tick fix can be applied to the 5.9 compiler today to ensure existing code can compile with both the old and new compiler.

With these rules in place, the full source compatibility suite passes with this feature enabled.
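As a rough illustration of the identifier uses these heuristics are meant to preserve, the sketch below compiles with current Swift, where then is an ordinary identifier. The function name is hypothetical, and the comments describe how the pitch says each line would be treated, not verified compiler behavior with the feature enabled.

```swift
func identifierUses() -> (Int, String, String) {
    var then = 10
    then = 20                   // followed by infix `=`: an identifier
    let difference = then - 1   // `-` with spaces on both sides: infix, so an identifier
    let text = then.description // no space before `.`: member access
    // Backticks force the identifier reading even when the
    // member access starts on the next line.
    let escaped = `then`
        .description
    return (difference, text, escaped)
}
```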

Alternatives Considered

Many of the alternatives considered and future directions in SE-0380 remain applicable to this proposal.

The choice of the keyword then invites bikeshedding. Java uses yield – however this is already used for a different purpose in Swift.

Many languages (such as Ruby) use a convention that the last expression in a block is the value of the outer expression, without any keyword. For example:

let width = switch scalar.value {
    case 0..<0x80: 1
    case 0x80..<0x0800: 2
    case 0x0800..<0x1_0000: 3
    default: 
      log("this is unexpected, investigate this")
      4  // would now be allowed, with no `then` keyword.
}

This has the benefit of not requiring a whole new contextual keyword. However, it can be argued that a bare last expression, with no indicator marking the expression value explicitly in multi-statement branches, is subtle and can make code harder to read, as a user must examine branches closely to understand the exact location and type of the expression value. On the other hand, this concern is lessened by the requirement that the if expression be used to either assign or return a value, and not be found in arbitrary positions.

Note that if a bare last expression became the rule for if and do, it would raise the question of whether this should also be applied to closure returns, and perhaps even function returns, which would be a major and pervasive change to Swift (though opinions would likely be split on whether this was an improvement or a regression).
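For context, this is how closure and function returns behave in current Swift: a single expression is returned implicitly (for functions and getters since SE-0255), while a body with more than one statement needs an explicit return. The names below are illustrative only.

```swift
// Single expression: the value is returned implicitly.
func sum(_ x: Int, _ y: Int) -> Int { x + y }
let double: (Int) -> Int = { $0 * 2 }

// More than one statement: current Swift requires `return`.
func loggedSum(_ x: Int, _ y: Int) -> Int {
    print("adding \(x) and \(y)")
    return x + y
}
let loggedDouble: (Int) -> Int = { value in
    print("doubling \(value)")
    return value * 2
}
```

Extending a bare last expression rule to closures and functions would remove the `return` in the second pair as well.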

A variant of the bare last expression rule can be found in Rust, where semicolons are required, except for the last expression in an if or similar expression. This rule could also be applied to Swift:

let width = switch scalar.value {
    case 0..<0x80: 1
    case 0x80..<0x0800: 2
    case 0x0800..<0x1_0000: 3
    default: 
      log("this is unexpected, investigate this"); // load-bearing semicolon
      4  // allowed as the preceding statement ends with a semicolon
}

This option likely works better in Rust, where semicolons are otherwise required. In Swift, they are optional and needed only for uses such as placing multiple statements on one line, which makes this solution less appealing.

Source compatibility

As discussed in detailed design, there are rare edge cases where this new rule may break source, but none have been found in the compatibility test suite. Where they do occur, backticks can be applied, and this fix will back deploy to earlier compiler versions.

Effect on ABI stability

This proposal has no impact on ABI stability.

taylorswift · November 14, 2023, 1:28am

is a newline allowed between then and the opening [?

if  x
{
    ...

    then
    [
        aaa,
        bbb,
        ccc,
    ]
}

i’ve always found the extra indentation needed after return to be really obnoxious.

Ben_Cohen · November 14, 2023, 1:32am

yes

ellie20 · November 14, 2023, 1:41am

I'm excited for this feature. It's a better alternative to the { closure hacks }() that we have to do now.

I will say it feels unnatural that a then statement only applies to the innermost if, switch, or do, because it's dissimilar to return and break in typical patterns of Swift code, like returning in one branch of an if but not another, or making an early return within a guard statement.

How about making a then statement apply to the innermost if, switch, or do that isn't in statement position? Or in other words, the innermost if, switch, or do that's assigned to a variable or used within another expression. That should rule out the common control flow patterns, since it's currently impossible to use control flow within an expression; while still preventing "unused result" bugs involving statement-level if, switch, and do blocks. So for example:

let x = if .random() {
  print("hello")
  if .random() {
    then 1 // this `then` applies to the outer `if` because
           // the inner `if` isn't assigned to anything or
           // used within another expression
  } else {
    then 2
  }
} else {
  3
}

kiel · November 14, 2023, 2:08am

In many of these code examples, I would intuitively think (or encourage team members) to implement these as a function or method because you can name and better document the logic (improving a developer's ability to reason about the code) and you can make them more accessible to unit testing.

I get that single statement expressions are convenient for trivial cases and I could imagine using the single statement do expression. But I think readability too quickly degrades as the number of statements and branches increases (I had to read the "nested use" examples a number of times).

To combat this readability problem, perhaps this pitch could include a tool to refactor these expressions into one or more functions to discourage people from lazily adding more branches/sprinkling in more thens?

tera · November 14, 2023, 3:06am

I like the alternative of using a bare last expression. It extends the "single statement" rule naturally – in fact if it is chosen then there is no need for a "single statement" rule, just the "last statement" rule. IMHO it could be consistent and apply to closures as well.

Paul_Cantrell · November 14, 2023, 3:09am

I’m happy to see this proposal, and I’m a +1 on the idea.

The then keyword does indeed invite bikeshedding…but honestly it’s far better than anything I’m able to come up with, both for readability / fluency and for its impact on the language. I’m a definite -1 on trying to shoehorn yield into this role, don’t like my own result proposal from the original SE-0380 discussion, and haven’t heard any alternatives that come remotely close to working any better. (I spent a minute considering yielding, and my conclusion was “yuck.”)

While the “last expression” rule works fine in Ruby and Rust, and while I still hear Joe Groff’s voice in my head (I think it was Joe) saying that maybe type checking reduces the unexpected accidental return values it causes in Ruby, ultimately my gut says that it’s just not very Swift-y.

This proposal’s use of then does fly in the face of the meaning of then in existing languages with if…then…else syntax — then more typically marks the end of the condition and the beginning of the first branch, not the end of a branch — but somehow it kind of works. I’m good with it.

I like the do portion of the proposal very well. The closure tricks we use now are fine, usually, mostly, but have always sat awkwardly with me. The proposal is a clear improvement.

Looking over my original concerns with SE-0380, this proposal squarely addresses everything I wrote about my first concern in that post.

At some point, I’d love to see that second concern addressed as well. I still think my suggestion of additionally allowing condition subexpressions anywhere they’re directly enclosed by parens was a good suggestion, so that e.g. this works:

print(
  if thingy.isWacky {
    "Wacky thingy \(thingy.name)!"
  } else {
    "nothing to see here"
  }
)

…but that’s for another proposal.

sspringer · November 14, 2023, 3:41am

Suppose you'd like to replace a 1 with if condition { 2 } else { 1 }: your suggestion would then demand moving the then. I think as a general rule, do not destroy composability without a strong reason.

ellie20 · November 14, 2023, 3:46am

I don't see what you mean. This would still work:

let x = if .random() {
  print("hello")
  if .random() {
    then if condition { 2 } else { 1 }
  } else {
    then 2
  }
} else {
  3
}

or even this, if necessary:

let x = if .random() {
  print("hello")
  if .random() {
    then if condition {
        then 2
    } else {
        then 1
    }
  } else {
    then 2
  }
} else {
  3
}

It's just that putting the then statement within an if statement would also work, allowing the same control flow patterns as return and break.

sspringer · November 14, 2023, 3:50am

Ah OK, you are not demanding the shifting of then, it is just another possibility. But I still do not understand why you would need this.

ellie20 · November 14, 2023, 3:57am

To allow for certain control flow patterns that are equivalent to those with return and break. For example, the following code mentioned in the pitch would be allowed:

let x = if .random() {
  guard .random() else {
    then 0 // applies to the `if`, because the enclosing `guard`
          // is a statement, not an expression
  }
  then 1
} else {
  0
}

Returning early from a guard, if, or switch statement is a common pattern in Swift code. If they weren't allowed, people might get frustrated when they start to write more complex code within an if/switch/do expression, which would seemingly use a different "flavor" of control flow compared to functions, closures, and loops.

RussBaz · November 14, 2023, 4:29am

I personally prefer implicit return statements like @tera mentioned, but I am afraid that I am being biased due to my previous exposure to languages with this feature (such as F# in my case). On the other hand, I kind of agree with @Paul_Cantrell that it does not feel very swift-y at the moment (but what if the definition of what is swift-y changes when version 6.0 comes out?). While the feature is definitely needed, I am personally conflicted on which way I will like more right now.

stackotter · November 14, 2023, 7:04am

I agree, then somehow feels way too 'scripting language' to me, and adds to the ever-growing pile of Swift keywords... Assuming another keyword is chosen I'd still likely prefer bare last expressions (but I'm biased cause Rust development is what makes me money)

Having said that, I can see why bare last expressions wouldn't gel with some people. Perhaps we need some more complicated code examples using then vs no keyword to be able to come up with concrete reasons why no keyword would/wouldn't work. Happy to adapt some existing code examples from some of my codebases closer to the end of the week once I've got a bit more time.

I'm also yet to come up with any alternative keywords, yielding and yield are the only ones that come to mind, which would be incredibly confusing given their existing usage of course.

Nathan_Gray · November 14, 2023, 7:23am

I’m happy to see this being addressed! To me this is a no-brainer: bare last expression is the way to go. Try reading this proposal and imagining which parts you could delete if then wasn’t required. All the complexity simply melts away.

The main argument against it—that it would be hard to understand—is undercut by the fact that it’s table stakes in so many other programming languages. At my last job we used Scala and tended to hire Java programmers and then train them in Scala. “The last expression is the result” was just never a problem in learning the language or reading the code.

GreatApe · November 14, 2023, 7:24am

This new keyword seems to carry very little weight. If this kind of expression is important, wouldn’t the natural choice be to use return?

sliemeobn · November 14, 2023, 9:55am

I'd love to see this, but I am firmly in camp "bare last expression" here.

then seems to be the best keyword choice, but even it feels out of place and does not read well in context (at least to my brain). when looking at the parsing complexities/usage rules it would bring with it I am even more skeptical.

to the point of "bare last expressions" not being swifty:
I would have agreed a few years ago, but with return-less single-expression getters and functions being widely used, and the introduction of if/guard/switch expressions, I would say swift is already halfway there. (Even result builders work in this direction - explicit returns have been fading away).

To me it feels only natural to be able to turn this:

var computed: Something { x + y }

into

var computed: Something {
    print("hi mom")
    x + y
}

without a lot of extra song and dance. So I am all for bare expressions all the way ; )

Bonus thought: At least to me it feels that "modern" software development employs more and more functional patterns in everyday code. Swift is quite good at that, but the discrepancy between single-expression blocks and handful-of-lines blocks has always been a bit cumbersome. Bare last expressions would unify this experience.

tevelee · November 14, 2023, 10:58am

This statement is surprising to me. I haven't encountered any formal introduction of yield keyword in Swift. I believe it shouldn't be disregarded as a potential keyword, especially when it aligns well with the context.

As far as I remember, it was expected to be introduced through the modify accessor pitch, but it didn't proceed through the Swift evolution process. After researching other Swift evolution proposals, I found no mentions of yield. Additionally, it's not documented in The Swift Programming Language (TSPL) book.

arennow · November 14, 2023, 12:06pm

I suspect that the general sentiment weighs against me in this, but I oppose the proposal as a whole.

I think allowing multi-statement branches within a compound expression is an invitation to write very hard to read code. Specifically, the kind of code that makes sense as you're actively working on it, but later becomes a tarpit for your eyes and brain.

The most compelling use-case to me was this one:

let width = switch scalar.value {
    case 0..<0x80: 1
    case 0x80..<0x0800: 2
    case 0x0800..<0x1_0000: 3
    default: 
      log("this is unexpected, investigate this")
      then 4
}

We've all been there – it's a bit frustrating when you need to add a logging statement to a branch. But in this case, the option of turning just the default branch into an immediately executed closure sufficiently ameliorates the problem. (And also places the more complex syntax only in the more complex branch.) If you just need the log for debugging while working on the code (that is, if you're not planning on committing the logging), then that's fine. But if I were reviewing a PR that had a random function call with side-effects in the middle of an expression to assign a value to a variable, I'd ask the author to refactor. It's unexpected that the right-hand side of a statement that begins with let width = would have side-effects unrelated to determining the value.
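A sketch of that option in current Swift, with only the default branch wrapped in an immediately executed closure. `log` is a hypothetical stand-in that prints, and `width(of:)` is a name invented for the example:

```swift
// Hypothetical stand-in for the logging call.
func log(_ message: String) { print(message) }

func width(of value: UInt32) -> Int {
    let result: Int = switch value {
    case 0..<0x80: 1
    case 0x80..<0x0800: 2
    case 0x0800..<0x1_0000: 3
    // The closure call is a single expression, so the `switch`
    // remains usable as an expression.
    default: {
        log("this is unexpected, investigate this")
        return 4
    }()
    }
    return result
}
```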

The thing that this proposal brings most immediately to mind is C's comma operator, which is used for similar shenanigans. For those not familiar, in C, the comma operator allows the user to provide two expressions. It evaluates the first and discards its result, then evaluates the second, and that result becomes the value (and type) of the overall expression. For example, in int x = (puts("foo"), 5);, x has the value 5. Or sillier, more terrible things:

int x = 5;
int y = (x*=2, x+=1, x-3);
printf("x: %i\n", x); // x: 11
printf("y: %i\n", y); // y: 8

For those saying "any feature can be used to write hard-to-read code", that's certainly correct, but language design is at least partly about steering users into making better decisions, such as writing clearer code. That's why the try and await keywords exist – purely for human readers (and sometimes for overload disambiguation).

This forum is probably mostly frequented by very experienced Swift developers, which can blind us to the silly things less experienced devs would do with a new feature. If the benefits are high (such as async/await or the forthcoming typed throws), that's a good tradeoff, but in this case, I think the benefit is so minimal that the cost isn't worth it.

On the matter of then versus bare last expression, I think requiring the keyword makes it clearer that that particular if/switch/do statement is "weird" in the sense of also being an expression. When users inevitably write 300-line-long switch expressions, I think we'll all be thankful for the thens, even though they're ugly.

DevAndArtist · November 14, 2023, 12:39pm

Just a quick alternative option. One could turn then into use.

if ... {
  use ...
} else {
  use ...
}

It also seems to work better with the switch.

jeremyabannister · November 14, 2023, 1:50pm

In previous discussions on this topic I provided what is for me the most compelling argument against the "last expression" rule:

Basically it boils down to the fact that result builders already gave dangling expressions a different meaning (namely, that all of them get used, not just the last one), and since there is no visual indicator that a result builder is in effect in a given context, giving dangling expressions a fundamentally different meaning in non-result-builder contexts feels to me like it could be disastrous for readability.
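To make the concern concrete, here is a minimal sketch of a result builder in current Swift. `Lines` and `collect` are hypothetical names for this example. Every bare expression in the builder closure contributes to the result, and nothing at the call site marks the closure as a builder context.

```swift
// A hypothetical result builder that gathers every bare
// String expression in the closure body into an array.
@resultBuilder
struct Lines {
    static func buildBlock(_ parts: String...) -> [String] { parts }
}

func collect(@Lines _ body: () -> [String]) -> [String] { body() }

// Builder context: all three expressions are used.
let built = collect {
    "first"
    "second"
    "third"
}

// Ordinary closure: a single expression is returned implicitly.
let plain: () -> String = {
    "only"
}
```

Here `built` holds all three strings. Under a bare last expression rule, an ordinary closure with the same three lines would instead evaluate to only the last one.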