Every pattern I ever wrote is here within these walls
I'm sorry, language nerds, but I'm blocking out your calls
I've had my evolution pitch, I don't need something new
I'm afraid of what I'll implement if I follow you
Into the unknown
Into the unknown
Into the unknown
(With apologies to Kristen Anderson-Lopez and Robert Lopez.)
Motivation
Today, Swift supports @unknown default: switch cases. Here's how TSPL describes the attribute's usage:
Apply this attribute to a switch case to indicate that it isn’t expected to be matched by any case of the enumeration that’s known at the time the code is compiled.
This is helpful in making sure that all cases that are known at the time of writing are covered.
However, @unknown default: only works at the outermost level of a switch statement.
- This makes it more cumbersome to pattern-match on non-frozen enums if the value is nested; you need to create another switch statement to use @unknown default:.
- This hurts the ergonomics of potential language features such as enums with non-public cases. (This point was brought up by @stephencelis.)
Proposed Solution
We allow "unknown patterns" in arbitrary pattern positions inside a case item in switch statements.
// Module M with library evolution
public enum E { case a; case b }
// Module N
import M
func f(_ e: E) {
    switch (e, e) {
    // note: add missing case '(.a, .a)'
    // note: add missing case '(.b, .a)'
    // note: add missing case '(.a, .b)'
    // note: add missing case '(.b, .b)'
    case (@unknown _, @unknown _): ()
    }
}
We propose the spelling @unknown _, as that reminds the reader that it has the semantics of both @unknown default (warnings will be issued if known cases are not matched explicitly) and of wildcard patterns _ (it will match any value).
Detailed Design
(This section will be fleshed out in the full proposal, with changes to the grammar and so on.)
Pattern matching crash course
Swift's switch statement is, in the general case, "2-dimensional" (we can ignore where clauses here, as they are not relevant for checking pattern exhaustiveness; we also ignore default for now):
switch subject_expression {
case pat_11, ... , pat_1A: stmt1
case pat_21, ... , pat_2B: stmt2
...
case pat_f1, ... , pat_fZ: stmtf
}
Each individual pattern in this "pattern grid", such as pat_11 and pat_fZ, is a "case item". Each such pattern is actually a little AST of more fundamental patterns, such as literal patterns, tuple patterns, binding patterns and more. (See The Swift Programming Language for more details.)
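To make the "pattern grid" concrete, here is a small sketch in today's Swift. The function name classify and the strings it returns are made up for illustration; they are not part of the pitch.

```swift
// Hypothetical example: `classify` is an illustrative name, not part of the pitch.
func classify(_ point: (Int, Int)) -> String {
    switch point {
    // Row 1: one case item, a tuple pattern holding two literal (expression) patterns.
    case (0, 0):
        return "origin"
    // Row 2: two case items; each is a tuple pattern that mixes a literal
    // pattern with a wildcard pattern.
    case (_, 0), (0, _):
        return "on an axis"
    // Row 3: one case item, a tuple pattern holding two binding patterns.
    case (let x, let y):
        return "at (\(x), \(y))"
    }
}
```

Each row of the switch is one case, and each comma-separated entry in a row is one case item.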
The core idea is that the compiler thinks of these patterns as "spaces" (think: sets), and switch exhaustiveness checking reduces to checking whether the "total space" (dictated by the type of the subject expression) is covered by the union of the spaces formed by the individual patterns. The way this is done is that all spaces are subtracted from the total space: if there is something left over, the switch is not exhaustive, and the compiler suggests fix-its to cover the leftover space.
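The subtraction idea can be sketched with a toy model. This is only an illustration under simplifying assumptions: it enumerates every value of (E, E) as a set, which the real checker does not do, and the names Pair, ToyPattern and uncovered(by:) are invented here.

```swift
// Toy model only: spaces are enumerated as plain sets of values.
enum E: CaseIterable, Hashable { case a, b }

struct Pair: Hashable {
    let first: E
    let second: E
}

// A toy pattern over (E, E); `nil` in a position stands for the wildcard `_`.
struct ToyPattern {
    let first: E?
    let second: E?

    // The set of values this pattern matches.
    var space: Set<Pair> {
        var result = Set<Pair>()
        for f in E.allCases where first == nil || first == f {
            for s in E.allCases where second == nil || second == s {
                result.insert(Pair(first: f, second: s))
            }
        }
        return result
    }
}

// Subtract every pattern's space from the total space; whatever is left is uncovered.
func uncovered(by patterns: [ToyPattern]) -> Set<Pair> {
    var remaining = ToyPattern(first: nil, second: nil).space
    for pattern in patterns {
        remaining.subtract(pattern.space)
    }
    return remaining
}
```

For instance, the patterns (.b, _) and (_, .a) leave exactly (.a, .b) uncovered, which is the case a fix-it would suggest adding.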
Checking unknown patterns
The change with this pitch is that while iterating through case items for checking switch exhaustiveness, the compiler also does "unknown checking": if it sees a case item P that has @unknown _ nested somewhere inside its little pattern AST, it computes the pattern space for P, and from that, it subtracts the pattern spaces for all the case items that have been processed so far (excluding P, since P is being processed). If the difference is non-empty, it issues a warning and provides fix-its as before.
This approach is almost identical to what is done today for issuing diagnostics with @unknown default:. The chief difference is that today, the compiler does "unknown checking" once after it has processed all case items, whereas the addition of @unknown _ sub-patterns means that it needs to do "unknown checking" K times, where K is the number of case items which contain at least one @unknown _ nested somewhere inside.
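A toy version of this "unknown checking" loop might look like the sketch below. It is not the prototype's implementation: it enumerates values of (E, E) as sets, and the names Sub, CaseItem and unknownCheck are invented for illustration.

```swift
// Toy model only: not the prototype's implementation.
enum E: CaseIterable, Hashable { case a, b }

// One position of a toy tuple pattern over (E, E).
enum Sub {
    case known(E)   // e.g. `.a`
    case wildcard   // `_`
    case unknown    // `@unknown _`

    // Both `_` and `@unknown _` match any value.
    func matches(_ value: E) -> Bool {
        if case .known(let e) = self { return e == value }
        return true
    }
}

struct CaseItem {
    let first: Sub
    let second: Sub

    var containsUnknown: Bool {
        for sub in [first, second] {
            if case .unknown = sub { return true }
        }
        return false
    }

    // The set of (E, E) values this case item matches, each stored as [first, second].
    var space: Set<[E]> {
        var result = Set<[E]>()
        for x in E.allCases where first.matches(x) {
            for y in E.allCases where second.matches(y) {
                result.insert([x, y])
            }
        }
        return result
    }
}

// For each case item containing `@unknown _`, report the part of its space
// that earlier case items did not already cover.
func unknownCheck(_ items: [CaseItem]) -> [Set<[E]>] {
    var covered = Set<[E]>()
    var reports: [Set<[E]>] = []
    for item in items {
        if item.containsUnknown {
            let missing = item.space.subtracting(covered)
            if !missing.isEmpty { reports.append(missing) }
        }
        covered.formUnion(item.space)
    }
    return reports
}
```

Feeding it the case items (.b, .a) followed by (@unknown _, .a) reports the single missing value (.a, .a). The subtraction runs once per case item that contains @unknown _, which is the K mentioned above.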
Here are some more complex examples:
// Module M with library evolution
public enum E { case a; case b }
// Module N
import M
func g(_ e: E) {
    switch (e, e) {
    case (.b, .a): ()
    // note: add missing case '(.a, .a)'
    case (@unknown _, .a): ()
    default: ()
    }
    switch (e, e) {
    case (.b, _): ()
    // note: add missing case '(.a, .a)'
    case (@unknown _, .a): ()
    default: ()
    }
    switch (e, e) {
    // note: add missing case '(.a, _)'
    // note: add missing case '(.b, _)'
    case (@unknown _, _): ()
    }
}
The semantics mimic what you could write by hand today: replacing @unknown _ with a binding pattern let x and using a nested switch x { case ... ; @unknown default }. Please feel free to ask questions if it's not clear why these examples work the way they do, or if you find the behavior to be unintuitive.
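As a rough illustration of that hand-written form, here is how a switch with case (.b, .a) followed by case (@unknown _, .a) could be approximated in today's Swift. The enum is declared locally as a stand-in for module M's E so the snippet is self-contained, and the function name describe and its return strings are made up. One difference from the proposed feature: the nested switch is checked on its own, so it has to list .b again even though the outer case (.b, .a) already handled it.

```swift
// Stand-in for module M's non-frozen enum. Inside a single module every case
// is known, so the `@unknown default:` branch below can never actually run.
enum E { case a, b }

func describe(_ first: E, _ second: E) -> String {
    switch (first, second) {
    case (.b, .a):
        return "b, a"
    // Hand-written stand-in for `case (@unknown _, .a):`.
    case (let x, .a):
        switch x {
        case .a:
            return "a, a"
        case .b:
            // Repeated here because the nested switch cannot see the outer cases.
            return "b, a"
        @unknown default:
            return "unknown, a"
        }
    default:
        return "other"
    }
}
```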
Other contexts
@unknown _ is not permitted in pattern contexts other than the case items of a switch statement, such as for-each loops, if-let statements, guard-let statements and on the LHS of an assignment statement.
Alternatives
If @unknown _ is too verbose or not sufficiently clear, we could use a different syntax.
Prototype
A prototype for this feature is available here; you can see some code "in action" in a test case. At the time of writing, the core functionality is implemented; the pattern decomposition works and fix-its are provided for missing permutations. However, there is some polish work that needs to be done (testing for and improving parse errors, adding more pattern-matching tests and enforcing restrictions around usage in non-switch contexts). I'm looking for feedback before proceeding.
(Also, in case you'd like to chip in with implementation/testing but don't have much compiler contribution experience, this may be a good opportunity to pick off small tasks. Feel free to send me a forum message if you'd like to help.)