When to use a raw identifier or not? (SE-0451)

Traditionally, in Swift, you had to spell things wrong.

enum UTF16Character { }

That has now changed.

enum `UTF-16 Character` { }

The option to spell incorrectly without backticks, or correctly with them, now adds cognitive load. Are there guidelines?

I think most people have adopted the guideline of "this is fine for the names of @Test methods and suites" as well as the existing usage of "it's situationally OK if a keyword is really the best name for this identifier".

I’m a fan of nesting, e.g., UTF.16.Character.

It’s also good for units, since I think variable names normally can’t begin with a number:

static let `20 MB` = 20 * 1024 * 1024

In my opinion, C++ has a nicer solution with user-defined literals.

Some thoughts as the proposal author:

Don't use a raw identifier just for the sake of it.

Two places where it really shines are highlighted in the proposal:

  • Names of test methods, where the goal is to be descriptive and using a separate description attribute would be redundant. These aren't API so you never have to worry about other callers.
  • Code generation based on names from some other source that might not be valid Swift identifiers.
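
As a rough illustration of the code-generation case, here is a minimal sketch assuming a toolchain that implements SE-0451. The `HTTPHeaderName` enum and its cases are hypothetical and stand in for whatever names a generator might read from an external source; the raw values are written out explicitly so the sketch does not depend on how implicit raw values are derived.

```swift
// Hypothetical generated code: each case keeps the exact spelling of the
// external name, which would not be a valid plain Swift identifier.
enum HTTPHeaderName: String, CaseIterable {
    case `Content-Type` = "Content-Type"
    case `Accept-Language` = "Accept-Language"
    case `X-Request-ID` = "X-Request-ID"
}

// Call sites must repeat the backticks around the raw identifier.
let header = HTTPHeaderName.`Content-Type`
print(header.rawValue)
```

The generator never has to invent a mangling scheme such as `contentType`, at the cost of backticks at every use site.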

Other than tests, I would not actually recommend raw identifiers for any of the examples cited so far in this thread. I wouldn't consider UTF16Character egregiously "spelled wrong" enough to warrant replacing it with any of the alternatives given.

Like many things in Swift, raw identifiers are one of the tools in your toolbox. If we consider it a screwdriver, use it to turn screws but don't take it out if your job is hammering nails.

The wisdom behind the rules here is that they treat the readability of code and the readability of prose as distinct, rather than taking prose as the dominant form, and the examples given are really special cases (perhaps the only ones).

It’s likely harder to read `UTF-16 Character` as a code token, particularly when scanning, because whitespace plays such a key role in the syntax-heavy context of code. So when would it be worth it?

People read semi-consciously, e.g., easily scanning prose paragraphs of misspelled words (so long as the first and last parts of the words are correct). In code, creating 3 tokens for one referent just introduces a contingent composition operation that increases the load for grokking what role the letters play in the syntax. Also, code names often use a logical prefix style that helps as a mnemonic, in type-completion, and in implicit lexical organization of an API.
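
To make the scanning cost concrete, here is a small sketch that uses both spellings side by side, assuming a toolchain that implements SE-0451. Both types and the `widen` functions are hypothetical and exist only to compare how each name reads at a use site.

```swift
// Hypothetical types, declared only to compare how each name reads in use.
struct `UTF-16 Character` { var codeUnit: UInt16 }
struct UTF16Character { var codeUnit: UInt16 }

// Raw identifier: the backticks and inner space sit among the spaces that
// already separate labels, types, and operators.
func widen(_ c: `UTF-16 Character`) -> UInt32 { UInt32(c.codeUnit) }
let rawStyle: [`UTF-16 Character`] = [`UTF-16 Character`(codeUnit: 0x41)]

// Conventional name: one unbroken token per reference.
func widen(_ c: UTF16Character) -> UInt32 { UInt32(c.codeUnit) }
let plainStyle: [UTF16Character] = [UTF16Character(codeUnit: 0x41)]
```

Both halves compile to the same thing; the difference is only in how quickly the eye finds the type name.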

For tests we want sentences that state the underlying assertion being addressed, because then we can judge the completeness of the overall suite by virtue of whether all invariants have been tested. And test methods themselves are only called implicitly by the driver, not by other code people write. So the goal is not just to distinguish or name methods but to explain, and having to do both is tedious.
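
A sketch of what such sentence-style test names could look like with Swift Testing, assuming a toolchain that implements SE-0451 and a test target where the `Testing` module is available. The `utf16Length(of:)` helper and the suite are hypothetical.

```swift
import Testing

// Hypothetical function under test.
func utf16Length(of text: String) -> Int {
    text.utf16.count
}

@Suite struct `UTF-16 length` {
    // Each name states the invariant being checked, so no separate
    // description string is needed.
    @Test func `ASCII text uses one code unit per character`() {
        #expect(utf16Length(of: "abc") == 3)
    }

    @Test func `a scalar outside the BMP uses a surrogate pair`() {
        #expect(utf16Length(of: "😀") == 2)
    }
}
```

Because only the test runner calls these methods, nobody ever has to type the backticked names at a call site.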

For constants that name numeric values, the name/value distinction we have to maintain is a tokenizer artifact, and for such values with units it’s hard to read the names backwards. Also, in many cases they are used alongside operators that set expectations. (For my taste, I’ll continue to use the prefix style, with some alphabetic term before the number, instead of backticks.)
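
For comparison, here is a minimal sketch of the two spellings of a unit constant, assuming a toolchain that implements SE-0451. The `ByteCount` namespace and the name `megabytes20` are hypothetical, chosen only to show the prefix style next to the backtick style.

```swift
// Hypothetical namespace for size constants.
enum ByteCount {
    // Raw-identifier spelling: reads like prose but needs backticks.
    static let `20 MB` = 20 * 1024 * 1024

    // Prefix-style spelling: an alphabetic term comes before the number.
    static let megabytes20 = 20 * 1024 * 1024
}

let rawLimit = ByteCount.`20 MB`
let prefixLimit = ByteCount.megabytes20
```

Both constants hold the same value; the choice only affects how the name reads and how it shows up in completion.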

Aside from generated code, I can’t think of other good uses.

Put succinctly, it was not an oversight that SE-0451 did not alter Swift API naming guidelines.
