SE-0275: Allow more characters (like whitespaces and punctuations) for escaped identifiers

What is your evaluation of the proposal?

I generally like this proposal; however, I think that backslash \ (U+005C) should be included in the list of invalid scalars so that it is usable later for escape sequences (whether or not Swift ever defines any in identifier context).
Alternatively, this proposal could directly define some escape sequences in identifier context, such as ones for \ and `.

My reasoning is that, since we are developing a syntax that allows almost any text as an identifier, we should make certain to leave room to allow for the rest.
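
To make the suggestion concrete, here is a minimal sketch of a validity check with backslash reserved. It assumes the excluded set is just the backtick, carriage return, and line feed; the function names and the `reservingBackslash` flag are hypothetical and are not taken from the proposal or its implementation.

```swift
// Hypothetical check for a single scalar inside an escaped identifier.
// Assumption: the baseline excluded set is backtick, line feed, and carriage return.
func isAllowedInEscapedIdentifier(_ scalar: Unicode.Scalar, reservingBackslash: Bool) -> Bool {
    switch scalar {
    case "`", "\n", "\r":
        return false
    case "\\":
        // Reserving backslash keeps it free for future escape sequences.
        return !reservingBackslash
    default:
        return true
    }
}

// Hypothetical check for a whole name: non-empty and every scalar allowed.
func isValidEscapedIdentifier(_ name: String, reservingBackslash: Bool) -> Bool {
    return !name.isEmpty && name.unicodeScalars.allSatisfy {
        isAllowedInEscapedIdentifier($0, reservingBackslash: reservingBackslash)
    }
}

print(isValidEscapedIdentifier("test account has funds", reservingBackslash: true))
print(isValidEscapedIdentifier("path\\to\\file", reservingBackslash: true))
```

With the flag set, a name containing a backslash is rejected today, so a later version could give `\` a meaning without breaking existing code.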

Does this proposal fit well with the feel and direction of Swift?

Yes.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I read the pitch discussion and the final proposal.

  • What is your evaluation of the proposal?
    +1, but I think some extra restrictions on characters allowed in identifiers are important. Someone mentioned whitespaces, but some others could be interesting as well, for instance the RTL marker. Is there an accepted set of non-flow-affecting-single-line Unicode characters? Also will escaping (backslash-`) be allowed?
  • Is the problem being addressed significant enough to warrant a change to Swift?
    Yes
  • Does this proposal fit well with the feel and direction of Swift?
    Yes
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

+1, it seems like a logical thing to do. I agree with @owenv that other line separators defined by Unicode should also be banned.

I also wonder if we should tweak how we emit diagnostics, because currently if I have:

func `foo`() {}
func foo() {}

then diagnostics do not include the backticks, so I get invalid redeclaration of 'foo()' (instead of '`foo`' but that's hard to read IMO). Should we apply special formatting rules to such functions for diagnostics purposes? It's not a big deal, but with the proposed ability to add whitespaces and other characters, I wonder if there are diagnostics where we could run into a problem with clarity.
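
As a small illustration of why the two declarations above collide, the sketch below shows that in current Swift the backticks are only an escaping syntax and are not part of the name. The function body and constant names are made up for the example.

```swift
// Declared with backticks, but the function's name is simply foo.
func `foo`() -> String {
    return "called foo"
}

// Both spellings resolve to the same declaration.
let viaPlainName = foo()
let viaEscapedName = `foo`()

print(viaPlainName == viaEscapedName)
```

Because the two spellings are interchangeable here, printing the name without backticks is unambiguous today; that would no longer hold for names containing spaces.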

  • Is the problem being addressed significant enough to warrant a change to Swift?

I think so. Personally, I haven't ever found the need for this feature in production code, however I think it's helpful in aiding the readability of function names in unit and automation tests and it's something I would personally like to use when writing tests.

In addition, being able to reference operator functions is really nice.

  • Does this proposal fit well with the feel and direction of Swift?

Yes. It seems like a natural continuation of an existing feature.

  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I have tried it once with Kotlin. I can't recall if there are any differences though.

  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I read the pitch, proposal and looked at the implementation code.


The proposal is a wonderfully elegant solution to a handful of rough edges, and a natural extension of what the grave accents have been doing since the beginning. I probably even tried to use them like that when I was first learning Swift.

Disallowing bi‐directional controls would be unfortunate. They are one of the reasons I want this. (Right now you often need to litter code with /*[LRM]*/ to keep it legible, even if the only right‐to‐left characters are in string literals. The more places such controls can be used, the better.)

You can make your code perfectly dizzying with ASCII alone—let x = Ill1lI1||l1II||l1. Coders don’t need to be told not to write such things, and I don’t think they need help with that under Unicode either. If we try to decide which characters would and wouldn’t be useful, it will take a lot of effort and it will be for very little benefit. We also risk getting it wrong anyway just for lack of foresight.

@owenv is right though that the newlines should be treated uniformly. The line separator should fall in the same category as ASCII’s line feed. (But I don’t have an opinion on whether they should be both disallowed or both allowed.)
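
One way to treat line breaks uniformly, sketched here as an assumption rather than as what the proposal or its implementation does, is to lean on the standard library's `Character.isNewline`, which covers the Unicode line separators as well as ASCII line feed and carriage return. The `breaksLine` helper and the candidate list are hypothetical.

```swift
// Candidate characters: the two the grammar names, other Unicode line breaks, and a plain space.
let candidates: [(name: String, character: Character)] = [
    ("LINE FEED (U+000A)", "\n"),
    ("CARRIAGE RETURN (U+000D)", "\r"),
    ("NEXT LINE (U+0085)", "\u{0085}"),
    ("LINE SEPARATOR (U+2028)", "\u{2028}"),
    ("PARAGRAPH SEPARATOR (U+2029)", "\u{2029}"),
    ("SPACE (U+0020)", " "),
]

// Hypothetical rule: every character the standard library classifies as a newline
// falls in the same category, whether that category ends up allowed or disallowed.
func breaksLine(_ character: Character) -> Bool {
    return character.isNewline
}

for candidate in candidates {
    print(candidate.name, breaksLine(candidate.character))
}
```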


What is your evaluation of the proposal?

I think the proposal should be clearer about operators. Take this for instance:

func `+`(lhs: Int, rhs: Int) -> Int

Is this a regular function with the name + or is it an operator function? Operators have more constraints. In particular, you need an operator definition to come first, the number of arguments is fixed, and you sometimes need to prepend prefix or postfix before the function declaration.

Another point for operator is that you need to worry about whether the argument names matter at the call site. Should you call the function like this?

Int.`+`(1, 2)

or like this?

Int.`+`(lhs: 1, rhs: 2)

Once this is answered, we need to think if this can be used to define an operator too:

extension Int {
    static var `+`: (Int, Int) -> Int
}

Probably not, but does that make this declaration illegal?

I think the proposal needs to address these questions.
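
For reference, the constraints mentioned above look like this in current Swift, without any escaped identifiers. The operators `+++` and `<+>` are made up for this sketch; it also shows the existing way to refer to an operator as a function value, which takes unlabeled arguments.

```swift
// An operator has to be declared before a function can implement it.
prefix operator +++

// The implementation needs the `prefix` modifier, and a prefix operator takes exactly one argument.
prefix func +++ (value: Int) -> Int {
    return value + 1
}

// An infix operator is declared with a precedence group and takes exactly two arguments.
infix operator <+>: AdditionPrecedence

func <+> (lhs: Int, rhs: Int) -> Int {
    return lhs + rhs
}

// Today an operator can be referenced as a function by wrapping it in parentheses.
// The resulting function value is called without argument labels.
let add: (Int, Int) -> Int = (+)

let sum = add(1, 2)
let incremented = +++41
let combined = 1 <+> 2

print(sum, incremented, combined)
```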

Is the problem being addressed significant enough to warrant a change to Swift?

I think being able to refer to operators as functions is a big plus. But you don't need to allow arbitrary characters for this; you just need to allow wrapping operator names in backticks. Considering this issue alone, the proposal is overblown.

Other use cases mentioned in the proposal seem contrived to me and not very useful. Do we really need a new feature so you can avoid writing camel case?

func testValidationShouldSucceedWhenInputIsLessThenTen()

The proposal mentions Quick, but looking at Quick I don't see how it'd be replaced by having a function name that includes whitespace and other things. The proposal perhaps only lacks some actual examples.

And is there any use in calling a variable "some var" other than to confuse everyone?

I'm also a bit reluctant to allow almost anything inside those strings. Surely a few well-selected space characters, operator characters and identifier characters would be enough. Is there a reason to allow control characters? nul characters? Byte-order marks? Yet-to-assign-a-meaning characters?

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

Nothing that compares.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Read the pitch thread, and gave some thought to operators.


I believe the operator vs identifier questions were resolved in the pitch thread starting roughly here and that it was decided to depend on the first character (i.e. +a is an operator and a+ is an identifier.) Since that just happened to be what the compiler does already anyway, it didn’t end up requiring any work in the implementation. [Edit: I did misunderstand. @adellibovi corrected me farther down.]

It probably should have been mentioned in the proposal though. And @adellibovi can correct me if I got anything wrong in my summary.

  • What is your evaluation of the proposal?
    I'm unsure, overall. I definitely want the ability to spell operators (Swift.Int.`+`), but I can't help but wonder if the complexity that this adds is worth it. It seems like it is, but it also seems like a large and fundamental change in terms of identifiers.
  • Is the problem being addressed significant enough to warrant a change to Swift?
    Yes.
  • Does this proposal fit well with the feel and direction of Swift?
    Again, not completely sure but I think so.
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?
    I followed the pitch and read the proposal.

I completely buy the use case presented. I think it is definitely the case that a test function with a descriptive name that uses spaces can be more comprehensible and useful to the reader than a corresponding camelCasedTestNameThatIsVeryDescriptive.

However, I do not think this very persuasive use case motivates the extent of the solution proposed. To be specific, I agree with other replies here that there is something of grave concern in expanding the permitted Unicode scalars within backticks to all but a small handful. Other commenters have raised examples of characters that could in fact make any test function harder to read rather than easier to read: this works against the stated motivation for the proposal rather than for it.

Having briefly examined how Kotlin handles it, it seems that backticks in that language permit the use of spaces and a very limited number of other characters that are not otherwise allowed, but nothing as extensive as what we have proposed here.

Therefore, I would suggest that the proposed use cases would be adequately or even better served by a more circumscribed solution:

  • Allow backticks to surround identifiers anywhere that identifiers are used
  • Allow any valid operator character, non-operator identifier character, and space (the one and only ordinary one) to be used between backticks
  • It's an operator if and only if all characters inside the backticks are valid operator characters
  • Otherwise, it's a non-operator identifier

(Yes, there will be cases where a non-operator identifier could be made to look like an operator inside backticks; but lookalike collisions are already possible (and easily accomplished if one is motivated to do so) as Swift does not normalize, and this is beyond the scope of this proposal and should be addressed holistically.)
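
The classification rule in the list above can be sketched as follows. The character set is a deliberately simplified stand-in (ASCII operator characters only, ignoring Swift's Unicode operator ranges and reserved tokens such as a lone `=`), and the type and function names are hypothetical.

```swift
// Simplified assumption: only these ASCII characters count as operator characters.
let asciiOperatorCharacters = Set("/=-+!*%<>&|^~?")

enum EscapedNameKind {
    case operatorName
    case identifier
}

// It is an operator if and only if every character is an operator character;
// anything else is a non-operator identifier.
func classify(_ name: String) -> EscapedNameKind {
    let isOperator = !name.isEmpty && name.allSatisfy { asciiOperatorCharacters.contains($0) }
    return isOperator ? .operatorName : .identifier
}

for name in ["+", "<=>", "+a", "a+", "some name"] {
    print(name, classify(name))
}
```

Under this rule a mixed name such as `+a` is an identifier, because one of its characters is not an operator character.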


Just for the record now that there are other ideas flying around:

  • If the proposal were constrained to enable no new characters, but only to allow the grave accents to escape operators as well as identifiers, I would be disappointed, but I would still support it.
  • If the proposal were constrained so that the accents only allowed a few privileged new characters (such as only space), I would shift to be in opposition. This would obliterate many of the use cases brought up in the pitch phase (many of which received no mention in the proposal itself). Examples include direct use of paths and URLs, easier code generation from user strings, improved legibility options for foreign scripts (especially caseless ones), etc. On its own, “I’d rather name tests with spaces instead of camel case.” just doesn’t pull its weight in my opinion.

Doesn't Swift use name-mangling when finalizing the results of compiling functions into object code (to support overloads)? And doesn't the normal name of the function affect what the mangled name will be? If yes to both, then doesn't this proposal actually affect ABI to a degree? We would have to define how exotic identifiers are mapped onto the mangled namespace.


The mangling for these added symbols is in fact already defined. The same punycode-based encoding that gets used for Unicode characters in names also works for ASCII non-alphanumeric characters.

  • What is your evaluation of the proposal?

-1

  • Is the problem being addressed significant enough to warrant a change to Swift?

No. The ability to reference an operator such as + as a function seems worth doing, but not the arbitrary characters in identifiers.

  • Does this proposal fit well with the feel and direction of Swift?

As the author said, "Swift has a beautiful concise yet expressive syntax."

Beautiful: the back ticks are ugly and make visually parsing the language unpleasant. I would not want to work in a file that was littered with this mess.

Concise: if your identifiers are concise, this feature isn't needed (the long identifiers with spaces being one of the motivating statements).

  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

People mention testing without using something like Quick as a reason to support this but I don't see that this proposal would improve anything in that regard. Prose in identifiers is not an improvement. Add a comment if a concise but descriptive identifier name isn't sufficient.

  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Read the proposal, read all the comments, imagined what a file full of identifiers like this would be like to look at and edit and concluded that there really doesn't seem to be any need for this and the results will be less enjoyable to interact with.


Thanks Owen for the feedback and review!

Good that you raised this and I want to clarify it.
Newlines are indeed scoped out of the proposal. Even though the grammar only mentions \r and \n, the rest are also not allowed; some are, in fact, invalid characters, and Swift will warn about and remove those.

A toolchain is available if you want to play around with it:
https://ci.swift.org/job/swift-PR-toolchain-osx/467//artifact/branch-master/swift-PR-28966-467-osx.tar.gz

Hi @michelf, that is indeed an important topic, so let me explain more.

Expanding the grammar for escaped identifiers will still respect any current semantic constraints.
As an example, $ dollar identifiers are still compiler-reserved names, so `$identifierNames` will produce an error. This means that even if the parser allows you to potentially declare something, it does not necessarily mean that semantically you can.

The same goes for operators: if an escaped identifier can be "processed", in its context, as an operator, then it will be an operator, and therefore any logic associated with this set of identifiers will apply.

As for your func `+`(lhs:rhs:) example, it is an operator function. For example, if you move it inside the declaration of a type, the compiler will report "Operator '+' declared in type 'Int' must be 'static'", just as it would for the non-escaped version.

Thanks, Jeremy, for giving some insight into the pitch discussion.
I want to slightly correct this summary and make it clear. We do not explicitly check the first character; instead, the compiler checks whether the whole identifier can be an operator (obviously, if the first character is not an operator character, we already know that the whole identifier is not an operator). This choice didn't increase complexity; instead, it made things simpler, as those checks were already in place. For the record, `+a`() would be considered a function.

Yes, referencing operator was considered a nice side effect and not really the main goal.

I was playing around with BDD and you could do (take it as a proof of concept) something like this:

    func `test account has sufficient funds`() {
        given(`the account balance is`(100.dollars)) {
            and(`the card is valid`)
        }
        .when(`the account holder requests`(20.dollars))
        .then(`the account balance should be`(80.dollars))
    }

The different methods (`the account balance is`, `the card is valid`, etc.) can be used for setting up the test or asserting a particular condition. I personally find it exciting to be able to explore opportunities like this one. I also understand, and respect, that some of us may still prefer the camelCase option (or not).

That said, Michel, I hope this helps with your review. In particular, I hope it clarified your doubts about operators. If not, please let me know which other questions you may have; I would be happy to reply to those.

Hi TJ,

Thanks for raising the topic about complexity of implementation.
I want to share that the whole proposal fits in a Pull Request with 118 additions and 61 deletions. This may give you a better idea.

Generally negative, except for the side effect of being able to reference operator functions as static members, which I think is useful.

Maybe. The proposal mentions a few “problems”. I think the issue of naming test cases does warrant some kind of change to Swift, although I’m not convinced this is the best way to handle that issue. Being able to arbitrarily name functions in normal code, I think is a non-goal, and not worth “fixing”. It would hurt readability of normal non-test code.

If accepted, the backtick syntax fits well with the language, and there is precedent in using it to escape identifiers.

Kotlin has a similar feature, but a far more restrictive set of valid characters. I far prefer Kotlin’s stance over this proposal.

I followed the pitch thread, read the proposal, but did not review the actual proposed implementation PR.


Thanks for your thoughts, Svein.

I want to say something about this topic, as it also came up in the pitch thread.

Poor choices of identifier names are an already existing issue. Developers are fundamentally free to name their identifiers the way they want; for example, the compiler will not tell us if an identifier does not follow the Swift API guidelines. Therefore, if we want to have something unreadable, we already can.

The proposal wants to give developers and teams an opportunity to weigh different options and take what feels like a meaningful and readable choice for them, whatever that choice is.

These clarifications are very sensible and greatly change my evaluation of what’s being proposed. However, the draft implementation does not suffice on its own as documentation of the proposed behavior and all of this information should be in the proposal itself.

It is critical for the evaluation of this proposal that it is laid out clearly in prose form what, in the end, will be an allowed operator or non-operator identifier.


Yes, I understand that. And I feel that it is a bad idea, hence my feedback. I am well aware that people are already free to write ugly code, but this will make it easier to do so, at very little benefit as far as I'm concerned. I've yet to see convincing (to me) examples of code that would benefit from this proposal.

I understand that you and others feel differently about this issue.


I like what this pitch wants to achieve, but -1 on the current implementation. Perhaps some examples would make me more positive, but my feeling is that functional code that liberally used this feature would quickly become a mess of backticks and become quite ugly and confusing, and where it was used sparingly, escaped calls and definitions would look jarring and inconsistent with everything else.

Test code is one place where I can definitely understand the benefit - but even there, I can only see people really using it for the names of their test methods, and those are rarely referenced from other code, which sort of defeats the point.

There's also the benefit of referencing operators, but there might be better ways to enable that.
