SE-0275: Allow more characters (like whitespaces and punctuations) for escaped identifiers

The review of SE-0275 — Allow more characters for escaped identifiers begins now and runs through January 20, 2020.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to the review manager (via email or direct message in the Swift forums).

What goes into a review of a proposal?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift.

When reviewing a proposal, here are some questions to consider:

  • What is your evaluation of the proposal?
  • Is the problem being addressed significant enough to warrant a change to Swift?
  • Does this proposal fit well with the feel and direction of Swift?
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Thanks,
Joe Groff
Review Manager

13 Likes

This feature looks great, especially for referring to operators as Int.+. Thanks!

1 Like

+1, with a few reservations.

First, in addition to disallowing \r and \n in escaped identifiers, I think characters from this list like U+000B (vertical tab) and U+2028 (line separator) should also be disallowed. I don't think anybody is likely to attempt to use them in practice, but it would be nice to ban all multi-line identifiers consistently.
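As a rough illustration of what banning multi-line identifiers consistently could mean, here is a minimal sketch. The helper name `isSingleLineIdentifierBody` and the exact scalar set are assumptions made for this example (the set follows the characters commonly treated as line breaks in Unicode); neither is taken from the proposal or the compiler.

```swift
// Hypothetical set of line-breaking scalars. The exact list is an assumption
// for illustration, not something specified by SE-0275.
let lineBreakScalars: Set<Unicode.Scalar> = [
    "\u{000A}", // line feed (\n)
    "\u{000B}", // vertical tab
    "\u{000C}", // form feed
    "\u{000D}", // carriage return (\r)
    "\u{0085}", // next line
    "\u{2028}", // line separator
    "\u{2029}", // paragraph separator
]

// Returns true when the text between the backticks contains none of the
// line-breaking scalars above, i.e. the identifier stays on one line.
func isSingleLineIdentifierBody(_ body: String) -> Bool {
    return !body.unicodeScalars.contains(where: { lineBreakScalars.contains($0) })
}
```

Checking scalars rather than characters keeps the rule independent of grapheme clustering, so a \r\n pair is rejected just like a lone \n.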

Second, as @Joe_Groff noted in the pitch phase I think it's important to establish how external tooling/runtime API/etc. is expected to parse identifiers if this proposal is accepted, to avoid inconsistent behavior. I think these should probably require the backticks whenever the compiler would, but it would be nice to have clear guidance ready for tool authors if the proposal is accepted.

I think so. I find the ability to more easily reference operators pretty compelling, especially since it supports the use cases @rxwei described in Operator member syntax - #9 by rxwei. Beyond that, the pitch thread contained enough reasonable, if relatively small, use cases to convince me it has widespread utility.

I see this as a natural extension of the ability to escape keywords as identifiers. In both cases, the backticks are a way of writing a keyword which would not otherwise parse in the given context.

I can't recall ever using a similar feature in another language.

I followed some of the pitch discussion and read the final proposal.

6 Likes

What is your evaluation of the proposal?

+1. I feel like this is a logical step forward for escaped identifiers.

Is the problem being addressed significant enough to warrant a change to Swift?

Yes, as I stated on the discussion thread, there are certain classes of APIs which will benefit from being able to use more expressive identifiers:

Does this proposal fit well with the feel and direction of Swift?

Given that escaped identifiers already exist, this already feels like natural Swift to me. I believe it fits well.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I can't think of any examples at the moment.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I followed along and contributed to the original discussion.

1 Like
  • What is your evaluation of the proposal?

+1. I'm a huge fan of this general direction. It will make test method names much easier to read. This is a basic need and should not require a library dependency (such as Quick) to fulfill.

That said, I haven't given much thought to the detailed design issues @owenv mentioned in his review. I trust the core team to make the right decision regarding detailed design.

  • Is the problem being addressed significant enough to warrant a change to Swift?

Yes, very much so.

  • Does this proposal fit well with the feel and direction of Swift?

Yes

  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I haven't

  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

A quick read, although I asked for this feature myself on Twitter a while ago. I'm happy to see it happening.

1 Like

What is your evaluation of the proposal?

I generally like this proposal; however, I think that backslash \ (U+005C) should be included in the list of invalid scalars so that it is usable later for escape sequences (whether or not Swift ever defines any in identifier context).
Alternatively, this proposal could directly define some escape sequences in identifier context, such as ones for \ and `.

My reasoning is that since we are developing a syntax which allows almost any text as an identifier, we should make certain to leave room to allow for the rest.
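To make the idea concrete, here is a minimal sketch of how such escape sequences might be decoded, assuming that \\ and \` are the only two escapes and that any other use of a backslash is rejected. The function name `decodeEscapedIdentifierBody` and these rules are hypothetical; the proposal does not define any escape sequences.

```swift
// Hypothetical decoder for the text between the backticks of an escaped
// identifier. Assumed rules (not part of SE-0275):
//   \\  decodes to a single backslash
//   \`  decodes to a single backtick
// Any other backslash sequence, or a bare backtick, is rejected with nil.
func decodeEscapedIdentifierBody(_ body: String) -> String? {
    var decoded = String.UnicodeScalarView()
    var iterator = body.unicodeScalars.makeIterator()
    while let scalar = iterator.next() {
        if scalar == "\\" {
            // A backslash must be followed by one of the two escapable scalars.
            guard let next = iterator.next(), next == "\\" || next == "`" else {
                return nil
            }
            decoded.append(next)
        } else if scalar == "`" {
            // An unescaped backtick would have ended the identifier.
            return nil
        } else {
            decoded.append(scalar)
        }
    }
    return String(decoded)
}
```

Reserving the backslash now, even without defining escapes, keeps every input that this sketch rejects available for future meaning.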

Does this proposal fit well with the feel and direction of Swift?

Yes.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I read the pitch discussion and the final proposal.

6 Likes
  • What is your evaluation of the proposal?
    +1, but I think some extra restrictions on characters allowed in identifiers are important. Someone mentioned whitespaces, but some others could be interesting as well, for instance the RTL marker. Is there an accepted set of non-flow-affecting-single-line Unicode characters? Also will escaping (backslash-`) be allowed?
  • Is the problem being addressed significant enough to warrant a change to Swift?
    Yes
  • Does this proposal fit well with the feel and direction of Swift?
    Yes
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?
4 Likes

+1, it seems like a logical thing to do. I agree with @owenv that other line separators defined by Unicode should also be banned.

I also wonder if we should tweak how we emit diagnostics, because currently if I have:

    func `foo`() {}
    func foo() {}

then diagnostics do not include the backticks, so I get invalid redeclaration of 'foo()' (instead of '`foo`' but that's hard to read IMO). Should we apply special formatting rules to such functions for diagnostics purposes? It's not a big deal, but with the proposed ability to add whitespaces and other characters, I wonder if there are diagnostics where we could run into a problem with clarity.
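One possible formatting rule, sketched here only to illustrate the question, is to print backticks in a diagnostic only when the name could not be written without them. The helpers `isPlainIdentifier` and `displayName(forDiagnostic:)` are hypothetical, and the identifier check is deliberately simplified to ASCII and ignores keywords; the compiler's real rules are broader.

```swift
// Hypothetical, simplified check: true when `name` could be spelled without
// backticks. Only ASCII letters, digits, and underscore are considered here;
// real Swift identifiers allow many more characters, and keywords are ignored.
func isPlainIdentifier(_ name: String) -> Bool {
    func isHead(_ scalar: Unicode.Scalar) -> Bool {
        switch scalar {
        case "a"..."z", "A"..."Z", "_": return true
        default: return false
        }
    }
    func isDigit(_ scalar: Unicode.Scalar) -> Bool {
        switch scalar {
        case "0"..."9": return true
        default: return false
        }
    }
    guard let first = name.unicodeScalars.first, isHead(first) else {
        return false
    }
    return name.unicodeScalars.dropFirst().allSatisfy { isHead($0) || isDigit($0) }
}

// Hypothetical rule for diagnostics: keep plain names as they are today and
// show backticks only for names that need them.
func displayName(forDiagnostic name: String) -> String {
    return isPlainIdentifier(name) ? name : "`\(name)`"
}
```

Under this rule the redeclaration message above would still mention 'foo()', while a name such as `some var` would keep its backticks so that the embedded space is not misread.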

  • Is the problem being addressed significant enough to warrant a change to Swift?

I think so. Personally, I haven't ever found the need for this feature in production code; however, I think it's helpful in aiding the readability of function names in unit and automation tests, and it's something I would personally like to use when writing tests.

In addition, being able to reference operator functions is really nice.

  • Does this proposal fit well with the feel and direction of Swift?

Yes. It seems like a natural continuation of an existing feature.

  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I have tried it once with Kotlin. I can't recall if there are any differences though.

  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I read the pitch, proposal and looked at the implementation code.

3 Likes

The proposal is a wonderfully elegant solution to a handful of rough edges, and a natural extension of what the grave accents have been doing since the beginning. I probably even tried to use them like that when I was first learning Swift.

Disallowing bi‐directional controls would be unfortunate. They are one of the reasons I want this. (Right now you often need to litter code with /*[LRM]*/ to keep it legible, even if the only right‐to‐left characters are in string literals. The more places such controls can be used, the better.)

You can make your code perfectly dizzying with ASCII alone—let x = Ill1lI1||l1II||l1. Coders don’t need to be told not to write such things, and I don’t think they need help with that under Unicode either. If we try to decide which characters would and wouldn’t be useful, it will take a lot of effort and it will be for very little benefit. We also risk getting it wrong anyway just for lack of foresight.

@owenv is right though that the newlines should be treated uniformly. The line separator should fall in the same category as ASCII’s line feed. (But I don’t have an opinion on whether they should be both disallowed or both allowed.)

2 Likes

What is your evaluation of the proposal?

I think the proposal should be clearer about operators. Take this for instance:

    func `+`(lhs: Int, rhs: Int) -> Int

Is this a regular function with the name + or is it an operator function? Operators have more constraints. In particular, you need an operator definition to come first, the number of arguments is fixed, and you sometimes need to prepend prefix or postfix before the function declaration.
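For reference, these constraints look like this in Swift as it stands today, without the proposed feature. The operators `+++` and `<+>` are made up for this sketch.

```swift
// A new operator has to be declared before any function can implement it.
prefix operator +++
infix operator <+>: AdditionPrecedence

// A prefix operator function needs the `prefix` modifier and takes exactly
// one argument.
prefix func +++ (value: Int) -> Int {
    return value + 1
}

// An infix operator function takes exactly two arguments.
func <+> (lhs: Int, rhs: Int) -> Int {
    return lhs + rhs
}

// Today an operator is referenced as a function value by wrapping it in
// parentheses; the type annotation selects the Int overload of +.
let add: (Int, Int) -> Int = (+)

let combined = 1 <+> 2
let incremented = +++1
```

None of these rules apply to an ordinary function, which is why it matters whether an escaped name is treated as an operator or not.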

Another point for operator is that you need to worry about whether the argument names matter at the call site. Should you call the function like this?

    Int.`+`(1, 2)

or like this?

    Int.`+`(lhs: 1, rhs: 2)

Once this is answered, we need to think if this can be used to define an operator too:

    extension Int {
        static var `+`: (Int, Int) -> Int
    }

Probably not, but does that make this declaration illegal?

I think the proposal needs to address these questions.

Is the problem being addressed significant enough to warrant a change to Swift?

I think being able to refer to operators as functions is a big plus. But you don't need to allow arbitrary characters for this; you just need to allow wrapping operator names in backticks. Considering this issue alone, the proposal is overblown.

Other use cases mentioned in the proposal seem contrived to me and not very useful. Do we really need a new feature so you can avoid writing camel case?

    func testValidationShouldSucceedWhenInputIsLessThenTen()

The proposal mentions Quick, but looking at Quick I don't see how it'd be replaced by having a function name that includes whitespace and other things. The proposal perhaps only lacks some actual examples.

And is there any use in calling a variable "some var" other than to confuse everyone?

I'm also a bit reluctant to allow almost anything inside those strings. Surely a few well-selected space characters, operator characters and identifier characters would be enough. Is there a reason to allow control characters? nul characters? Byte-order marks? Yet-to-assign-a-meaning characters?

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

Nothing that compares.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Read the pitch thread, and gave some thought to operators.

10 Likes

I believe the operator vs identifier questions were resolved in the pitch thread starting roughly here and that it was decided to depend on the first character (i.e. +a is an operator and a+ is an identifier.) Since that just happened to be what the compiler does already anyway, it didn’t end up requiring any work in the implementation. [Edit: I did misunderstand. @adellibovi corrected me farther down.]

It probably should have been mentioned in the proposal though. And @adellibovi can correct me if I got anything wrong in my summary.

1 Like
  • What is your evaluation of the proposal?
    I'm unsure, overall. I definitely want the ability to spell operators (Swift.Int.`+`), but I can't help but wonder if the complexity that this adds is worth it. It seems like it is, but it also seems like a large and fundamental change in terms of identifiers.
  • Is the problem being addressed significant enough to warrant a change to Swift?
    Yes.
  • Does this proposal fit well with the feel and direction of Swift?
    Again, not completely sure but I think so.
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?
    I followed the pitch and read the proposal.
4 Likes

I completely buy the use case presented. I think it is definitely the case that a test function with a descriptive name that uses spaces can be more comprehensible and useful to the reader than a corresponding camelCasedTestNameThatIsVeryDescriptive.

However, I do not think this very persuasive use case motivates the extent of the solution proposed. To be specific, I agree with other replies here that there is something of grave concern in expanding the permitted Unicode scalars within backticks to all but a small handful. Other commenters have raised examples of characters that could in fact make any test function harder to read rather than easier to read: this works against the stated motivation for the proposal rather than for it.

Having briefly examined how Kotlin handles it, it seems that backticks in that language permit the use of spaces and a very limited number of other characters that are not otherwise allowed, but nothing as extensive as what we have proposed here.

Therefore, I would suggest that the proposed use cases would be adequately or even better served by a more circumscribed solution:

  • Allow backticks to surround identifiers anywhere that identifiers are used
  • Allow any valid operator character, non-operator identifier character, and space (the one and only ordinary one) to be used between backticks
  • It's an operator if and only if all characters inside the backticks are valid operator characters
  • Otherwise, it's a non-operator identifier
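The rules in this list could be sketched roughly as follows. The names `EscapedNameKind` and `classify` are hypothetical, and both character classes are reduced to ASCII approximations for brevity; Swift's actual operator and identifier character sets are much larger.

```swift
// Hypothetical result type for the rule sketched in the list above.
enum EscapedNameKind {
    case operatorName
    case identifier
}

// ASCII-only approximation of Swift's operator characters (an assumption
// made for brevity; the real set includes many Unicode symbols).
func isOperatorScalar(_ scalar: Unicode.Scalar) -> Bool {
    return "/=-+!*%<>&|^~?".unicodeScalars.contains(scalar)
}

// ASCII-only approximation of non-operator identifier characters.
func isIdentifierScalar(_ scalar: Unicode.Scalar) -> Bool {
    switch scalar {
    case "a"..."z", "A"..."Z", "0"..."9", "_": return true
    default: return false
    }
}

// Classifies the text between backticks:
//   - operator if and only if every scalar is an operator character
//   - identifier if every scalar is an operator character, an identifier
//     character, or the ordinary space U+0020
//   - nil (rejected) if it is empty or contains anything else
func classify(_ body: String) -> EscapedNameKind? {
    let scalars = body.unicodeScalars
    if scalars.isEmpty { return nil }
    if scalars.allSatisfy(isOperatorScalar) { return .operatorName }
    let allowed = scalars.allSatisfy {
        isOperatorScalar($0) || isIdentifierScalar($0) || $0 == " "
    }
    return allowed ? .identifier : nil
}
```

With this sketch, + is classified as an operator, +a and a+ are both non-operator identifiers, and a name containing a tab or any other character outside the three allowed classes is rejected.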

(Yes, there will be cases where a non-operator identifier could be made to look like an operator inside backticks; but lookalike collisions are already possible (and easily accomplished if one is motivated to do so) as Swift does not normalize, and this is beyond the scope of this proposal and should be addressed holistically.)

7 Likes

Just for the record now that there are other ideas flying around:

  • If the proposal were constrained to enable no new characters, but only to allow the grave accents to escape operators as well as identifiers, I would be disappointed, but I would still support it.
  • If the proposal were constrained so that the accents only allowed a few privileged new characters (such as only space), I would shift to be in opposition. This would obliterate many of the use cases brought up in the pitch phase (many of which received no mention in the proposal itself). Examples include direct use of paths and URLs, easier code generation from user strings, improved legibility options for foreign scripts (especially caseless ones), etc. On its own, “I’d rather name tests with spaces instead of camel case.” just doesn’t pull its weight in my opinion.

Doesn't Swift use name-mangling when finalizing the results of compiling functions into object code (to support overloads)? And doesn't the normal name of the function affect what the mangled name will be? If yes to both, then doesn't this proposal actually affect ABI to a degree? We would have to define how exotic identifiers are mapped onto the mangled namespace.

1 Like

The mangling for these added symbols is in fact already defined. The same punycode-based encoding that gets used for Unicode characters in names also works for ASCII non-alphanumeric characters.

6 Likes
  • What is your evaluation of the proposal?

-1

  • Is the problem being addressed significant enough to warrant a change to Swift?

No. The ability to escape an operator such as + seems worth doing, but not the arbitrary characters in identifiers.

  • Does this proposal fit well with the feel and direction of Swift?

As the author said, "Swift has a beautiful concise yet expressive syntax."

Beautiful: the back ticks are ugly and make visually parsing the language unpleasant. I would not want to work in a file that was littered with this mess.

Concise: if your identifiers are concise, this feature isn't needed (the long identifiers with spaces being one of the motivating statements).

  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

People mention testing without using something like Quick as a reason to support this but I don't see that this proposal would improve anything in that regard. Prose in identifiers is not an improvement. Add a comment if a concise but descriptive identifier name isn't sufficient.

  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Read the proposal, read all the comments, imagined what a file full of identifiers like this would be like to look at and edit and concluded that there really doesn't seem to be any need for this and the results will be less enjoyable to interact with.

10 Likes

Thanks Owen for the feedback and review!

Good that you raised this and I want to clarify it.
Newlines are indeed scoped out of the proposal. Even though the grammar only mentions \r and \n, the rest are also not allowed; some are, in fact, invalid characters, and Swift will warn about and remove those.

A toolchain is available if you want to play around with it :slight_smile:
https://ci.swift.org/job/swift-PR-toolchain-osx/467//artifact/branch-master/swift-PR-28966-467-osx.tar.gz

Hi @michelf, that is indeed an important topic, so let me explain more.

Expanding the grammar for escaped identifiers will still respect any current semantic constraints.
As an example, $ dollar identifiers are still compiler-reserved names, so `$identifierNames` will produce an error. This means that even if the parser allows you to potentially declare something, it does not necessarily mean that semantically you can.

The same goes for operators: if an escaped identifier can be "processed", in its context, as an operator, then it will be an operator, and therefore any logic associated with this set of identifiers will apply.

As for this, it is an operator function. For example, if you move it inside the declaration of a struct, the compiler will report "Operator '+' declared in type 'Int' must be 'static'", just as it would for the non-escaped version.

Thanks Jeremy for giving an insight into the pitch discussion.
I want to slightly correct this sentence and make it clear. We do not explicitly check for the first character; instead, the compiler checks whether the whole identifier can be an operator (obviously, if the first character is not an operator character, we already know that the whole identifier is not an operator). This choice didn't increase complexity; it actually made things simpler, as those checks were already in place. For the record, `+a`() would be considered a function.

Yes, referencing operators was considered a nice side effect and not really the main goal.

I was playing around with BDD and you could do (take it as a proof of concept) something like this:

    func `test account has sufficient funds`() {
        given(`the account balance is`(100.dollars)) {
            and(`the card is valid`)
        }
        .when(`the account holder requests`(20.dollars))
        .then(`the account balance should be`(80.dollars))
    }

The different methods (`the account balance is`, `the card is valid`, etc.) can be used for setting up the test or asserting a particular condition. I personally find it exciting to be able to explore opportunities like this one. I do also understand, and respect, that some of us may still prefer the camelCase option (or not).

That said, Michel, I hope this helps with your review. In particular, I hope it clarified your doubts about operators. If not, please let me know which other questions you may have; I would be happy to reply to those :slight_smile:

Hi TJ,

Thanks for raising the topic of implementation complexity.
I want to share that the whole proposal fits in a pull request with 118 additions and 61 deletions. This may give you a better idea.