SE-0200: Enhancing String Literals Delimiters to Support Raw Text

I had the same concern re #, but on the other hand, isn't the interpolated string literal something like a macro, some kind of compile-time transform? E.g., "name: \(name)" -> "name: " + name
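As a rough illustration of that analogy (a sketch only; the compiler actually lowers interpolation through its string-interpolation machinery rather than a literal `+`), the two forms produce equal values when the operand is a String. The variable names here are made up for the example:

```swift
// Illustrative values only; `name` is a placeholder.
let name = "Swift"

// Interpolated form, as written in the post.
let interpolated = "name: \(name)"

// The hand-written concatenation the post compares it to.
let concatenated = "name: " + name

// For a String operand the two results are equal.
print(interpolated == concatenated)
```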

1 Like

To be fair, the current behaviour of backticks is also to escape an otherwise-reserved identifier. So it wouldn’t be entirely out of place.

3 Likes

I’d be disappointed to see a switch to ` as the delimiter at the last minute. In practice, using a toolchain:

You very quickly get used to it, and I don’t believe using # would preclude its use for macros, in the same way that it already lives alongside existing constructs such as #selector. Using a quote character will also confuse external editors. There may be other characters to consider, but few have the desirable heft of #.

2 Likes

And there's nothing that says we can't use # for custom macros too, because those will have identifiers. Or parentheses. What we're reserving here is #".

4 Likes

Yes, this. I'm assuming the general idea is that macros could subsume our existing #identifier and #identifier() constructs; if that's the plan, this wouldn't interfere with that from a technical perspective, it would just use the same symbol for something else. Since there's already other stuff doing that like #if, I don't think that's a big deal.

2 Likes

I agree: there is no technical reason I can see why # is undesirable here, and its visual weight is an asset in this case, IMO.

4 Likes

Unless non-ASCII chars are on the table (and I assume they're not), I can't think of another sigil that’s any better. Long live the number sign / hash / pound / octothorpe!

3 Likes
  • What is your evaluation of the proposal?
    Very much in favor

  • Is the problem being addressed significant enough to warrant a change to Swift?
    Yes, the new syntax will ease the use of inlined strings in Swift and make the inlined text easier to read. Inlined text is common enough that this will benefit a lot of Swift users. For instance, I often see inlined JSON used in test cases. Not having to escape this JSON will be awesome, and the result will be much easier to read.

  • Does this proposal fit well with the feel and direction of Swift?
    Sure

  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
    I have not

  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?
    Not an in-depth study, but I read various steps of the proposal and followed the discussion.
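To make the JSON-in-tests case from this review concrete, here is a minimal sketch using the syntax as proposed (it needs a compiler that implements SE-0200). The JSON payload is invented for illustration:

```swift
// Today: every quote and backslash inside the JSON has to be escaped.
let escaped = "{ \"id\": 42, \"path\": \"C:\\\\temp\" }"

// With the proposed # delimiters the JSON can be pasted in unchanged.
let raw = #"{ "id": 42, "path": "C:\\temp" }"#

// Both literals denote the same text.
print(escaped == raw)
```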

1 Like

What is your evaluation of the proposal?

+1

It is reassuring that # delimiters have already been tried and tested in Rust.

There aren't many alternatives available in ASCII:

  • dollar signs (e.g. $$" echo "$PATH" "$$ ) might be embedded too frequently;

  • underscores (e.g. _"{ "id": "\_(idNumber)" }"_ ) might be too lightweight.

Is the problem being addressed significant enough to warrant a change to Swift?

Yes.

Does this proposal fit well with the feel and direction of Swift?

Yes.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

A quick reading of the proposal.

This is a solid proposal — it works, it’s visually pleasing, and it’s easy to remember. I appreciate the research that went into it and the comprehensive writeup. In particular, I like that it still allows for string interpolation.
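For reference, a small sketch of how interpolation survives in the proposed syntax: the escape marker has to carry the same number of # characters as the delimiter, so `\#(...)` interpolates while a plain `\(...)` stays literal text. The names and JSON below are illustrative only:

```swift
// `idNumber` is a made-up value for the example.
let idNumber = 7

// \#( ... ) interpolates; \( ... ) is left in the string as written.
let json = #"{ "id": "\#(idNumber)", "note": "\(not interpolated)" }"#

print(json)
```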

One aspect gives me concern: multiline strings. The #""" symbol doesn’t seem to fit in with the rest of the proposal. What about using the same delimiter even in the multiline case? E.g., if #" (or ##" or ###", etc.) is the last symbol on the line, then the literal would be assumed to be multiline and the compiler would look for the closing delimiter on a subsequent line. This would make the proposal easier to remember.

•	What is your evaluation of the proposal?

+1

•	Is the problem being addressed significant enough to warrant a change to Swift?

Yes

•	Does this proposal fit well with the feel and direction of Swift?

Yes

•	If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

Better

•	How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I read the pitch discussion and the final proposal.

A small change could be made to the implementation to not terminate the string if the closing delimiter (in this case "#) is followed immediately by another hash. Would this make sense?

It could be done that way, but it would be more difficult to reason about. It would also introduce a security issue because a non-printing character could then invisibly change where a string ends.

That's an interesting remark about security. You know this is already a problem with today's strings, right? I can even fool the syntax highlighter here:

print("""
Validating password...
"​"")
guard user.validatePassword(password) else {
	fatalError("get out!")
}
print("​""
Password is valid!
""")

I'm not sure what the solution is.

I believe non-printing and zero-width characters are well-defined Unicode categories; maybe we could disallow them from appearing immediately before or after string delimiters. While more complicated, I feel like the rule @johnno1962 proposed is more in line with what I'd expect the syntax to do.
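A rough sketch of what such a check might look like, written as a standalone helper rather than anything in the compiler. The function name `suspiciousOffsets` is hypothetical, and treating Unicode general category Cf (format characters, which includes U+200B ZERO WIDTH SPACE) as the set to reject is an assumption about what "zero-width" should cover:

```swift
// Hypothetical helper, not part of the compiler or the proposal.
// Returns the scalar offsets of format characters (general category Cf)
// that sit directly next to a quote or a # in the given source text.
func suspiciousOffsets(in source: String) -> [Int] {
    let scalars = Array(source.unicodeScalars)
    let delimiterParts: Set<Unicode.Scalar> = ["\"", "#"]
    var hits: [Int] = []

    for (i, scalar) in scalars.enumerated()
    where scalar.properties.generalCategory == .format {
        // Look at the scalar on each side of the invisible character.
        let touchesBefore = i > 0 && delimiterParts.contains(scalars[i - 1])
        let touchesAfter = i + 1 < scalars.count && delimiterParts.contains(scalars[i + 1])
        if touchesBefore || touchesAfter {
            hits.append(i)
        }
    }
    return hits
}

// The zero-width space is written as an escape here so it stays visible.
let sample = "print(##\"Validating...\"\u{200B}##)"
print(suspiciousOffsets(in: sample))
```

A real implementation would presumably live in the lexer and report a diagnostic at the offending location; this only shows that the adjacency test itself is simple to state.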

1 Like

I might expect it to be a warning, perhaps. I'm not convinced that the string terminator should ever not terminate the string.

(Is there any context in Swift in which it is legal for two strings to be immediately adjacent, e.g. "foo""bar", or immediately followed by a compiler directive, e.g. "foo"#bar()?)

A similar situation occurs in something like """foo "bar"""", where if you have more than 3 "s at the end of the string one might expect the string to be delimited by the final three quotes rather than the first.

1 Like

That gives error: multi-line string literal closing delimiter must begin on a new line, so I don't think there’s a precedent there to follow one way or the other.

In my """ example above, the zero-width space I added is neither in the non-printable category nor is it outside the string. You can construct a similar example with:

print(##"Validating..."​##); try validate(password); print(#​#"Password is valid!"##)

As long as the delimiter is more than one character, you can split it with something invisible and the fake separator becomes part of the string, along with the code between the two strings.

This is more subtle with multiline strings though, because otherwise the code to disable must be on the same line to avoid a syntax error. Note how I had to put everything on one line in this last example.

It seems hard to reason about if the string delimiter doesn't necessarily close the string. This seems like a largely theoretical concern, since I don't know in what context you would be trying to write #" ###""### "#, so I don't think it is worth complicating the implementation and mental model.
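For what it's worth, under the rule as proposed that text can still be written by choosing a delimiter with more # characters than any run that follows a quote in the content. A minimal sketch, assuming a compiler that implements SE-0200:

```swift
// The content contains a quote followed by three #, so a four-# delimiter
// is needed for the literal to close where intended.
let tricky = ####" ###""### "####

// Content: space, ###, two quotes, ###, space.
print(tricky)
print(tricky.count)
```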

I think it might also make mistakes and diagnostics more confusing to users, because accidentally closing your string with too many # characters will wrap the rest of the code in the file in a string, and probably give you non-local errors instead of an error pointing right to the stray #. The current multi-line string implementation, for example, gives you error: unterminated string literal (pointing at the start of the string) and error: expected '{' at end of brace statement (pointing at the end of the file) if you fail to terminate it. This might be possible to improve with some heuristics, but it's inherently difficult because someone might be using raw strings to hold Swift code, which makes it hard or impossible to know where the end should be.

3 Likes

While any potential security problem has to be taken seriously, the concern here seems a little overblown. If you’re editing in Xcode, the syntax highlighter isn’t fooled for a second, as it uses the same code as the compiler.
This problem isn’t really related to the topic at hand though and is a weakness of any multiple character delimiter.

After more thought, I support the change to termination processing that was put forward, as #" "######" "# seems to be something people expect to work. As for confusing people: the error messages you get when you try to use this string without the change are more confusing than those generated by accidentally adding an extra # to a string that should terminate. What tips it for me is that, of the two interpretations, one is what people naively expect and the other could never be valid Swift, so why shouldn’t the compiler choose the interpretation that compiles?

I guess the Core Team can make the call on this one.

1 Like