SE-0200: Enhancing String Literals Delimiters to Support Raw Text

edit: I just realized I missed the review window; my apologies for the noise.

original belated review
  • What is your evaluation of the proposal?

Strong +1.

(I support the suggestion to describe this as "customized delimiter/escape" rather than "raw strings", to avoid confusion.)

  • Is the problem being addressed significant enough to warrant a change to Swift?

Yes, delimiter/escape control is the next logical extension to string literals and important to the language.

  • Does this proposal fit well with the feel and direction of Swift?

This is the most Swifty solution to this problem I've seen. It generalizes delimiters/escapes into a simple and logical approach, in which the current syntax (the common case) is just the zero-# form.

This is an elegant extension, as @jrose mentioned. It was worth the wait and long threads to arrive at this solution.

  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

This compares favorably to any other approach I've seen. Delimiters are symmetric and balanced, escapes are obvious and supported. All this without requiring a new kind of literal.

  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I read the earlier iterations and relevant threads over the years, and investigated other languages.
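To make the "zero-#" generalization mentioned in this review concrete, here is a minimal sketch of the delimiter syntax as it shipped in Swift 5. The variable names are illustrative only and do not come from the proposal text.

```swift
// With zero '#' characters this is today's ordinary string literal:
// backslash escapes and \(…) interpolation are active.
let plain = "Line\\n \"quoted\""

// With one '#' on each side, a bare backslash and a bare quote are literal text.
let custom = #"Line\n "quoted""#

// The escape delimiter grows with the string delimiter:
// \#( … ) interpolates, while \( … ) is kept verbatim.
let name = "World"
let greeting = #"Hello, \#(name)! \(name)"#

// Adding another '#' lets the text itself contain the sequence "#.
let nested = ##"contains "# inside"##

print(plain == custom)  // true
print(greeting)         // Hello, World! \(name)
print(nested)           // contains "# inside
```

The delimiters stay symmetric at every level: the number of `#` characters that opens the literal is the number that closes it and the number that follows a backslash to form an escape.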

1 Like

A revised rule that works out as you suggest might indeed be superior. However, as you point out, it strictly takes what would be invalid code today and makes it valid, so there is no reason it has to be considered now.

Moreover, a similar situation applies to multi-line literals, as Joe Groff points out. For that matter, since this proposal is a generalization of existing string literal syntax, the same rule ought to apply to them as well, and in that case it would also require another rule for the opening delimiter:

" Hello, #" #### "# World! "

This is all to say that I believe such a change ought to be considered not as a last-minute change here but as a standalone proposal. It should apply consistently to all types of string literal, and the consequences of the change (such as the security issues we’ve just discussed) should be considered more thoroughly along with any necessary mitigations.

It is not enough that one IDE can highlight the code correctly. That is not in question. The issue is that readers have absolutely no hope of parsing this correctly, and tools that don’t rely on compiler facilities to highlight code may or may not be able to help the reader. (Nor, mind you, is syntax highlighting alone good enough as the sole defense against a security issue; color changes alone—even if consistently rendered in all contexts—are insufficient as indicators of critical information.)

This issue applies to all multi-character string delimiters, and so does the suggested change in parsing rules, as I discuss above. They are inextricably linked and ought to be considered together. Just because we already have a security issue with multiline string literals doesn’t mean that we ought to extend it further.

That said, I can see a straightforward mitigation: without resorting to errors or warnings, non-printing characters should be ignored when parsing string delimiters (or almost anything else in Swift, for that matter). For example, Apple documentation inserts invisible optional line breaks between words in camel-case method names for better line breaking. It should be possible to copy and paste method names from the documentation into one’s code and have the optional line breaks ignored, although it’d be important then not to break the line there.

Again, I think these issues are well deserving of their own review, as it’s a sufficiently large topic and strictly an enhancement to this proposal.

3 Likes

Except that we have a review open (just?) for which this is relevant. There doesn’t seem to be general support for the idea anyway so we can quickly reach a decision point and move on. The security issues need to be addressed though, at length but separately as a bug in the current implementation. I don’t see the two questions being that tightly coupled.

Having looked at what would be involved in trying to accommodate zero-width characters inside a delimiter, I’d recommend not trying to ignore them, but instead checking for any shenanigans and raising an error.

If this is true then surely it's just an artefact of the current implementation that should be improved. It seems inherently easier to diagnose and point directly at a stray/extra # outside of a string literal than it is to try to find where a user accidentally wrote an incorrect delimiter and didn't close a multiline string.

Fair enough, I’ve given up on the “no # after closing delimiter” idea and added a diagnostic. We’re beginning to look at the security problem in the PR to see if we can find a solution to put forward. The first step is to find a way to determine whether a given Unicode code point is zero-width/invisible, and the ICU library doesn’t seem to have an API for this. This list needs to be complete. One approach is to render attributed strings to determine the set ahead of time. Does anybody know a better way?
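For illustration only, and not something proposed in this thread: one rough approximation is to consult Unicode character properties rather than rendering text. The sketch below assumes the `Unicode.Scalar.Properties` API that the standard library gained in Swift 5, which postdates this discussion. The helper names `isLikelyInvisible` and `firstInvisibleScalar` are hypothetical, and the check is not claimed to produce the complete list asked for above.

```swift
// Hypothetical helper: treats a scalar as "likely invisible" if Unicode marks it
// Default_Ignorable_Code_Point or gives it the general category Cf (format).
// This is an approximation, not a complete list of invisible characters.
func isLikelyInvisible(_ scalar: Unicode.Scalar) -> Bool {
    let properties = scalar.properties
    return properties.isDefaultIgnorableCodePoint
        || properties.generalCategory == .format
}

// Hypothetical helper: returns the position (counted in Unicode scalars) and
// value of the first suspicious scalar in a piece of text, or nil if none is found.
func firstInvisibleScalar(in text: String) -> (offset: Int, scalar: Unicode.Scalar)? {
    for (offset, scalar) in text.unicodeScalars.enumerated() where isLikelyInvisible(scalar) {
        return (offset, scalar)
    }
    return nil
}

// A closing delimiter with a ZERO WIDTH SPACE (U+200B) hidden between " and #.
let tampered = "\"\u{200B}#"
let clean = "\"#"

if let hit = firstInvisibleScalar(in: tampered) {
    print("invisible scalar at offset \(hit.offset)")
}
print(firstInvisibleScalar(in: clean) == nil)
```

A check like this would support the "raise an error" approach: the lexer could point directly at the offending scalar instead of silently skipping it.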

I think @xwu is right about this being a more general issue that should be tackled holistically. For example, as far as I know identifiers are still not even normalised yet, so there are much more fundamental issues here:

let café = 1
let café = 2
print(café) // 1
print(café) // 2

There would ideally be a consistent set of normalisation/parsing rules that deals with these kinds of issues (e.g. should zero-width/invisible characters be uniformly ignored?). There have already been several discussions about these issues, and @xwu mentions a draft proposal in the PR you linked.
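As a small sketch of why un-normalised identifiers can diverge, assuming the two spellings of café in the example above differ only in Unicode normalisation form (precomposed versus decomposed): Swift's `String` comparison treats canonically equivalent text as equal, while the underlying scalars and bytes differ. The variable names here are illustrative.

```swift
// "é" written as a single precomposed scalar, U+00E9.
let precomposed = "caf\u{E9}"
// "é" written as "e" followed by U+0301 COMBINING ACUTE ACCENT.
let decomposed = "cafe\u{301}"

// String equality uses canonical equivalence, so these compare equal...
print(precomposed == decomposed)                         // true

// ...even though the scalar sequences and UTF-8 bytes are different.
print(precomposed.unicodeScalars.count)                  // 4
print(decomposed.unicodeScalars.count)                   // 5
print(Array(precomposed.utf8) == Array(decomposed.utf8)) // false
```

A lexer that compares identifiers by their raw bytes would see two distinct names here, which is consistent with the behaviour shown in the example.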

2 Likes

Hello all,

This proposal has been accepted. Thank you, everyone, for participating! [Accepted] SE-0200: Enhancing String Literals Delimiters to Support Raw Text - #2

Doug

6 Likes

The implementation has been merged and is available in the swift.org nightlies if you want to help find some bugs.

I guess we can finally close this radar!
http://www.openradar.me/17970377

TTFN

8 Likes