SE-0200: "Raw" mode string literals


#92

I don’t think I understand your example, which would really just be #rawString("c:\windows\system") under any version that I remember anyone proposing. You probably want to use an example where the default delimiter of " would require too much escaping inside the string.


#93

Yes it’s actually the same as @hartbit’s proposal, reading back the discussion. :slight_smile:
I prefer #rawString over #raw, though.

A more specific example:
#rawString(*"I'm a panda," he says, at the door. "Look it up.", he continued*) with * being the delimiter.
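
To make the escaping cost concrete, here is how the two examples above look as ordinary Swift string literals. This is only a sketch using existing escape syntax; the constant names are made up for illustration.

```swift
// The Windows path: every backslash has to be doubled.
let path = "c:\\windows\\system"

// The quotation: every embedded double quote has to be escaped.
let quote = "\"I'm a panda,\" he says, at the door. \"Look it up.\", he continued"

// A multi-line literal avoids escaping the quotes, but backslashes
// would still need doubling inside it.
let quoteMultiline = """
"I'm a panda," he says, at the door. "Look it up.", he continued
"""

print(path)
print(quote)
print(quote == quoteMultiline)
```

The path needs two extra characters and the quotation four, which is the kind of noise a raw or custom-delimited literal is meant to remove.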


(Xiaodi Wu) #94

Why the rush? Of course we can make a decision in a week, but can we make the best decision for Swift? Like @Nevin I don’t think that this idea is at that point.


(Josh Caswell) #95

Those aren’t really the same kind of literal. They’re not even literals, in fact. They’re a very big dose of sugar for a platform-specific object construction. A string literal is literal, literally; it’s WYSIWYG. Doubly so for this kind of “raw” literal, where escape sequences aren’t present.


(Moris Kramer) #96

How about using single quote this way:

'some\raw\string' => some\raw\string

and, for consistency:

'''
long raw "string"
''' 
=> long raw "string"

Also great for escaping double quotes


(Gwendal Roué) #97

How about using single quote like:

Good idea. But maybe we’d be happy using single quotes for character literals.


(Moris Kramer) #98

could both work?


(Gwendal Roué) #99

I’m not sure both could work, since 'a' could mean a raw string, or a character. But I didn’t link to the above pitch to diminish your idea. It was just an attempt at putting single quotes in perspective. We may need character literals as much as we need raw strings, and since Swift has a C-like syntax, single quotes would be excellent candidates for character literals. Now, I’m open to all suggestions :-)


(Moris Kramer) #100

By both, I mean it could be interpreted as follows:
A single character is always a “raw” ASCII number, C style.
More than one character is a “raw” string.
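
The length-based rule above can be modeled roughly as follows. This is a hypothetical sketch, not an existing Swift feature: the `SingleQuotedLiteral` type and the `interpret` function are invented here purely to illustrate the idea.

```swift
// Hypothetical model of the proposed rule for single-quoted literals.
enum SingleQuotedLiteral: Equatable {
    case asciiValue(UInt8)   // a single ASCII character becomes its number, C style
    case rawString(String)   // anything else is taken verbatim as a raw string
}

// `body` is the text between the single quotes, with no escape processing.
func interpret(_ body: String) -> SingleQuotedLiteral {
    if body.count == 1, let ascii = body.first?.asciiValue {
        return .asciiValue(ascii)
    }
    // Longer bodies (and single non-ASCII characters) fall through to raw strings.
    return .rawString(body)
}

print(interpret("a"))
print(interpret("some\\raw\\string"))
```

One consequence visible in the sketch: under this rule there is no way to write a one-character raw string, since a single ASCII character is always read as a number.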

Alternatively using ` for escaping chars or strings could work.


(Erica Sadun) #101

I find it a given that this would be useful to me in a variety of contexts. And providing a feature does not mandate that the user will use it well.

So yes. Like that one. Or like X, _, , or 💩.


(Michael Ilseman) #102

Thank you for providing justification. I am undecided and have a few clarifying questions regarding these reasons.

Could you elaborate a little more on this point? What about custom delimiters makes Swift tooling especially difficult? Is this an argument regarding inherent complexity of lexing (e.g. more lexer state) or incidental complexity of Swift’s current lexer (e.g. implementation limitations)?

Do you mean literally in source code or in the presentation of source code (e.g. by an IDE)? Very long multi-line string literals can also have this same effect when viewing a portion of literal source code (e.g. a diff).

Why do custom delimiters seem especially egregious to you? Why does this reasoning apply to them but not multi-line literals?

The ability to nest a literal inside another literal, e.g. as others upthread pointed out, a literal containing Swift source code. However, control over the delimiter would allow for careful nesting. (Not arguing pro/con, just a use case to help flesh out explicit rationale).

Note that the Swift compiler is fundamentally incapable of determining what is or is not a single grapheme at compile time, as that designation depends on the version of ICU present at run time. This is how Swift 4 apps get support for new emoji in new OSes. The Swift compiler attempts an approximation, often overly accepting of potential new emoji sequences.

While I enjoy the fact that emoji open up a whole new design space for custom delimiters (with snarky undertones), there’s a minor catch regarding grapheme-length restrictions.
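
A small illustration of the grapheme point, offered as a sketch: the number of Unicode scalars in a string is fixed by the source text, while the number of `Character` values is computed at run time by the grapheme-breaking rules available there. The constant name is illustrative.

```swift
// A family emoji written as explicit scalars: man, ZWJ, woman, ZWJ, girl.
let family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}"

// The scalar count depends only on the source text.
print(family.unicodeScalars.count)

// The grapheme (Character) count is decided at run time, so a
// "single grapheme" delimiter rule cannot be fully checked by the compiler.
print(family.count)
```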


(Erica Sadun) #103

Fair enough. I still think that one can find a single character ascii delimiter that will not be used in a string in nearly every circumstance. I also welcome other solutions and remain positive about the utility of this proposal regardless of its final form.


(John Holdsworth) #104

Hi Michael,

I don’t find custom delimiters egregious but they seem to me a feature on top of a feature at a time when the proposal is already struggling to wriggle under the complexity against benefit bar.

It’s appropriate that we discuss them now and come up with a couple of designs to make sure we don’t close off future directions the language could take but for me I don’t think they need to be considered an essential part of this proposal at this time.

There are two designs in play at the moment both of which I’m sympathetic to:

#raw(Xa stringX)
where X is an ASCII character, largely for technical reasons: as you mention, the compiler does not have the same support for segmenting graphemes as Swift itself. An opening delimiter character such as "(" would map to ")" at the end, and so on. This would not be too difficult to implement, but what would the syntax be for multi-line raw strings?

and
#raw(delimiter: "##", ##some code.print("hello")##)
…where the delimiter string defaults to "\"". This involves parsing a string in order to parse a string and, as you know as well as anyone, having worked on the existing lexer code with me, it would be more of a departure from that code. I am not at all keen on the idea of strings not being delimited by ", however. It gives users too much license and would be a step in the direction of Perl.
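
The first of these two designs, #raw(Xa stringX), can be sketched as a tiny scanner. This is an assumption-laden illustration, not the proposal's implementation: `closingDelimiter(for:)` and `rawContents(of:)` are hypothetical names, and the set of paired delimiters is a guess at what "etc." would cover.

```swift
// Hypothetical: maps an opening delimiter to its closing partner.
func closingDelimiter(for open: Character) -> Character {
    switch open {
    case "(": return ")"
    case "[": return "]"
    case "{": return "}"
    case "<": return ">"
    default:  return open   // any other character closes itself
    }
}

// `argument` is the text between `#raw(` and the final `)`.
// Returns the raw contents, or nil if the delimiters are malformed.
func rawContents(of argument: String) -> String? {
    // The delimiter must be a single ASCII character.
    guard let open = argument.first, open.isASCII else { return nil }
    let close = closingDelimiter(for: open)
    let body = argument.dropFirst()
    // The first occurrence of the closing delimiter ends the string.
    guard let end = body.firstIndex(of: close) else { return nil }
    // Nothing may follow the closing delimiter.
    guard body.index(after: end) == body.endIndex else { return nil }
    return String(body[..<end])
}

print(rawContents(of: "*\"I'm a panda,\" he says*") ?? "malformed")
print(rawContents(of: "(c:\\windows\\system)") ?? "malformed")
```

Because no escape processing happens, the chosen delimiter simply cannot appear inside the string, which is why the choice of X matters.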

Raw multi-line custom-delimited strings would suffer from the same problem, in terms of loss of context outside the IDE, as multi-line strings already do. Nothing specific there.

So in summary, in my own mind I built the case for not including custom delimiters as an essential part of the proposal on the following.

  • There is no doubt they would be more complex ranging from slightly to significantly. I’m just applying Occam’s Razor.

  • There is no real need for them, as we already have a syntax that accepts any character other than the sequence ") and newline.

  • We already have a couple of designs in hand that can still be introduced at a later date if this turns out to be a pressing requirement.

  • Custom delimiters are actually a bad thing in and of themselves in terms of code legibility though that’s just my personal opinion.

I find arguments about not being able to paste Swift code that uses raw strings into Swift code that uses raw strings a little contrived myself. It is like saying “give me a vaccine for disease A, provided the disease isn’t A.” This is all academic anyway: if this really is a requirement, you should be using a resource file and loading it from disk, given that the data is completely static.

We’re just trying to maximise what is possible without having to stand on our heads in terms of an implementation and without bringing unwarranted complexity into the language. Each new increment of complexity has to be justified in terms of its actual utility.


#105

There is a third design which several people have discussed in this thread, with no octothorpe involved.

It does not reflect well on a proposal author when they ignore the existence of alternatives that have been put forth.


(^) #106

hello i was summoned


(Michael Ilseman) #107

Care to share?

Is this third approach the triple-double-quote-custom one hidden under the triangle here? I don’t recall seeing further discussion of this approach, but it is a long thread.

Or do you mean this one? It appears actively at odds with your prior suggestion. There was discussion of this fourth option, however, and it seems like the discussion there died down (perhaps it was centered around old style r"" syntax).

Or do you mean the one using single quotes with an intermixed delimiter like this one? Would this be a distinct fifth option, or the same as the third?

Or do you mean the approach using single quotes without delimiters here? These fifth and sixth approaches do rely on claiming single quote syntax, which is ballooning the scope a bit, and are mentioned in the alternatives considered section.

Or do you mean the >4 double quote suggestion here? This seventh approach looks promising, but it can’t be the third design you’re talking about because I didn’t see several people discussing it.

Fortunately, ad-hominem attacks against the author are invalid criticisms of a proposal. Similarly, ad-hominem attacks against a reviewer are invalid criticisms of a review.

@johnno1962, could you update the alternatives considered section? What do you think about some of these alternatives?


(John Holdsworth) #108

I think I’ll let this thread run its course now and try to encapsulate it in the un-merged pseudo-proposal for the core team to evaluate at the end. There isn’t much more I can add unless anybody has a specific question.


(Ben Rimmington) #109

@johnno1962, how about a way to choose the escaping ASCII character (e.g. $ instead of \).

This might be in the form of some extra syntax immediately after the opening """ delimiter (which is currently disallowed by the SE-0168 rationale).

It could remove the need for custom delimiters, because a literal """ could still be escaped (e.g. $""" instead of \""").
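
As a rough sketch of what a caller-chosen escape character would mean, the function below processes escapes using whatever character is passed in. Everything here is hypothetical: `unescape(_:escape:)` is an invented helper, and the set of recognized escapes (n, t, and "anything else stands for itself") is an assumption made for illustration.

```swift
// Hypothetical: processes escapes in `text` using a caller-chosen escape character.
func unescape(_ text: String, escape: Character) -> String {
    var result = ""
    var iterator = text.makeIterator()
    while let ch = iterator.next() {
        // Ordinary characters, including backslashes, are copied through unchanged.
        guard ch == escape else { result.append(ch); continue }
        // A trailing escape character with nothing after it is kept as-is.
        guard let next = iterator.next() else { result.append(ch); break }
        switch next {
        case "n": result.append("\n")
        case "t": result.append("\t")
        default:  result.append(next)   // $" gives ", $$ gives $, and so on
        }
    }
    return result
}

// With $ as the escape character, backslashes no longer need doubling
// in the processed text, and a literal """ is written as $""".
print(unescape("c:\\windows\\system", escape: "$"))
print(unescape("$\"\"\"", escape: "$"))
```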


(John Holdsworth) #110

Interesting… but doesn’t it just shift the problem from \ to $? If you’re having trouble pasting in something that might contain """ or """), you have to look again at why you’re doing it.


#111

The entire class of possibilities involving in-line custom delimiters, with some rules about what constitutes a delimiter. You listed and linked to several of them.