SE-0200: "Raw" mode string literals

(Saagar Jha) #78

I don’t think I’m getting you. What makes raw string literals different from file or color or image literals? As far as I’m concerned, they’re all pseudo-functions that convert a string of characters in the source file into a literal type.

(James Froggatt) #79

#rawString(delimiter: "###", ###some.code("hello")###)
and
'delimiter'some.code("hello")'delimiter'
are fundamentally just different spellings of the same thing. They’re literals, they have magic rules which allow a string to use a custom delimiter.

Currently, #___Literal is used when true literals cannot be represented in text. The actual appearance of the literal is left up to the IDE. The textual representation can be verbose, since most users won’t see it.

Here, #rawString is the literal, what users will be expected to manually write out, and later read. This should be the main consideration in finding a spelling.

This isn’t just a function with custom UI, this is a fundamental syntax which needs language support.

For this reason, I think we shouldn’t be looking to #___Literal spellings at all, but rather existing, first-class literals.

(Saagar Jha) #80

I’d actually say that 'delimiter'some.code("hello")'delimiter' needs #rawString around it as well. What it’s representing is not something that can be represented in normal text. The Swift compiler is still performing a transformation from 'delimiter'some.code("hello")'delimiter' to some.code("hello"), just as #colorLiteral(red: 0, green: 0, blue: 0, alpha: 0) ends up becoming NSColor(srgbRed: 0.0, green: 0.0, blue: 0.0, alpha: 0.0). “Text” and “raw strings” look very similar, but they’re not the same thing because raw strings are meaningless without a delimiter.

(James Froggatt) #81

I’m afraid I can’t follow your reasoning.

The difference here is that #colorLiteral(red: 0, green: 0, blue: 0, alpha: 0) could very well just be a function, implemented differently based on platform. The IDE doesn’t need the # there to apply custom UI, although it helps resolve ambiguity. In other words, despite the name, these aren’t actually literals.

A raw string cannot be represented as a function, otherwise we wouldn’t need this proposal.

#82

One very simple solution is to extend the multiline string syntax to allow any number (at least 3) of double-quote characters as the delimiter:

let myVeryExcellentString = """"""""""""
    let address = """
        John Jacob Jingleheimer Schmidt
        555 Main Street
        Lake Wobegon, MN
        """
    print("\(address)")
    """"""""""""

If at least 4 double-quotes are used, then it is a raw string where nothing is escaped.
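The rule above could be sketched roughly as follows. This is only an illustration of the idea, not anything from the proposal: the function name `bodyOfQuoteDelimitedLiteral` is hypothetical, and it assumes the closing delimiter is the first later run of as many double quotes as the opening run.

```swift
// Hypothetical helper, not part of any proposal: reads a literal that opens
// with a run of N >= 3 double quotes and ends at the first later run of N
// double quotes. Four or more quotes are treated as a raw string.
func bodyOfQuoteDelimitedLiteral(_ source: String) -> (body: String, isRaw: Bool)? {
    let chars = Array(source)
    let quote: Character = "\""

    // Count the opening run of double quotes.
    var openCount = 0
    while openCount < chars.count, chars[openCount] == quote {
        openCount += 1
    }
    guard openCount >= 3 else { return nil }

    // Scan for the first run of openCount quotes after the opening run.
    var index = openCount
    while index + openCount <= chars.count {
        if chars[index..<(index + openCount)].allSatisfy({ $0 == quote }) {
            return (String(chars[openCount..<index]), openCount >= 4)
        }
        index += 1
    }
    return nil // no closing delimiter found
}

// Usage: a four-quote delimiter wrapped around text that itself contains """.
let delimiter = String(repeating: "\"", count: 4)
let inner = "say(\"\"\"hello\\n\"\"\")" // the text: say("""hello\n""")
let parsed = bodyOfQuoteDelimitedLiteral(delimiter + inner + delimiter)
print(parsed?.body ?? "no closing delimiter")
```

One consequence of this reading of the rule is that a body ending in a double quote would run into the closing delimiter, so the multi-line layout (closing delimiter on its own line) matters.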

#83

Trying to summarise this discussion, I would say there is an underlying tension between people who:

  • Think these strings will be rare, so are happy with some semi-verbose to verbose marker (#rawString, #rawStringLiteral, #stringLiteralWithoutEscaping(customDelimiter:, …), several dozen quotation marks in a row plus the string has to be multi-line, etc).
  • Think that these strings will be common, so want a concise marker (r"…", \"…", '…', etc).

#raw is probably somewhere in the middle there, but it’s unclear if that puts it in the Goldilocks zone or no man’s land.

As noted by several people, including the author, the original proposal didn’t really do itself any favours here because it lacked good examples, especially since it’s generally agreed that regular expressions deserve special attention of another form. The updated proposal has some more examples, which are somewhat helpful, but aren’t really definitive for me.

A related issue is the question of how many special string forms Swift should have. If you want several more, then a syntax that generalises would be preferred (e.g. verbose: #specialString(arguments:…), concise: r"…", s"…", t"…", …). If you think raw strings are about the last form needed then something simpler will suffice (e.g. verbose: #rawString(…), concise: \"…", '…' if it’s not reserved for character-like things).

Pure Bikeshedding: Raw Strings (why yes, again!)
(John Holdsworth) #84

Thanks for this. It is a very concise summary of exactly the status of this review.

I’ve not been able to get the new version of the proposal merged, so we’ll have to proceed as is. The bulk of the multi-line string review was discussed with reference to @brentdax’s excellent rewrite, which sadly was never merged.

The current version of the proposal I have PR’d is here now

As before if you can think of any worthwhile updates please file a PR.

There are very few changes with respect to the last revision, except that I have added that the proposal does not put forward custom delimiters due to their complexity.

#85

I am a strong -1 because of this.

Custom delimiters are a minimum requirement for in-source raw strings. The only other viable option is to externalize raw strings into their own files, and introduce syntax for assigning the text content of a file to a variable at compile-time.

The best possible outcome for this proposal now is that it gets returned for revision. Second-best would be “Rejected on the specifics, but the idea itself is not rejected.”

(Erica Sadun) #86

If @johnno1962 were to change that to allow a single-character custom delimiter as suggested upthread, namely:

#rawStringLiteral(X"unescaped raw string"X)

where X could be any Character chosen by the end-coder and known a priori to not be included in the literal string (with a similar version X""" and """X I suppose), would that flip your negative response?
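As a rough sketch of how such a single-character delimiter might be read, assuming the closing delimiter is the first double quote immediately followed by X (the function name below is hypothetical and only covers the single-line X"…"X form):

```swift
// Hypothetical helper, not part of any proposal: takes the text between the
// parentheses of #rawStringLiteral(...) and returns the unescaped body.
func bodyOfCharacterDelimitedLiteral(_ argument: String) -> String? {
    let chars = Array(argument)
    let quote: Character = "\""

    // The shortest possible form is X""X, which is four characters.
    guard chars.count >= 4 else { return nil }
    let delimiter = chars[0]
    guard delimiter != quote, chars[1] == quote else { return nil }

    // The body ends at the first double quote followed by the delimiter.
    var index = 2
    while index + 1 < chars.count {
        if chars[index] == quote, chars[index + 1] == delimiter {
            return String(chars[2..<index])
        }
        index += 1
    }
    return nil // no closing "X pair found
}

// Usage: * is the delimiter, so bare double quotes inside the body are fine.
let quoted = bodyOfCharacterDelimitedLiteral("*\"She said \"hi\".\"*")
print(quoted ?? "not a delimited literal")
```

The body may contain double quotes freely, but not a double quote directly followed by the chosen character, which is why the delimiter has to be known in advance not to occur that way.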

#87

Any Character? You mean like this one? C̷̙̲̝͖ͭ̏ͥͮ͟

I suppose one Character is technically sufficient, but it would take a lot to convince me that it is the best option.

(John Holdsworth) #88

Perhaps custom delimiters shouldn’t have been excluded so abruptly without explanation, but there are a few problems with including them in the proposal:

  • It opens up a distracting dimension of discourse about what the syntax should be, though those mentioned already upthread are perfectly fine as long as they allow a default delimiter.

  • It would be very difficult to implement in the Swift Lexer and IDE, a sure sign that it would be difficult to document and would increase the burden on the language.

  • A custom-delimited string is no longer readily recognisable as a string. This alone clinches it for me.

Besides, I’m struggling to think of a use case that wouldn’t be covered by raw multi-line literals, which have the very distinctive closing delimiter """).

#raw("""
	<X>
	""")

What would <X> be that wouldn’t be covered by this syntax? There is always the option of using a resource file.

In the single quoted case any character is allowed, including ", other than newline or the sequence ").

So, we really aren’t losing anything by excluding custom delimiters.

#89

John, additions to the Swift language are *permanent*. If we are going to add something, it is imperative that we thoroughly explore the design space when deciding how it should look and how it should function. Syntax is not a “distracting dimension”, it is equally as important as semantics.

That debate *also* needs to occur.

It depends on the rules for the delimiter.

Literally anything containing three consecutive quotes followed by a closing parenthesis. For example, a snippet of Swift code from a program which uses raw strings. I would go so far as to say that is one of the *driving* use-cases.
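To make that concrete, here is a minimal sketch of a scanner for the fixed-delimiter form, assuming the literal ends at the first occurrence of three double quotes followed by a closing parenthesis. The function name is hypothetical; it only illustrates the failure mode.

```swift
import Foundation

// Hypothetical helper, not part of any proposal: reads a #raw("""…""") literal
// by stopping at the first """) it finds after the opener.
func bodyOfRawBlock(_ source: String) -> String? {
    let opener = "#raw(\"\"\""
    let closer = "\"\"\")"
    guard source.hasPrefix(opener) else { return nil }

    let rest = source.dropFirst(opener.count)
    guard let end = rest.range(of: closer) else { return nil }
    return String(rest[..<end.lowerBound])
}

// The intended body is itself Swift code that uses a raw string.
let intendedBody = "print(#raw(\"\"\"\nhi\n\"\"\"))"
let wholeLiteral = "#raw(\"\"\"" + intendedBody + "\"\"\")"

// The scanner stops at the inner """), so the body comes back truncated.
let scannedBody = bodyOfRawBlock(wholeLiteral)
print(scannedBody == intendedBody) // false
```

With a fixed closing sequence there is no way to spell this body in source without leaving raw mode.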

…except for all the things that you *are* losing by excluding them.

• • •

I am thoroughly convinced that this pitch is not ready for primetime, and needs to go back to the discussion phase so the community can hash out a preferred direction.

There is no *rush* to get raw strings into the language, but there is a responsibility to get them *right*.

(John Holdsworth) #90

That may be but I personally think we can get to a decision this week either way. This is not a complex feature to design and we’re collecting many valuable opinions while we have people’s attention as a review.

If we exclude custom delimiters for now in the interests of simplicity it does not preclude adding them later if they use this syntax.

The real question is whether raw strings are sufficiently useful to include in the language at all rather than the design. We can get to a conclusion on this question quite quickly.

#91

Personally I think #rawString(delimiter:) marks a point where simply escaping individual characters becomes more attractive and readable.

#rawString(delimiter: "\"", "c:\windows\system")
vs
"c:\\windows\\system"
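For reference, the escaped spelling in that comparison is already valid Swift. A minimal sketch of what it evaluates to (the constant names are only for illustration):

```swift
// Each backslash in the path has to be written twice in a regular literal.
let escapedPath = "c:\\windows\\system"
print(escapedPath) // prints c:\windows\system

// The value itself holds single backslashes: 17 characters, two of them "\".
let backslashCount = escapedPath.filter { $0 == "\\" }.count
print(escapedPath.count, backslashCount)
```

The cost of escaping grows with the number of backslashes and quotes in the text, which is the trade-off against a verbose marker.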

I quite like @Erica_Sadun’s direction. Have #rawString recognize the first Character as the delimiter. For example: #rawString('lorem ipsum') would recognize ' as the delimiter. Admittedly I’m not sure how much ‘magic’ this would require on the compiler’s end.

#92

I don’t think I understand your example, which would really just be #rawString("c:\windows\system") under any version that I remember anyone proposing. You probably want to use an example where the default delimiter of " would require too much escaping inside the string.

#93

Yes, reading back through the discussion, it’s actually the same as @hartbit’s proposal.
I prefer #rawString over #raw, though.

A more specific example:
#rawString(*"I'm a panda," he says, at the door. "Look it up.", he continued*) with * being the delimiter.

(Xiaodi Wu) #94

Why the rush? Of course we can make a decision in a week, but can we make the best decision for Swift? Like @Nevin I don’t think that this idea is at that point.

(Josh Caswell) #95

Those aren’t really the same kind of literal. They’re not even literals, in fact. They’re a very big dose of sugar for a platform-specific object construction. A string literal is literal, literally; it’s WYSIWYG. Doubly so for this kind of “raw” literal, where escape sequences aren’t present.

(Moris Kramer) #96

How about using single quotes this way:

'some\raw\string' => some\raw\string

and for consistency:

'''
long raw "string"
''' 
=> long raw "string"

Also great for escaping double quotes

(Gwendal Roué) #97

How about using single quote like:

Good idea. But maybe we’d be happy using single quotes for character literals.