multi-line string literals.


(Ted van Gaalen) #1

  let xml = _"<?xml version="1.0"?>
              "<catalog>
              " <book id="bk101" empty="">
              " <author>\(author)</author>
              " </book>
              "</catalog>"_

If I am reading and understanding this correctly:
This solution is still delimiter-sensitive and
breaks if the "_ delimiter is found somewhere within the data,
because it is then interpreted as end-of-string.
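
A minimal sketch of that concern, assuming a literal opens with _" and ends at the first "_ that follows, with no way to escape the closing delimiter. The function name and the sample data are made up for illustration; this is not how the sketched `_"foo"_` feature is specified anywhere.

```swift
import Foundation

// Hypothetical scanner: assumes a literal opens with _" and ends at the
// FIRST "_ that follows, with no escape for the closing delimiter.
func scanDelimitedBody(_ source: String) -> String? {
    guard source.hasPrefix("_\"") else { return nil }
    let rest = source.dropFirst(2)
    guard let close = rest.range(of: "\"_") else { return nil }
    return String(rest[..<close.lowerBound])
}

// The data itself contains "_ , so the scan stops before the intended end.
let source = "_\"<note text=\"_draft\"/>\"_"
print(scanDelimitedBody(source) ?? "unterminated")
```

Under that assumption the body is cut off right after `text=`, which is the kind of silent truncation meant above.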

I already wrote about my solution,
which solves the above deficiency
because it does not use delimiters at all.

I have thought it all over and cleaned it up. Here it is again;
hopefully this description is clearer and more readable.

Data Line Operators.

For convenience, I call these \\ and \@ “data-line-operators”
(of course, they are pseudo-operators).
Other two-character combinations for these operators are also possible.

The \@ data-line-operator:
   - takes character data "as-is", without any conversion.
   - respects (includes) source-file line terminators.
   - includes all spaces in between and up to the end of the source line.
   - does not treat // as a comment; it is read as data.
   - on its own (with nothing else to its right) is implicitly an empty line.

The \\ data-line-operator:
   - converts escaped characters like \t and \n and performs \(var) substitution, as with "normal" string literals.
   - ignores trailing spaces and source-file line terminators.
   - respects // comments on the same line and does not include them as data.
   - on its own is interpreted as \n (line feed), thus (optionally) eliminating the
     need for \n usage.

Both operators allow 0…n spaces/tabs on their left side,
thus indentation is supported.
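
The rules above can be sketched as a small line decoder. This is only an illustration under stated assumptions: `decodeDataLine` is a hypothetical helper, not part of Swift or of any proposal; \(var) substitution is left out; and the `//` detection is naive, so data such as a URL would be cut.

```swift
import Foundation

// Hypothetical decoder for ONE source line, following the rules listed above.
// Returns nil when the line is not a data line. \(var) substitution is omitted.
func decodeDataLine(_ sourceLine: String) -> String? {
    // 0…n leading spaces/tabs are allowed before the operator.
    let body = sourceLine.drop(while: { $0 == " " || $0 == "\t" })

    if body.hasPrefix("\\@") {
        // \@ : take the rest as-is and keep the line terminator.
        return String(body.dropFirst(2)) + "\n"
    }
    guard body.hasPrefix("\\\\") else { return nil }

    // \\ : drop a trailing // comment, then trailing spaces/tabs.
    var text = String(body.dropFirst(2))
    if let comment = text.range(of: "//") {
        text = String(text[..<comment.lowerBound])
    }
    while let last = text.last, last == " " || last == "\t" {
        text.removeLast()
    }
    // \\ on its own stands for a line feed (a comment-only line is treated
    // the same way here; the description does not say what should happen).
    if text.isEmpty { return "\n" }

    // Convert the escapes \n, \t and \\; leave anything else untouched.
    var result = ""
    var characters = text.makeIterator()
    while let character = characters.next() {
        guard character == "\\", let escaped = characters.next() else {
            result.append(character)
            continue
        }
        switch escaped {
        case "n": result.append("\n")
        case "t": result.append("\t")
        case "\\": result.append("\\")
        default: result.append(character); result.append(escaped)
        }
    }
    return result
}
```

A real implementation would live in the lexer rather than in a string function; the sketch only shows how the two operators differ in what they keep.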

Example 1. The \@ operator:

   // 1. multi-line string literal with data lines as is.
         // It loads each line (part) up to and including the source-file line end:
         // you can use all available characters without problems,
         // even \\ and \@ thus allowing you to nest e.g. Swift statements...
        
         let xml =
                     \@<?xml version="1.0"?>
               \@ <catalog>
               \@ <book id="bk101" empty=“”> // this is not regarded as a comment.
               \@ <author>//¯\"_(ツ)_//</author>
               \@ </book>
                     \@ </catalog>

   Example 2, The \\ operator:
   // Multi-line string literal with data lines with \n \t etc. respected:

         var str =
                     \\This is line one.\nThis is line two, with a few \t\t\t tabs in it...
                     \\
                     \\This is line three: there are \(cars) // this is a comment.
                     \\ waiting in the garage. This is still line three

The first \@ or \\ must be on a new line. This is an error:

     let str = \@data data data data…...
     \@data………….

A block of \@ or \\ lines must be contiguous: with no other lines in-between.
An empty line or other source line implicitly ends the
\\ or \@ block of lines. There is no terminator.
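
The contiguity rule can be sketched like this, assuming the source is already split into lines. `dataLineBlock` is a hypothetical name used only for illustration.

```swift
// Returns the index range of the contiguous data-line block that starts at
// `start`. The block ends at the first line that does not begin (after
// optional spaces/tabs) with \@ or \\ ; there is no terminator to look for.
func dataLineBlock(in lines: [String], startingAt start: Int) -> Range<Int> {
    func isDataLine(_ line: String) -> Bool {
        let body = line.drop(while: { $0 == " " || $0 == "\t" })
        return body.hasPrefix("\\@") || body.hasPrefix("\\\\")
    }
    var end = start
    while end < lines.count, isDataLine(lines[end]) {
        end += 1
    }
    return start..<end
}

let sourceLines = ["let xml =", "    \\@<a>", "    \\@</a>", "", "print(xml)"]
print(dataLineBlock(in: sourceLines, startingAt: 1))
```

Here the empty line at index 3 ends the block, so only the two \@ lines belong to it.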

\@ and \\ lines can be mixed together in the same block.
Should this be allowed?

Imho this is even easier to understand, and simple.

I could make a proposal for this later in May.

@Vladimir: sorry I didn’t respond directly to your email.
You’re right, our ideas about this have some similarity.
Your point:
   "and I believe I can prove we need some start-of-line marker":
I think so too; that’s why I suggest the \\ and \@ data-line-operators
as described here.

Too busy, packing things to move to www.speyer.de
I will read swift-evolution, but will probably not
respond until after the 12th of May or so.

Although away, some feedback would be nice, thank you.

Kind Regards
TedvG

Fri, 29 Apr 2016 08:20:34 -0600 Erica Sadun wrote:

Did you ever really use multiline string literals before?

Yes. I used Perl in the CGI script era. Believe me, I have used every quoting syntax it supports extensively, including `'` strings, `"` strings, `q` strings, `qq` strings, and heredocs. This proposal is educated by knowledge of their foibles.

As outlined in the "Future directions for string literals in general" section, I believe alternate delimiters (so you can embed quotes) are a separate feature and should be handled in a separate proposal. Once both features are available, they can be combined. For instance, using the `_"foo"_` syntax I sketch there for alternate delimiters, you could say:

  let xml = _"<?xml version="1.0"?>
              "<catalog>
              " <book id="bk101" empty="">
              " <author>\(author)</author>
              " </book>
              "</catalog>"_

Other than the underscores (I'm not sold on them but I could live with them), this is my favorite approach:

* It supports indented left-hand alignment, which is important to me for readability
* It avoids painful `\n"+` RHS constructions

                                             ^ What are RHS constructions?

On Apr 28, 2016, at 4:52 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

* It's easy to scan and understand
* It's simple and harmonious
                            
— E


(Brent Royal-Gordon) #2

Example 1. The \@ operator:

   // 1. multi-line string literal with data lines as is.
         // It loads each line (part) up to and including the source-file line end:
         // you can use all available characters without problems,
         // even \\ and \@ thus allowing you to nest e.g. Swift statements...
        
         let xml =
                     \@<?xml version="1.0"?>
               \@ <catalog>
               \@ <book id="bk101" empty=“”> // this is not regarded as a comment.
               \@ <author>//¯\"_(ツ)_//</author>
               \@ </book>
                     \@ </catalog>

   Example 2, The \\ operator:
   // Multi-line string literal with data lines with \n \t etc. respected:

         var str =
                     \\This is line one.\nThis is line two, with a few \t\t\t tabs in it...
                     \\
                     \\This is line three: there are \(cars) // this is a comment.
                     \\ waiting in the garage. This is still line three

There are a lot of reasons why I don't like these.

The first is simply that I think they're ugly and don't look like they have anything to do with string literals, but that's solvable. For instance, we could modify my proposal so that, if you were using continuation quotes, you wouldn't have to specify an end quote:

         let xml =
                     "<?xml version="1.0"?>
               " <catalog>
               " <book id="bk101" empty=“”> // this is not regarded as a comment.
               " <author>//¯\"_(ツ)_//</author>
               " </book>
                     " </catalog>

So let's set the bikeshed color aside and think about the deeper problem, which is that line-oriented constructs like these are a poor fit for string literals.

A string literal in Swift is an expression, and the defining feature of expressions is that they can be nested within other expressions. We've been using examples where we simply assign them to variables, but quite often you don't really want to do that—you want to pass it to a function, or use an operator, or do something else with it. With an ending delimiter, that's doable:

  let xmlData =
                     "<?xml version="1.0"?>
               " <catalog>
               " <book id="bk101" empty=“”> // this is not regarded as a comment.
               " <author>//¯\"_(ツ)_//</author>
               " </book>
                     " </catalog>".encoded(as: UTF8)

But what if there isn't a delimiter? You wouldn't be able to write the rest of the expression on the same line. In a semicolon-based language, that would merely lead to ugly code:

  let xmlData =
                     "<?xml version="1.0"?>
               " <catalog>
               " <book id="bk101" empty=“”> // this is not regarded as a comment.
               " <author>//¯\"_(ツ)_//</author>
               " </book>
                     " </catalog>
               .encoded(as: UTF8);

But Swift uses newlines as statement terminators, so that isn't an option:

  let xmlData =
                     "<?xml version="1.0"?>
               " <catalog>
               " <book id="bk101" empty=“”> // this is not regarded as a comment.
               " <author>//¯\"_(ツ)_//</author>
               " </book>
                     " </catalog>
               .encoded(as: UTF8) // This may be a different statement!

You end up having to artificially add parentheses or other constructs in order to convince Swift that, no, that really is part of the same statement. That's not a good thing.

(This problem of fitting in well as an expression is why I favor Perl-style heredocs over Python-style `"""` multiline strings. Heredoc placeholders work really well even in complicated expressions, whereas `"""` multiline strings split expressions in half over potentially enormous amounts of code. This might seem at odds with my support for the proposal at hand, but I imagine this proposal being aimed at strings that are a few lines long, where a heredoc would be overkill. If you're going to have two different features which do similar things, you should at least make sure they have different strengths and weaknesses.)

But you could argue that it's simply a matter of style that, unfortunately, you'll usually have to assign long strings to constants. Fine. There's still a deeper problem with this design.

You propose a pair of multi-line-only string literals. One of them supports escapes, the other doesn't; both of them avoid the need to escape quotes.

Fine. Now what if you need to disable escapes or avoid escaping quotes in a single-line string? What if your string is, say, a regular expression like `"[^"\\]*(\\.[^"\\]*)*+"`—something very short, but full of backslashes and quotes?
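
To make that concrete: written as an ordinary Swift string literal, every quote and every backslash in that pattern needs its own escape. A small sketch (the constant names are illustrative):

```swift
// The regular expression from the paragraph above, written as an ordinary
// escaped string literal: each " becomes \" and each \ becomes \\.
let quotedStringPattern = "\"[^\"\\\\]*(\\\\.[^\"\\\\]*)*+\""

// Count the characters that had to be escaped to get here.
let quoteCount = quotedStringPattern.filter { $0 == "\"" }.count
let backslashCount = quotedStringPattern.filter { $0 == "\\" }.count

print(quotedStringPattern)
print("quotes: \(quoteCount), backslashes: \(backslashCount)")
```

Every one of those quotes and backslashes costs an extra backslash in the source, which is the readability problem being described.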

The constructs you propose are very poorly suited for that—remember, because they're line-oriented, they don't work well in the middle of a more complicated expression—and they aren't built on features which generalize to act on single-line strings. So now we have to invent some separate mechanism which does the same thing to single-line strings, but works in a different and incompatible way. That means we now have five ad-hoc features, each of which works differently, with no way to transport your knowledge from one of them to another:

* Single-line strings
* Disabling escapes for single-line strings
* Unescaped quotes for single-line strings
* Multi-line-only strings with unescaped quotes
* Multi-line-only strings with unescaped quotes and disabled escapes

My proposal and the other features I sketch, on the other hand, do the same things with only three features, which you can use in any combination:

* Single- or multi-line strings
* Disabling escapes for any string
* Unescaped quotes for any string

This kind of modular design, where a particular task is done in the same way throughout the language, is part of what makes a good language good.


--
Brent Royal-Gordon
Architechies


(Ted van Gaalen) #3

Hi Brent,

@Dave - Hi Dave, please see at the end of this email.

Thanks for your energetic reply, Brent.
First of all, I think there is a place in Swift for both “your” and “my” proposed solutions
together. I will reply further inline ->

Example 1. The \@ operator:

   // 1. multi-line string literal with data lines as is.
        // It loads each line (part) up to and including the source-file line end:
        // you can use all available characters without problems,
        // even \\ and \@ thus allowing you to nest e.g. Swift statements...

        let xml =
                    \@<?xml version="1.0"?>
               \@ <catalog>
               \@ <book id="bk101" empty=“”> // this is not regarded as a comment.
               \@ <author>//¯\"_(ツ)_//</author>
               \@ </book>
                    \@ </catalog>

  Example 2, The \\ operator:
  // Multi-line string literal with data lines with \n \t etc. respected:

        var str =
                    \\This is line one.\nThis is line two, with a few \t\t\t tabs in it...
                    \\
                    \\This is line three: there are \(cars) // this is a comment.
                    \\ waiting in the garage. This is still line three

There are a lot of reasons why I don't like these.

The first is simply that I think they're ugly

To me, that is not relevant:
Whether something is “ugly” or not is tied to different and unique
personal references, accumulated through experience and
human instinct, and these are not comparable because no two
beings are the same. Ergo: this voids a discussion between
you and me about the subjective aspect of “ugliness”.
Still, this aspect should be subordinate to functionality
(in this case, the functionality of a programming language).

and don't look like they have anything to do with string literals,

You’re right about that: they are not *string literals* but *data lines*.

but that's solvable. For instance, we could modify my proposal so that, if you were using continuation quotes, you wouldn't have to specify an end quote:

        let xml =
                    "<?xml version="1.0"?>
               " <catalog>
               " <book id="bk101" empty=“”> // this is not regarded as a comment.
               " <author>//¯\"_(ツ)_//</author>
               " </book>
                    " </catalog>

Essentially the above is similar to what I propose with “my” data line,
that is, starting each line with a special character/token. But that is where the
similarity ends: It does not offer processing for *both*:
- "as-is" character data
- character data where escaped characters need to be processed.
   

So let's set the bikeshed color aside and think about the deeper problem, which is that line-oriented constructs like these are a poor fit for string literals.

A string literal in Swift is an expression, and the defining feature of expressions is that they can be nested within other expressions.

That is logically correct (but is it desirable in all cases, given readability?). However:

As said before, these \@….. and \\….. are data lines, not string literals.
They are not intended to replace string literals; it is a slightly different concept.

Data lines are just that and, apart from assignment (that is, being loaded into a String variable or constant),
are not really intended to take further part in expressions.
However, you can still do that, as shown further below.

We've been using examples where we simply assign them to variables, but quite often you don't really want to do that—you want to pass it to a function, or use an operator, or do something else with it. With an ending delimiter, that's doable:

  let xmlData =
                    "<?xml version="1.0"?>
               " <catalog>
               " <book id="bk101" empty=“”> // this is not regarded as a comment.
               " <author>//¯\"_(ツ)_//</author>
               " </book>
                    " </catalog>".encoded(as: UTF8)

    I do experience this as being “ugly”, but again, this is personal.

But what if there isn't a delimiter? You wouldn't be able to write the rest of the expression on the same line. In a semicolon-based language, that would merely lead to ugly code:

  let xmlData =
                    "<?xml version="1.0"?>
               " <catalog>
               " <book id="bk101" empty=“”> // this is not regarded as a comment.
               " <author>//¯\"_(ツ)_//</author>
               " </book>
                    " </catalog>
               .encoded(as: UTF8);

In “my” case this would be:

   let xml =
                     \@<?xml version="1.0"?>
               \@ <catalog>
               \@ <book id="bk101" empty=“”> // this is not regarded as a comment.
               \@ <author>//¯\"_(ツ)_//</author>
               \@ </book>
                     \@ </catalog>
                          .encoded(as: UTF8)
  

But Swift uses newlines as statement terminators, so that isn't an option:

  let xmlData =
                    "<?xml version="1.0"?>
               " <catalog>
               " <book id="bk101" empty=“”> // this is not regarded as a comment.
               " <author>//¯\"_(ツ)_//</author>
               " </book>
                    " </catalog>
               .encoded(as: UTF8) // This may be a different statement!

That is *not* the case with the \@ data line concept, because the
line ending is part of the data line and is therefore “swallowed”
by the compiler, generating an intermediate variable before it continues
with whatever might be waiting after that.

You end up having to artificially add parentheses or other constructs in order to convince Swift that, no, that really is part of the same statement. That's not a good thing.

I think you are wrong here; () are not necessary. Whitespace and line ends are allowed in between, for example:

     let ar = [4,5,6,3,2]
          .sort()

is perfectly ok.
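
For readers trying this today, a minimal check, assuming a current Swift toolchain where the non-mutating call is spelled `sorted()` (in Swift 2, `sort()` returned the sorted copy). The variable name is illustrative.

```swift
// A line that begins with "." is parsed as a continuation of the
// expression on the previous line, so no parentheses are needed.
let numbers = [4, 5, 6, 3, 2]
    .sorted()

print(numbers)
```

This only shows the continuation rule for ordinary expressions; how it would interact with data lines is part of what is being discussed here.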

and so would even be:
let strArray = [
    \@………………..
    \@…………
    \@………………..
,
    \\………………..
    \\………………..
    \\…………
    \\………………..
    \\………………..
,
    \@…………………
] .sort()

On 30.04.2016, at 09:43, Brent Royal-Gordon <brent@architechies.com> wrote:

(This problem of fitting in well as an expression is why I favor Perl-style heredocs over Python-style `"""` multiline strings. Heredoc placeholders work really well even in complicated expressions, whereas `"""` multiline strings split expressions in half over potentially enormous amounts of code. This might seem at odds with my support for the proposal at hand, but I imagine this proposal being aimed at strings that are a few lines long, where a heredoc would be overkill. If you're going to have two different features which do similar things, you should at least make sure they have different strengths and weaknesses.)

But you could argue that it's simply a matter of style that, unfortunately, you'll usually have to assign long strings to constants. Fine. There's still a deeper problem with this design.

You propose a pair of multi-line-only string literals. One of them supports escapes, the other doesn't; both of them avoid the need to escape quotes.

Right.

Fine. Now what if you need to disable escapes or avoid escaping quotes in a single-line string? What if your string is, say, a regular expression like `"[^"\\]*(\\.[^"\\]*)*+"`—something very short, but full of backslashes and quotes?

No problem: e.g.

         foo(parm1:
             \@"[^"\\]*(\\.[^"\\]*)*+"
            .trim(), // Add this if you wish to remove the trailing spaces and line end (String extension)
            parm2: 10,
            parm3: anotherVar)

(I like to use lots of vertical space.)

The constructs you propose are very poorly suited for that—remember, because they're line-oriented,
they don't work well in the middle of a more complicated expression—

As described they are not intended to be there, but still they can: see function in the following example:

and they aren't built on features which generalize to act on single-line strings.

? You can perfectly well process a single data line (single-line string) like so:
     func f()
     {
          var fooRes = foo (
                                         \@"[^"\\]*(\\.[^"\\]*)*+"
                                         .trim() ).yetAnotherFunction()
    }

So now we have to invent some separate mechanism which does the same thing to single-line strings, but works in a different and incompatible way. That means we now have five ad-hoc features, each of which works differently, with no way to transport your knowledge from one of them to another:

* Single-line strings
* Disabling escapes for single-line strings
* Unescaped quotes for single-line strings
* Multi-line-only strings with unescaped quotes
* Multi-line-only strings with unescaped quotes and disabled escapes

My proposal and the other features I sketch, on the other hand, do the same things with only three features, which you can use in any combination:

* Single- or multi-line strings
* Disabling escapes for any string
* Unescaped quotes for any string

Yes, they do, but the solution you propose is still delimiter-sensitive
and therefore prone to data errors. You can’t get around that.

This kind of modular design, where a particular task is done in the same way throughout the language, is part of what makes a good language good.

In most cases, yes, but different methods, e.g. those preferred by FP programmers and those preferred by OOP programmers, can and should happily co-exist
in Swift. In this case, this proposal of yours is not bad at all. Imho, neither is mine.
They both serve different purposes and programming styles.
They can exist together, as they are on different wavelengths, so to speak.
One is then free to use whatever best fits a certain programming inclination.

@Dave also
what do you think about this?
I am trying to avoid the conclusion that most Swift-evolution participants are very much FP-biased.
Is this the case?
(Some have voiced this before, in different ways.)
That wouldn’t be too much of a problem were it not for the (hopefully wrong) impression that
functional programmers think (like the LISP-ers did in the seventies) that they are superior and Mathematically Correct,
that there is no other way, and that therefore all else is hopelessly wrong and should be recklessly removed (from Swift in this case),
like removing language elements that are not in line with FP.

Consider for a moment that Swift-Evolution were OOP-dominated
and therefore happily removing closures/lambdas and protocols,
because its participants had never used them or did not even understand them.
Would you accept that?

(protocols in Swift are cool btw, as for me,
I take the best of both worlds whether FP and/or OOP)

Have a nice weekend!
TedvG

--
Brent Royal-Gordon
Architechies


(Dave Abrahams) #4

@Dave also
what do you think about this?
I am trying to avoid the conclusion that most Swift-evolution
participants are very much FP biased. is this the case?

I don't believe so.

(Some have voiced this before, in different ways.)
That wouldn’t be too much of a problem were it not for the (hopefully
wrong) impression that:

- Functional Programmers think (like the LISP-ers did in the
seventies) that they are superior and Mathematically Correct, there is
no other way, and therefore all else is hopelessly wrong and should be
recklessly removed (from Swift in this case) like removing language
elements that are not in line with FP?

The things we've removed from the language, and the reasons we've
removed them, don't have anything to do with FP. Personally I'm
ambivalent about many of the removals, but as I understand the rationale it is
mostly about eliminating some error-prone `C' heritage, and that makes
sense to me.

Consider for a moment that Swift-Evolution were OOP-dominated
and therefore happily removing closures/lambdas and protocols,
because its participants had never used them or did not even understand them.
Would you accept that?

(protocols in Swift are cool btw, as for me,
I take the best of both worlds whether FP and/or OOP)

Protocols have nothing in particular to do with FP either.


on Sat Apr 30 2016, "Ted F.A. van Gaalen via swift-evolution" <swift-evolution@swift.org> wrote:

--
Dave