[Accepted] SE-0168: Multi-Line String Literals


#1

Xiaodi, I think one thing you're neglecting is that users may never print out a multiline literal string at all. A string might never be printed or read by a human outside of the code it resides in. In this case it seems perfectly reasonable to ask that it be possible to format the string nicely in the code and disregard how it would actually be printed.

Even if we intended to print every string we used, I don't agree that a string's representation in code should be coupled to its appearance when printed. That seems like a needless restriction to impose on the language. The whole point of multiline strings is to be able to visually lay out strings as desired, independent of the editor. Allowing manual line breaks without introducing a newline is one more step toward completing this goal.

To respond to your specific question as to why soft wrap is insufficient: it either looks bad because the wrapped text is unindented, or it introduces ambiguity by indenting the wrapped text (is that a wrapped line or two separate lines?). As Adrian Zubarev pointed out, we wouldn't need multiline string literals at all if we were content with manually inserting "\n" in multiline strings and living with the soft wrapping. Evidently we are not content with that, so neither should we be content with having no way to break up long lines containing no newline character.

Finally, I see no reason why we should be fighting against this. It only makes multiline strings more capable. If you don't want to use manual wrapping of lines, you don't have to.

Regards,
Robert


(Xiaodi Wu) #2

Xiaodi, I think one thing you're neglecting is that users may never print
out a multiline literal string at all. A string might never be printed or
read by a human outside of the code it resides in. In this case it seems
perfectly reasonable to ask that it be possible to format the string nicely
in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen by
a human but one cannot insert literal newlines and would need elided ones
instead?

Even if we intended to print every string we used, I don't agree that a
string's representation in code should be coupled to its appearance when
printed. That seems like a needless restriction to impose on the language.

I disagree that it is needless. To me, it is a sine qua non and raison
d'être of literals. It is, after all, what the word "literal" means.

The whole point of multiline strings is to be able to visually lay out
strings as desired, independent of the editor.

I don't believe that's correct. The whole point of multiline strings is to
be able to use literal newlines, no more and no less. I disagree strongly
with the idea that a literal should support "laying out strings as
desired"--only representing strings literally.

Allowing manual line breaks without introducing a newline is one more step
toward completing this goal.

To respond to your specific question as to why soft wrap is insufficient:
it either looks bad because the wrapped text is unindented, or it
introduces ambiguity by indenting the wrapped text (is that a wrapped line
or two separate lines?). As Adrian Zubarev pointed out, we wouldn't need
multiline string literals at all if we were content with manually inserting
"\n" in multiline strings and living with the soft wrapping.

Again, I understand the whole point of multiline string literals to be
allowing the use of literal newlines. The part about not liking
soft-wrapping, afaict, was never discussed during review as a motivation. I
personally do not use soft wrapping, but I have no problem with other
people using it. Besides that, there's the option of not wrapping (my
personal choice) and the option of hard wrapping using concatenation.
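
For reference, the concatenation option mentioned above might look like the following. This is only a sketch; the text and the constant names are made up for illustration.

```swift
// Hard wrapping with concatenation: two independent single-line literals
// joined with +. The source spans two lines, the value does not.
let notice = "This sentence is long enough that it would run past the right margin, "
    + "so it is written as two literals joined with +."

// Check that no newline character ended up in the resulting string.
let noticeHasNewline = notice.contains(where: { $0.isNewline })
print(noticeHasNewline)
```

The cost is that every source line needs its own pair of quotes and an operator, which is the overhead the other side of this discussion wants to avoid.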

Evidently we are not content with that, so neither should we be content
with having no way to break up long lines containing no newline character.

Finally, I see no reason why we should be fighting against this. It only
makes multiline strings more capable. If you don't want to use manual
wrapping of lines, you don't have to.

This, IMO, is the wrong kind of reasoning. All additive features can be
summed up as "you don't have to use it if you don't like it." But that is
not the bar for including a feature. Each addition makes the language more
difficult to master and has the potential to make the resulting code more
difficult to read and understand. On balance, would this particular feature
hold its own weight? I don't think that avoiding the "it looks bad" issue
with soft-wrapping warrants such an addition.

···

On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:


(Erica Sadun) #3

The most common reason is that the code is maintained by a (non-human) developer, who wants to be able to see and update the code in a readable form, but that represents a single line that will be automatically wrapped by, for example, a UITextView for (human) consumption.
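
A sketch of that case, assuming a trailing backslash inside a triple-quoted literal elides the newline that follows it (the escape under discussion in this thread, which was later adopted as SE-0182). The text and constant names are invented, and the UITextView is left out so the snippet runs anywhere.

```swift
// One logical line, laid out over several source lines.
// Each trailing backslash removes the newline after it; the space
// before the backslash is kept as part of the string.
let helpText = """
    This paragraph is one logical line. \
    The view that displays it decides where to wrap, \
    so every line break in the source is escaped.
    """

// The value contains no newline, so a text view is free to wrap it.
let helpTextIsSingleLine = !helpText.contains(where: { $0.isNewline })
print(helpTextIsSingleLine)
```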

-- E

···

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:
Xiaodi, I think one thing you're neglecting is that users may never print out a multiline literal string at all. A string might never be printed or read by a human outside of the code it resides in. In this case it seems perfectly reasonable to ask that it be possible to format the string nicely in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen by a human but one cannot insert literal newlines and would need elided ones instead?


(Xiaodi Wu) #4

A different scenario from what Robert's describing, but sure. This goes to
my question to David Hart. Isn't this an argument for a feature to allow
breaking a single-line string literal across multiple lines? What makes
this a use case for some feature for _multiline_ string literals in
particular?

···

On Fri, Apr 21, 2017 at 1:45 PM, Erica Sadun <erica@ericasadun.com> wrote:

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:

Xiaodi, I think one thing you're neglecting is that users may never print
out a multiline literal string at all. A string might never be printed or
read by a human outside of the code it resides in. In this case it seems
perfectly reasonable to ask that it be possible to format the string nicely
in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen
by a human but one cannot insert literal newlines and would need elided
ones instead?

The most common reason is that the code is maintained by a (non-human)
developer, who wants to be able to see and update the code in a readable
form, but that represents a single line that will be automatically wrapped by,
for example, a UITextView for (human) consumption.


#5

It looks like we have different interpretations about what "literal" means (I think this may have been brought up in earlier messages in this thread; I don't remember the resolution). I interpreted it as meaning the same thing as literal in *LiteralConvertible, i.e., a Swift type that is written out in source. Multiline string literal would then refer to a multiline "piece of source code representing a String". It sounds like you are taking "literal" to mean something that *literally* (to the extent possible) represents its data, which in the case of String would mean writing out the source code exactly as the resulting String will appear. Up until now I think those two interpretations of "literal" were equivalent. For no type other than String is the physical layout in source related to the underlying data, and prior to this proposal the point was moot for String because there was only a single allowable layout (barring concatenation with +, which utilizes multiple independent literals).

So, my view of the goal of this proposal is to allow writing a StringLiteralType across multiple lines. It appears your view is to allow using multiple lines to *literally* represent a String's content.
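
To illustrate the first interpretation, here is a minimal sketch in which a multiline literal is simply source text that initializes a literal-convertible type. `SQLQuery` is a hypothetical type invented for this example; `ExpressibleByStringLiteral` is the current name of the protocol referred to above as *LiteralConvertible.

```swift
// Hypothetical type, used only to show that a multiline literal is
// "a piece of source code representing a String" like any other literal.
struct SQLQuery: ExpressibleByStringLiteral {
    let text: String

    init(stringLiteral value: String) {
        self.text = value
    }
}

// The literal spans several source lines; the type only ever sees the
// resulting String value, after indentation stripping.
let query: SQLQuery = """
    SELECT name
    FROM users
    """
print(query.text)
```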

Again, for editors that indent wrapped lines, disallowing manually breaking lines will actually introduce ambiguity into the multiline string, which runs counter to the goal of the proposal. Also, from an ideological standpoint I see no reason to disallow this feature.

-- Robert

···

On Apr 21, 2017, at 2:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Fri, Apr 21, 2017 at 1:45 PM, Erica Sadun <erica@ericasadun.com> wrote:

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:
Xiaodi, I think one thing you're neglecting is that users may never print out a multiline literal string at all. A string might never be printed or read by a human outside of the code it resides in. In this case it seems perfectly reasonable to ask that it be possible to format the string nicely in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen by a human but one cannot insert literal newlines and would need elided ones instead?

The most common reason is that the code is maintained by a (non-human) developer, who wants to be able to see and update the code in a readable form, but that represents a single line that will be automatically wrapped by, for example, a UITextView for (human) consumption.

A different scenario from what Robert's describing, but sure. This goes to my question to David Hart. Isn't this an argument for a feature to allow breaking a single-line string literal across multiple lines? What makes this a use case for some feature for _multiline_ string literals in particular?


(Brent Royal-Gordon) #6

Well, if you're breaking a string across several lines, you will want indentation stripping too. Are you suggesting we should also bring that feature to single-line string literals with escaped newlines?

···

On Apr 21, 2017, at 11:48 AM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

This goes to my question to David Hart. Isn't this an argument for a feature to allow breaking a single-line string literal across multiple lines? What makes this a use case for some feature for _multiline_ string literals in particular?

--
Brent Royal-Gordon
Architechies


(Thorsten Seitz) #7

I think „single-line“ and „multiline“ should foremost apply to the code representation of a string and not its result.
Otherwise "foo\nbar" would be a multiline string with your reasoning, wouldn’t you agree?

Therefore a multiline string is one which is written over several lines of *code* to make maintenance easier.
From that follows naturally that as soon as line breaks are introduced for hard wrapping we are talking about multiline strings.

In addition as soon as line breaks are introduced in the code the question of indentation arises which is solved neatly with the multiline string proposal by the position of the ending delimiter which is not possible with single-line strings.
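
A small sketch of that indentation rule as accepted in SE-0168; the function name and the markup are made up for illustration.

```swift
func makeSnippet() -> String {
    // The closing delimiter sits at 8 spaces of indentation, so 8 spaces
    // are stripped from every line of the literal. Anything beyond that
    // (the 4 extra spaces before <child/>) stays in the string.
    let snippet = """
        <root>
            <child/>
        </root>
        """
    return snippet
}

print(makeSnippet())
```

Moving the closing delimiter left or right changes how much leading whitespace ends up in the value, without touching the content lines.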

-Thorsten

···

On 21 Apr 2017, at 20:48, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Fri, Apr 21, 2017 at 1:45 PM, Erica Sadun <erica@ericasadun.com> wrote:

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:
Xiaodi, I think one thing you're neglecting is that users may never print out a multiline literal string at all. A string might never be printed or read by a human outside of the code it resides in. In this case it seems perfectly reasonable to ask that it be possible to format the string nicely in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen by a human but one cannot insert literal newlines and would need elided ones instead?

The most common reason is that the code is maintained by a (non-human) developer, who wants to be able to see and update the code in a readable form, but that represents a single line that will be automatically wrapped by, for example, a UITextView for (human) consumption.

A different scenario from what Robert's describing, but sure. This goes to my question to David Hart. Isn't this an argument for a feature to allow breaking a single-line string literal across multiple lines? What makes this a use case for some feature for _multiline_ string literals in particular?


(Xiaodi Wu) #8

It looks like we have different interpretations about what "literal" means
(I think this may have been brought up in earlier messages in this thread;
I don't remember the resolution). I interpreted it as meaning the same
thing as literal in *LiteralConvertible, i.e., a Swift type that is written
out in source. Multiline string literal would then refer to a multiline
"piece of source code representing a String". It sounds like you are taking
"literal" to mean something that *literally* (to the extent possible)
represents its data, which in the case of String would mean writing out the
source code exactly as the resulting String will appear. Up until now I
think those two interpretations of "literal" were equivalent. For no type
other than String is the physical layout in source related to the
underlying data, and prior to this proposal the point was moot for String
because there was only a single allowable layout (barring concatenation
with +, which utilizes multiple independent literals).

So, my view of the goal of this proposal is to allow writing a
StringLiteralType across multiple lines. It appears your view is to allow
using multiple lines to *literally* represent a String's content.

Yes, indeed, a very good summary. I think, if I read between the lines
correctly, the core team is intentionally making sure that these two views
of a literal will remain equivalent, based on which subset of the current
proposal has been accepted and which has been stricken. I'm arguing here
that we should continue down that road.

Again, for editors that indent wrapped lines, disallowing manually breaking
lines will actually introduce ambiguity into the multiline string, which
runs counter to the goal of the proposal. Also, from an ideological
standpoint I see no reason to disallow this feature.

I'm not supremely opposed to it either. The point of this exercise is to
tease out what exactly is accomplished by adding it and whether it's worth
the implementation effort and additional complexity in features. Surely,
there is more to be said for it than merely accommodating people who don't
like the look of soft-wrapped text.

The other point of the exercise is to discuss how it impacts our
conceptions of what a literal is and what it ought to be.

One additional aim here is to drive home the point that escaping line breaks
is not an issue inherent to _multiline_ strings, but even single-line
strings might need to be broken up into multiple lines of source code. So,
if we agree that escaping newlines is a feature that we want to have, a
design that addresses """this""" but not "this" is incomplete and
needlessly restrictive.

One final aim is to argue that whether or not this feature is added, it is
an orthogonal question to that of trailing whitespace at the end of lines
in a multiline literal. For the moment, that is the most salient part of
the conversation, given the thread that we're in.

-- Robert

···

On Fri, Apr 21, 2017 at 2:42 PM, Robert Bennett <rltbennett@icloud.com> wrote:

On Apr 21, 2017, at 2:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Fri, Apr 21, 2017 at 1:45 PM, Erica Sadun <erica@ericasadun.com> wrote:

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:

Xiaodi, I think one thing you're neglecting is that users may never
print out a multiline literal string at all. A string might never be
printed or read by a human outside of the code it resides in. In this case
it seems perfectly reasonable to ask that it be possible to format the
string nicely in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen
by a human but one cannot insert literal newlines and would need elided
ones instead?

The most common reason is that the code is maintained by a (non-human)
developer, who wants to be able to see and update the code in a readable
form, but that represents a single line that will be automatically wrapped by,
for example, a UITextView for (human) consumption.

A different scenario from what Robert's describing, but sure. This goes to
my question to David Hart. Isn't this an argument for a feature to allow
breaking a single-line string literal across multiple lines? What makes
this a use case for some feature for _multiline_ string literals in
particular?


(Xiaodi Wu) #9

Xiaodi, I think one thing you're neglecting is that users may never
print out a multiline literal string at all. A string might never be
printed or read by a human outside of the code it resides in. In this case
it seems perfectly reasonable to ask that it be possible to format the
string nicely in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen
by a human but one cannot insert literal newlines and would need elided
ones instead?

The most common reason is that the code is maintained by a (non-human)
developer, who wants to be able to see and update the code in a readable
form, but that represents a single line that will be automatically wrapped by,
for example, a UITextView for (human) consumption.

A different scenario from what Robert's describing, but sure. This goes to
my question to David Hart. Isn't this an argument for a feature to allow
breaking a single-line string literal across multiple lines? What makes
this a use case for some feature for _multiline_ string literals in
particular?

I think „single-line“ and „multiline“ should foremost apply to the code
representation of a string and not its result.
Otherwise "foo\nbar" would be a multiline string with your reasoning,
wouldn’t you agree?

I think Robert Bennett has summarized the differences between his view and
my view of literals very well. Keep in mind we're talking about _string
literals_, not strings. To me, a literal is something that represents its
data as literally as possible. Therefore, what makes something a
_multiline_ string literal is simple: it permits literal newlines.

···

On Sat, Apr 22, 2017 at 3:21 AM, Thorsten Seitz <tseitz42@icloud.com> wrote:

On 21 Apr 2017, at 20:48, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Apr 21, 2017 at 1:45 PM, Erica Sadun <erica@ericasadun.com> wrote:

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:

Therefore a multiline string is one which is written over several lines of
*code* to make maintenance easier.
From that follows naturally that as soon as line breaks are introduced for
hard wrapping we are talking about multiline strings.

In addition as soon as line breaks are introduced in the code the question
of indentation arises which is solved neatly with the multiline string
proposal by the position of the ending delimiter which is not possible with
single-line strings.

-Thorsten


(Xiaodi Wu) #10

This goes to my question to David Hart. Isn't this an argument for a
feature to allow breaking a single-line string literal across multiple
lines? What makes this a use case for some feature for _multiline_ string
literals in particular?

Well, if you're breaking a string across several lines, you will want
indentation stripping too. Are you suggesting we should also bring that
feature to single-line string literals with escaped newlines?

No, I am suggesting that whatever design is used for escaped newlines, if
at all possible it should be equally apt for "strings" and """strings"""
such that it will not require indentation stripping.

···

On Sat, Apr 22, 2017 at 3:38 AM, Brent Royal-Gordon <brent@architechies.com> wrote:

On Apr 21, 2017, at 11:48 AM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

--
Brent Royal-Gordon
Architechies


(David Hart) #11

This goes to my question to David Hart. Isn't this an argument for a feature to allow breaking a single-line string literal across multiple lines? What makes this a use case for some feature for _multiline_ string literals in particular?

Well, if you're breaking a string across several lines, you will want indentation stripping too. Are you suggesting we should also bring that feature to single-line string literals with escaped newlines?

I would say no to keep single-quotes String literals simple and to better distinguish between single and multiline syntaxes. Otherwise, the only difference between the two would be that one allows single and double quotes without escaping.

Therefore, we should try to make triple-quotes String literals support single-line-result strings split over multiple lines of code.

···

On 22 Apr 2017, at 10:38, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

On Apr 21, 2017, at 11:48 AM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

--
Brent Royal-Gordon
Architechies

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(David Hart) #12

Xiaodi, I think one thing you're neglecting is that users may never print out a multiline literal string at all. A string might never be printed or read by a human outside of the code it resides in. In this case it seems perfectly reasonable to ask that it be possible to format the string nicely in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen by a human but one cannot insert literal newlines and would need elided ones instead?

The most common reason is that the code is maintained by a (non-human) developer, who wants to be able to see and update the code in a readable form, but that represents a single line that will be automatically wrapped by, for example, a UITextView for (human) consumption.

A different scenario from what Robert's describing, but sure. This goes to my question to David Hart. Isn't this an argument for a feature to allow breaking a single-line string literal across multiple lines? What makes this a use case for some feature for _multiline_ string literals in particular?

I think „single-line“ and „multiline“ should foremost apply to the code representation of a string and not its result.
Otherwise "foo\nbar" would be a multiline string with your reasoning, wouldn’t you agree?

Therefore a multiline string is one which is written over several lines of *code* to make maintenance easier.
From that follows naturally that as soon as line breaks are introduced for hard wrapping we are talking about multiline strings.

In addition as soon as line breaks are introduced in the code the question of indentation arises which is solved neatly with the multiline string proposal by the position of the ending delimiter which is not possible with single-line strings.

+1 to this whole message

···

On 22 Apr 2017, at 10:21, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:

On 21 Apr 2017, at 20:48, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Apr 21, 2017 at 1:45 PM, Erica Sadun <erica@ericasadun.com> wrote:

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:

-Thorsten



(Thorsten Seitz) #13

Xiaodi, I think one thing you're neglecting is that users may never print out a multiline literal string at all. A string might never be printed or read by a human outside of the code it resides in. In this case it seems perfectly reasonable to ask that it be possible to format the string nicely in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen by a human but one cannot insert literal newlines and would need elided ones instead?

The most common reason is that the code is maintained by a (non-human) developer, who wants to be able to see and update the code in a readable form, but that represents a single line that will be automatically wrapped by, for example, a UITextView for (human) consumption.

A different scenario from what Robert's describing, but sure. This goes to my question to David Hart. Isn't this an argument for a feature to allow breaking a single-line string literal across multiple lines? What makes this a use case for some feature for _multiline_ string literals in particular?

I think „single-line“ and „multiline“ should foremost apply to the code representation of a string and not its result.
Otherwise "foo\nbar" would be a multiline string with your reasoning, wouldn’t you agree?

I think Robert Bennett has summarized the differences between his view and my view of literals very well. Keep in mind we're talking about _string literals_, not strings. To me, a literal is something that represents its data as literally as possible. Therefore, what makes something a _multiline_ string literal is simple: it permits literal newlines.

I know that we are talking about string literals, not strings, that’s why I talked about _representations_ vs. _results_.

A literal is a _textual representation_ of data as opposed to a calculation. Requiring for the special case of _string_ literals that the textual representation has to be as literal as possible is an artificial restriction that I do not share.
Following your argument you would have to prohibit `\t` and `\n` from multiline strings and `\t` from single-line strings because these characters can be written literally and therefore should (to be represented "as literally as possible").

So, to reiterate, the valuable distinction between single-line and multiline string literals is their textual representation and *not* whether the result of that has one or more lines. Multiline strings allow easier maintenance of _long_ strings (which may or may not have multiple lines).

And just as `\n` is allowed in single-line string literals resulting in a multiline string it would make sense to allow `\` in multiline string literals to suppress literal newlines.
The result would be
(a) single-line string literals which are always written in a single line but can represent single-line or multiline strings (just as today, using `\n`) and
(b) multiline string literals which are always written in multiple lines but can equally represent single-line (not possible today) or multiline strings (using literal newlines or `\n`).
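
The combinations in (a) and (b) can be sketched as follows, assuming the `\` escape proposed here is available in triple-quoted literals (it is in current Swift, via SE-0182). The helper `lineCount` and the constant names are made up for illustration.

```swift
// Counts the lines in the *resulting* string, not in the source.
func lineCount(_ s: String) -> Int {
    return s.split(separator: "\n", omittingEmptySubsequences: false).count
}

// (a) single-line literal, multiline result via \n
let singleLineLiteral = "first\nsecond"

// (b) multiline literal, multiline result via a literal newline
let multilineLiteral = """
    first
    second
    """

// (b) multiline literal, single-line result via an escaped newline
let escapedLiteral = """
    first \
    second
    """

print(lineCount(singleLineLiteral), lineCount(multilineLiteral), lineCount(escapedLiteral))
```

The first two literals produce the same two-line string; the third is written over two source lines but produces one line.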

A notable difference between single-line string literals and multiline string literals is that whitespace is not visible anymore at the end of each line (as has been pointed out several times by Adrian). Within a single-line string literal whitespace is visible everywhere within the string. Within a multiline string literal whitespace is visible at the beginning of each line due to the position of the closing delimiter and the corresponding indentation suppression, it is visible within each line but it is *not* visible at the end of a line. That would be fixed by suppressing trailing whitespace and only allowing it before a `\`.
This is similar to the normalization of the literal newline to `\n`.
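
Independent of whether that stripping rule is adopted, one existing way to keep a significant trailing space visible is to spell it as an escape. This is a workaround using the ordinary `\u{20}` escape, not the rule proposed above, and the constant name is made up.

```swift
// Each line ends in a space that matters. Written as \u{20}, the space
// is visible in the source and cannot be removed by an editor that
// strips trailing whitespace on save.
let formPrompts = """
    Name:\u{20}
    Age:\u{20}
    """

// Print with markers so the trailing space of the last line is visible.
print("[\(formPrompts)]")
```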

Pointing to tools or editor features to strip trailing whitespace is simply *wrong* IMHO because the idea of stripping trailing whitespace by editors or tools is only intended for *non-relevant* whitespace!
How often have you worked in teams where each developer has different settings for his editor (if they even use the same editor)? Relying on this will lead to stripping of relevant whitespace because someone has opened the file with the stripping setting on in his editor — and the problem is that this change is not even visible.

In another mail you pointed out that Unicode makes relying on visible characters difficult anyway. While that is true, I think there is a significant difference between characters that I _see_ which might not have the correct character code and characters that I _do not even see_.

-Thorsten

···

On 22 Apr 2017, at 17:08, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sat, Apr 22, 2017 at 3:21 AM, Thorsten Seitz <tseitz42@icloud.com> wrote:

On 21 Apr 2017, at 20:48, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Apr 21, 2017 at 1:45 PM, Erica Sadun <erica@ericasadun.com> wrote:

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:

Therefore a multiline string is one which is written over several lines of *code* to make maintenance easier.
From that follows naturally that as soon as line breaks are introduced for hard wrapping we are talking about multiline strings.

In addition as soon as line breaks are introduced in the code the question of indentation arises which is solved neatly with the multiline string proposal by the position of the ending delimiter which is not possible with single-line strings.

-Thorsten


#14

I'm not sure how we could implement breaking lines with \ for single line strings. Either indentation has to be stripped from the broken line, or the line must not be indented in which case nothing has been gained because soft wrap would accomplish the same thing. (Is there an option I'm missing?)

Now that we have """strings""", we could simply say: if you want to break a string over multiple lines, use a """string""" as a "string" does not permit this.

···

On Apr 22, 2017, at 11:12 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sat, Apr 22, 2017 at 3:38 AM, Brent Royal-Gordon <brent@architechies.com> wrote:

On Apr 21, 2017, at 11:48 AM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

This goes to my question to David Hart. Isn't this an argument for a feature to allow breaking a single-line string literal across multiple lines? What makes this a use case for some feature for _multiline_ string literals in particular?

Well, if you're breaking a string across several lines, you will want indentation stripping too. Are you suggesting we should also bring that feature to single-line string literals with escaped newlines?

No, I am suggesting that whatever design is used for escaped newlines, if at all possible it should be equally apt for "strings" and """strings""" such that it will not require indentation stripping.

--
Brent Royal-Gordon
Architechies


(Xiaodi Wu) #15

Xiaodi, I think one thing you're neglecting is that users may never
print out a multiline literal string at all. A string might never be
printed or read by a human outside of the code it resides in. In this case
it seems perfectly reasonable to ask that it be possible to format the
string nicely in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen
by a human but one cannot insert literal newlines and would need elided
ones instead?

The most common reason is that the code is maintained by a (non-human)
developer, who wants to be able to see and update the code in a readable
form, but that represents a single line that will be automatically wrapped by,
for example, a UITextView for (human) consumption.

A different scenario from what Robert's describing, but sure. This goes
to my question to David Hart. Isn't this an argument for a feature to allow
breaking a single-line string literal across multiple lines? What makes
this a use case for some feature for _multiline_ string literals in
particular?

I think „single-line“ and „multiline“ should foremost apply to the code
representation of a string and not its result.
Otherwise "foo\nbar" would be a multiline string with your reasoning,
wouldn’t you agree?

I think Robert Bennett has summarized the differences between his view and
my view of literals very well. Keep in mind we're talking about _string
literals_, not strings. To me, a literal is something that represents its
data as literally as possible. Therefore, what makes something a
_multiline_ string literal is simple: it permits literal newlines.

I know that we are talking about string literals, not strings, that’s why
I talked about _representations_ vs. _results_.

A literal is a _textual representation_ of data as opposed to a
calculation. Requiring for the special case of _string_ literals that the
textual representation has to be as literal as possible is an artificial
restriction that I do not share.
Following your argument you would have to prohibit `\t` and `\n` from
multiline strings and `\t` from single-line strings because these
characters can be written literally and therefore should (to be represented
"as literally as possible").

So, to reiterate, the valuable distinction between single-line and
multiline string literals is their textual representation and *not* whether
the result of that has one or more lines. Multiline strings allow easier
maintenance of _long_ strings (which may or may not have multiple lines).

I have never argued that the distinction between single-line and multiline
string literals is whether the resulting string has one or more lines. I
have argued that a multiline string literal is _a string literal which
permits literal newlines_.

And just as `\n` is allowed in single-line string literals resulting in a
multiline string it would make sense to allow `\` in multiline string
literals to suppress literal newlines.
The result would be
(a) single-line string literals which are always written in a single line
but can represent single-line or multiline strings (just as today, using
`\n`) and
(b) multiline string literals which are always written in multiple lines
but can equally represent single-line (not possible today) or multiline
strings (using literal newlines or `\n`).
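Editorial illustration of cases (a) and (b): a sketch of what the suggested trailing `\` looks like in practice. At the time of this thread it was only a suggestion; as far as I am aware, Swift 4.0 and later accept an escaped newline inside a """ literal, so the snippet assumes such a toolchain. The names `explicit` and `wrapped` are hypothetical.

```swift
// (a) A single-line literal whose value has two lines, via the `\n` escape.
let explicit = "line one\nline two"

// (b) A multi-line literal whose value has one line: the backslash at the
// end of the first line removes the line break that follows it.
// Indentation stripping still applies to the continuation line.
let wrapped = """
    This sentence is hard-wrapped in the source \
    but contains no newline in the value.
    """

print(explicit)
print(wrapped)
```

The space before the backslash is part of the value, which is why the two halves are joined by exactly one space.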

It _is_ currently possible for a multiline string literal to represent a
single-line string:

'''
This is a single-line string using multiline string literal syntax. You can
soft-wrap this as much as you want. An intelligent editor might even indent
the soft-wrapping in a pretty but unambiguous way.
'''

A notable difference between single-line string literals and multiline
string literals is that whitespace is not visible anymore at the end of
each line (as has been pointed out several times by Adrian). Within a
single-line string literal whitespace is visible everywhere within the
string. Within a multiline string literal whitespace is visible at the
beginning of each line due to the position of the closing delimiter and the
corresponding indentation suppression, it is visible within each line but
it is *not* visible at the end of a line. That would be fixed by
suppressing trailing whitespace and only allowing it before a `\`.
This is similar to the normalization of the literal newline to `\n`.
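Editorial illustration of the trailing-whitespace concern: the thread proposes protecting trailing whitespace with a `\`. A different technique, swapped in here because it works without any language change, is to spell the trailing space as a Unicode escape so it is visible and survives editors that strip trailing whitespace. This is a sketch, and the names `escaped` and `reference` are hypothetical.

```swift
// The trailing space after "name:" is written as \u{20}, so no literal
// whitespace sits at the end of the source line.
let escaped = """
    name:\u{20}
    value
    """

// The same value spelled as a single-line literal, for comparison.
let reference = "name: \nvalue"

print(escaped == reference)
```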

Pointing to tools or editor features to strip trailing whitespace is
simply *wrong* IMHO because the idea of stripping trailing whitespace by
editors or tools is only intended for *non-relevant* whitespace!
How often have you worked in teams where each developer has different
settings for his editor (if they even use the same editor)? Relying on this
will lead to stripping of relevant whitespace because someone has opened
the file with the stripping setting on in his editor — and the problem is
that this change is not even visible.

This argument is problematic. Is your position that literal trailing
whitespace in a multiline string literal is *relevant* or *non-relevant*?

If literal trailing whitespace in a multiline string literal is *relevant*,
then neither should your tools strip them away nor should the compiler
suppress them.
If literal trailing whitespace in a multiline string literal is
*irrelevant*, then it does not matter whether or not your tools strip them
away and it does not matter whether or not the compiler suppresses them.

In another mail you pointed out that Unicode makes relying on visible
characters difficult anyway. While that is true, I think there is a
significant difference between characters that I _see_ which might not have
the correct character code and characters that I _do not even see_.

In Unicode, there are many, many code points you do not see. These have
large security implications. It is simply false to say that stripping
trailing whitespace will allow you to see some representation of each code
point that is present in the string. It absolutely will not. It cannot be
possible that trailing spaces are a problem for you, but trailing
non-breaking spaces or trailing half-width spaces are not a problem.

-Thorsten

···

On Sat, Apr 22, 2017 at 12:57 PM, Thorsten Seitz <tseitz42@icloud.com> wrote:

On 22 Apr 2017, at 17:08, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sat, Apr 22, 2017 at 3:21 AM, Thorsten Seitz <tseitz42@icloud.com> wrote:

On 21 Apr 2017, at 20:48, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Apr 21, 2017 at 1:45 PM, Erica Sadun <erica@ericasadun.com> wrote:

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:

Therefore a multiline string is one which is written over several lines
of *code* to make maintenance easier.
From that follows naturally that as soon as line breaks are introduced
for hard wrapping we are talking about multiline strings.

In addition as soon as line breaks are introduced in the code the
question of indentation arises which is solved neatly with the multiline
string proposal by the position of the ending delimiter which is not
possible with single-line strings.

-Thorsten


(Thorsten Seitz) #16

Xiaodi, I think one thing you're neglecting is that users may never print out a multiline literal string at all. A string might never be printed or read by a human outside of the code it resides in. In this case it seems perfectly reasonable to ask that it be possible to format the string nicely in the code and disregard how it would actually be printed.

Can you give an example of such a use case, where a string is never seen by a human but one cannot insert literal newlines and would need elided ones instead?

The most common reason is that the code is maintained by a (non-human) developer, who wants to be able to see and update the code in a readable form, but that represents a single line that will be automatically wrapped by, for example, a UITextView for (human) consumption.

A different scenario from what Robert's describing, but sure. This goes to my question to David Hart. Isn't this an argument for a feature to allow breaking a single-line string literal across multiple lines? What makes this a use case for some feature for _multiline_ string literals in particular?

I think „single-line“ and „multiline“ should foremost apply to the code representation of a string and not its result.
Otherwise "foo\nbar" would be a multiline string with your reasoning, wouldn’t you agree?

I think Robert Bennett has summarized the differences between his view and my view of literals very well. Keep in mind we're talking about _string literals_, not strings. To me, a literal is something that represents its data as literally as possible. Therefore, what makes something a _multiline_ string literal is simple: it permits literal newlines.

I know that we are talking about string literals, not strings, that’s why I talked about _representations_ vs. _results_.

A literal is a _textual representation_ of data as opposed to a calculation. Requiring for the special case of _string_ literals that the textual representation has to be as literal as possible is an artificial restriction that I do not share.
Following your argument you would have to prohibit `\t` and `\n` from multiline strings and `\t` from single-line strings because these characters can be written literally and therefore should (to be represented "as literally as possible").

So, to reiterate, the valuable distinction between single-line and multiline string literals is their textual representation and *not* whether the result of that has one or more lines. Multiline strings allow easier maintenance of _long_ strings (which may or may not have multiple lines).

I have never argued that the distinction between single-line and multiline string literals is whether the resulting string has one or more lines. I have argued that a multiline string literal is _a string literal which permits literal newlines_.

And just as `\n` is allowed in single-line string literals resulting in a multiline string it would make sense to allow `\` in multiline string literals to suppress literal newlines.
The result would be
(a) single-line string literals which are always written in a single line but can represent single-line or multiline strings (just as today, using `\n`) and
(b) multiline string literals which are always written in multiple lines but can equally represent single-line (not possible today) or multiline strings (using literal newlines or `\n`).

It _is_ currently possible for a multiline string literal to represent a single-line string:

'''
This is a single-line string using multiline string literal syntax. You can soft-wrap this as much as you want. An intelligent editor might even indent the soft-wrapping in a pretty but unambiguous way.
'''

You seem to be choosing to deliberately ignore what I (and others) write, so I will tell you a last time that hard wrapping is important to maintenance of long literal strings, because soft-wrapping is no solution due to indentation and no, an intelligent editor is not a solution either because code is not only viewed in an intelligent editor.

A notable difference between single-line string literals and multiline string literals is that whitespace is not visible anymore at the end of each line (as has been pointed out several times by Adrian). Within a single-line string literal whitespace is visible everywhere within the string. Within a multiline string literal whitespace is visible at the beginning of each line due to the position of the closing delimiter and the corresponding indentation suppression, it is visible within each line but it is *not* visible at the end of a line. That would be fixed by suppressing trailing whitespace and only allowing it before a `\`.
This is similar to the normalization of the literal newline to `\n`.

Pointing to tools or editor features to strip trailing whitespace is simply *wrong* IMHO because the idea of stripping trailing whitespace by editors or tools is only intended for *non-relevant* whitespace!
How often have you worked in teams where each developer has different settings for his editor (if they even use the same editor)? Relying on this will lead to stripping of relevant whitespace because someone has opened the file with the stripping setting on in his editor — and the problem is that this change is not even visible.

This argument is problematic. Is your position that literal trailing whitespace in a multiline string literal is *relevant* or *non-relevant*?

Relevant, of course. That is why it is to be protected from tools and to be made visible.

If literal trailing whitespace in a multiline string literal is *relevant*, then neither should your tools strip them away nor should the compiler suppress them.

Yep, but tools commonly *do* offer to strip trailing whitespace because — up to now — trailing whitespace was irrelevant to code for all languages that I know.
That is exactly why relevant trailing whitespace has to be protected by a backslash like I explained at length. Effectively the trailing backslash marks which trailing whitespace is relevant by making it non-trailing :)

There is no way around it: tools expect trailing whitespace to be irrelevant.

If literal trailing whitespace in a multiline string literal is *irrelevant*, then it does not matter whether or not your tools strip them away and it does not matter whether or not the compiler suppresses them.

In another mail you pointed out that Unicode makes relying on visible characters difficult anyway. While that is true, I think there is a significant difference between characters that I _see_ which might not have the correct character code and characters that I _do not even see_.

In Unicode, there are many, many code points you do not see. These have large security implications. It is simply false to say that stripping trailing whitespace will allow you to see some representation of each code point that is present in the string. It absolutely will not. It cannot be possible that trailing spaces are a problem for you,

Yes it can. Curiously I didn’t have the need for non-breaking spaces or half-width spaces yet, but do commonly make use of spaces and tabs.

but trailing non-breaking spaces or trailing half-width spaces are not a problem.

The same problem and solution applies to trailing non-breaking spaces and half-width spaces. I might not easily see whether a space is breaking or non-breaking but at least I will see that there is one in the first place.

-Thorsten

···

On 22 Apr 2017, at 21:27, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sat, Apr 22, 2017 at 12:57 PM, Thorsten Seitz <tseitz42@icloud.com> wrote:

On 22 Apr 2017, at 17:08, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sat, Apr 22, 2017 at 3:21 AM, Thorsten Seitz <tseitz42@icloud.com> wrote:

On 21 Apr 2017, at 20:48, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Apr 21, 2017 at 1:45 PM, Erica Sadun <erica@ericasadun.com> wrote:

On Apr 21, 2017, at 12:40 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Fri, Apr 21, 2017 at 8:48 AM, Robert Bennett via swift-evolution <swift-evolution@swift.org> wrote:

-Thorsten

Therefore a multiline string is one which is written over several lines of *code* to make maintenance easier.
From that follows naturally that as soon as line breaks are introduced for hard wrapping we are talking about multiline strings.

In addition as soon as line breaks are introduced in the code the question of indentation arises which is solved neatly with the multiline string proposal by the position of the ending delimiter which is not possible with single-line strings.

-Thorsten


(Brent Royal-Gordon) #17

Could you share an example of such a design? It doesn't have to be something you'd be happy to have in the language; it just needs to fit the following criteria:

* Permits non-significant hard-wrapping in a string literal.

* Works equally well with single and triple string literals.

* Preserves code indentation, but does not require single string literals to do indentation stripping.

* Is not horribly inconvenient.

···

On Apr 22, 2017, at 8:12 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sat, Apr 22, 2017 at 3:38 AM, Brent Royal-Gordon <brent@architechies.com> wrote:

On Apr 21, 2017, at 11:48 AM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

This goes to my question to David Hart. Isn't this an argument for a feature to allow breaking a single-line string literal across multiple lines? What makes this a use case for some feature for _multiline_ string literals in particular?

Well, if you're breaking a string across several lines, you will want indentation stripping too. Are you suggesting we should also bring that feature to single-line string literals with escaped newlines?

No, I am suggesting that whatever design is used for escaped newlines, if at all possible it should be equally apt for "strings" and """strings""" such that it will not require indentation stripping.

--
Brent Royal-Gordon
Architechies


(Xiaodi Wu) #18

I'm not sure how we could implement breaking lines with \ for single line
strings. Either indentation has to be stripped from the broken line, or the
line must not be indented in which case nothing has been gained because
soft wrap would accomplish the same thing. (Is there an option I'm missing?)

Hence, perhaps, we could consider designs that involve breaking lines not a
syntax other than `\`?

···

On Sat, Apr 22, 2017 at 10:35 AM, Robert Bennett <rltbennett@icloud.com> wrote:

Now that we have """strings""", we could simply say: if you want to break
a string over multiple lines, use a """string""" as a "string" does not
permit this.

On Apr 22, 2017, at 11:12 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sat, Apr 22, 2017 at 3:38 AM, Brent Royal-Gordon <brent@architechies.com> wrote:

On Apr 21, 2017, at 11:48 AM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

This goes to my question to David Hart. Isn't this an argument for a
feature to allow breaking a single-line string literal across multiple
lines? What makes this a use case for some feature for _multiline_ string
literals in particular?

Well, if you're breaking a string across several lines, you will want
indentation stripping too. Are you suggesting we should also bring that
feature to single-line string literals with escaped newlines?

No, I am suggesting that whatever design is used for escaped newlines, if
at all possible it should be equally apt for "strings" and """strings"""
such that it will not require indentation stripping.

--
Brent Royal-Gordon
Architechies


(Xiaodi Wu) #19

I'm not sure how we could implement breaking lines with \ for single line
strings. Either indentation has to be stripped from the broken line, or the
line must not be indented in which case nothing has been gained because
soft wrap would accomplish the same thing. (Is there an option I'm missing?)

Hence, perhaps, we could consider designs that involve breaking lines not
a syntax other than `\`?

s/not/with/
(Hmm, I should really make sure my sentences are grammatical. Apologies.)

Now that we have """strings""", we could simply say: if you want to break a

···

On Sat, Apr 22, 2017 at 10:42 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sat, Apr 22, 2017 at 10:35 AM, Robert Bennett <rltbennett@icloud.com> wrote:

string over multiple lines, use a """string""" as a "string" does not
permit this.

On Apr 22, 2017, at 11:12 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sat, Apr 22, 2017 at 3:38 AM, Brent Royal-Gordon <brent@architechies.com> wrote:

On Apr 21, 2017, at 11:48 AM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

This goes to my question to David Hart. Isn't this an argument for a
feature to allow breaking a single-line string literal across multiple
lines? What makes this a use case for some feature for _multiline_ string
literals in particular?

Well, if you're breaking a string across several lines, you will want
indentation stripping too. Are you suggesting we should also bring that
feature to single-line string literals with escaped newlines?

No, I am suggesting that whatever design is used for escaped newlines, if
at all possible it should be equally apt for "strings" and """strings"""
such that it will not require indentation stripping.

--
Brent Royal-Gordon
Architechies


(Xiaodi Wu) #20

<snip>

And just as `\n` is allowed in single-line string literals resulting in a
multiline string it would make sense to allow `\` in multiline string
literals to suppress literal newlines.
The result would be
(a) single-line string literals which are always written in a single line
but can represent single-line or multiline strings (just as today, using
`\n`) and
(b) multiline string literals which are always written in multiple lines
but can equally represent single-line (not possible today) or multiline
strings (using literal newlines or `\n`).

It _is_ currently possible for a multiline string literal to represent a
single-line string:

'''
This is a single-line string using multiline string literal syntax. You
can soft-wrap this as much as you want. An intelligent editor might even
indent the soft-wrapping in a pretty but unambiguous way.
'''

You seem to be choosing to deliberately ignore what I (and others) write,
so I will tell you a last time that hard wrapping is important to
maintenance of long literal strings, because soft-wrapping is no solution
due to indentation and no, an intelligent editor is not a solution either
because code is not only viewed in an intelligent editor.

I don't doubt that you in particular do not like soft wrapping. But
repeating "hard wrapping is important" is not an argument; it's just an
assertion, and the salient questions are: _how_ important is it really?
What should be the design to support such a feature? Are the drawbacks of
the design outweighed by the degree of importance of hard wrapping?

Put simply, I am skeptical that hard wrapping is so important that it alone
is worth a complicated new set of rules, and no one has offered evidence to
the contrary.

A notable difference between single-line string literals and multiline
string literals is that whitespace is not visible anymore at the end of
each line (as has been pointed out several times by Adrian). Within a
single-line string literal whitespace is visible everywhere within the
string. Within a multiline string literal whitespace is visible at the
beginning of each line due to the position of the closing delimiter and the
corresponding indentation suppression, it is visible within each line but
it is *not* visible at the end of a line. That would be fixed by
suppressing trailing whitespace and only allowing it before a `\`.
This is similar to the normalization of the literal newline to `\n`.

Pointing to tools or editor features to strip trailing whitespace is
simply *wrong* IMHO because the idea of stripping trailing whitespace by
editors or tools is only intended for *non-relevant* whitespace!
How often have you worked in teams where each developer has different
settings for his editor (if they even use the same editor)? Relying on this
will lead to stripping of relevant whitespace because someone has opened
the file with the stripping setting on in his editor — and the problem is
that this change is not even visible.

This argument is problematic. Is your position that literal trailing
whitespace in a multiline string literal is *relevant* or *non-relevant*?

Relevant, of course. That is why it is to be protected from tools and to
be made visible.

If literal trailing whitespace in a multiline string literal is
*relevant*, then neither should your tools strip them away nor should the
compiler suppress them.

Yep, but tools commonly *do* offer to strip trailing whitespace because —
up to now — trailing whitespace was irrelevant to code for all languages
that I know.

Huh? This proposal is inspired by Python syntax; many other languages also
have multiline string literals. ES6 in particular comes to mind. None of
these strip trailing whitespace. Some tools may offer to strip that
whitespace, but I don't doubt that there are many tools which leave
multiline literals alone. What is so special about Swift that this is an
issue here where it has not been elsewhere?

That is exactly why relevant trailing whitespace has to be protected by a
backslash like I explained at length. Effectively the trailing backslash
marks which trailing whitespace is relevant by making it non-trailing :)

There is no way around it: tools expect trailing whitespace to be
irrelevant.

Again, this does not make sense. If trailing whitespace is _relevant_, then
why would tools consider it irrelevant? Why would users choose to use
tools that delete things that they consider relevant instead of using or
making tools that correctly consider them relevant? Also, if you believe
it to be important, why would you support Swift prohibiting its literal
use? Why wouldn't you enthusiastically support arbitrary trailing
whitespace and diligently file bugs against any tools that strip it out
of literals?

If literal trailing whitespace in a multiline string literal is
*irrelevant*, then it does not matter whether or not your tools strip them
away and it does not matter whether or not the compiler suppresses them.

In another mail you pointed out that Unicode makes relying on visible
characters difficult anyway. While that is true, I think there is a
significant difference between characters that I _see_ which might not have
the correct character code and characters that I _do not even see_.

In Unicode, there are many, many code points you do not see. These have
large security implications. It is simply false to say that stripping
trailing whitespace will allow you to see some representation of each code
point that is present in the string. It absolutely will not. It cannot be
possible that trailing spaces are a problem for you,

Yes it can. Curiously I didn’t have the need for non-breaking spaces or
half-width spaces yet, but do commonly make use of spaces and tabs.

but trailing non-breaking spaces or trailing half-width spaces are not a
problem.

The same problem and solution applies to trailing non-breaking spaces and
half-width spaces. I might not easily see whether a space is breaking or
non-breaking but at least I will see that there is one in the first place.

And what would you propose to do about invisible zero-width glyphs in
strings? There is no end to this. Unicode strings can have arbitrary
amounts of invisible data. You cannot determine the content of a string by
visual inspection. This is why I conclude that it should be an explicit
non-goal to make non-printing characters visible.
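Editorial illustration of the invisible-data point: a small sketch showing two strings that can render identically yet differ in content, and one way a tool might expose the difference by listing Unicode scalar values. The names `plain`, `sneaky` and `scalars` are hypothetical, and U+200B (zero-width space) is just one of many possible invisible code points.

```swift
let plain = "ab"
let sneaky = "a\u{200B}b"   // zero-width space between the two letters

// The two values are not equal even though they may render the same.
print(plain == sneaky)

// Listing scalar values makes the hidden code point visible.
let scalars = sneaky.unicodeScalars.map {
    "U+" + String($0.value, radix: 16, uppercase: true)
}
print(scalars)
```

Stripping trailing whitespace would not help here: the invisible scalar sits in the middle of the string, which is the argument being made above.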

Again, if this is a common issue, one would expect pervasive tooling to
print invisible characters. Either there is such tooling, in which case the
issue is adequately addressed, or there is no such tooling, which would be
evidence to me that it is not a pervasive problem. I simply see no
evidence, given the multitude of languages with similar features, that it
is an issue that requires enforcement in the language rules.

···

On Sat, Apr 22, 2017 at 3:58 PM, Thorsten Seitz <tseitz42@icloud.com> wrote: