Speaking only for myself here:
Heredocs are basically an alternate multiline syntax. There are a few of these we could use (for instance, we could allow you to put more than three " characters at the beginning and then match that number at the end), and they all share similar flaws.
Let's think about the problems in a string that might make you want a raw syntax, and see how well a second multiline syntax would help with those:
Your string contains backslashes: An alternate multiline syntax wouldn't handle backslashes any differently, so it wouldn't help with these.
Your single-line string contains " characters: Transforming it into a multiline string is kind of disruptive to your code, especially if the string is short; you might turn a one-line expression into a three-line expression for a ten-character literal. But even if you were willing to do that, well, you could just use the existing multiline string literal syntax for that!
Your multiline string contains """ sequences: Sure, it would be useful here. But that basically means the second multiline syntax is only necessary when you're generating Swift or Python code which itself contains multiline string literals. That's a bit niche.
Your single-line string contains """ sequences: That's really niche, and it falls into the same "three lines of code for a ten-character literal" problem mentioned earlier.
So a second multiline syntax doesn't cover many of the use cases we care about. At the same time, it also redundantly covers many use cases covered by our existing multiline syntax; that's not great because it causes confusion about which one a user should choose. The few use cases it really does help with (strings containing """) probably don't justify adding such a large feature if it doesn't help us with anything else.
Beyond the general problems with a second multiline syntax, heredocs tend to be more difficult for tools—particularly multi-language syntax highlighters which don't integrate with their compilers—to handle correctly than other multiline syntaxes. But that's not the biggest problem; the generic problems with second multiline syntaxes are.
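To make the backslash and quote cases above concrete, here's a small sketch using only the existing multiline literal. The sample strings are made up for illustration; the point is just that quotes already work unescaped there, while backslashes are still escape introducers and would be under any second multiline syntax too.

```swift
// Quotes need no escaping inside the existing multiline literal,
// at the cost of three lines of code for a short string.
let quoted = """
    She said "hi"
    """

// Backslashes are still processed as escapes, so they must be doubled.
// A second multiline syntax would not change this.
let path = """
    C:\\temp\\new
    """

print(quoted)
print(path)
```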
Here are the reasons for that decision (from my perspective):
I don't think (e.g.) 3# will stand out as prominently in a noisy string as
The leading delimiter would be #3" and the trailing delimiter would be "3#. Would a backslash escape be \3#? Will people remember which one it is?
By that standard, every raw string feature in every language you've ever used is a lie, because you always need to make sure the delimiter is not present in the string. For example, if we had "truly raw" strings delimited by ', you would need to make sure your string didn't contain any single quotes. If they were delimited by an arbitrary user-specified string, you would have to generate delimiters and check them against the string until you found one that wasn't present. Short of __DATA__ or a length-prefixed literal format, all raw strings are a fiction.
The difference between this proposal and a "true raw string" proposal is that, in addition to checking for "# before using #", you also need to check for \#. If you're generating code and you don't care if it uses slightly more escaping than necessary, you can just check for #. If not, checking for two sequences is not much more burdensome than checking for one, and I think that the extra feature set it unlocks is worth that cost.
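For a code generator, the two checks described above might look something like this. It's only a sketch: the function names are hypothetical, it assumes the #"..."# delimiters with \# escapes discussed in this thread, generalized to any number of # marks, and it ignores other constraints such as newlines in a single-line literal. The literals in the sketch itself use ordinary escaping so it doesn't depend on the proposed syntax.

```swift
import Foundation

/// Smallest number of `#` marks for which neither the closing delimiter
/// (`"` followed by that many `#`) nor the escape introducer
/// (`\` followed by that many `#`) occurs in `content`.
func minimalPoundCount(for content: String) -> Int {
    var count = 0
    while true {
        let pounds = String(repeating: "#", count: count)
        // Two checks per candidate: the terminator and the escape sequence.
        if !content.contains("\"" + pounds) && !content.contains("\\" + pounds) {
            return count
        }
        count += 1
    }
}

/// The "just check for #" shortcut: one more `#` than the longest run
/// in the content. Never too few, sometimes more than strictly needed.
func conservativePoundCount(for content: String) -> Int {
    var longest = 0
    var current = 0
    for character in content {
        current = (character == "#") ? current + 1 : 0
        longest = max(longest, current)
    }
    return longest + 1
}

/// Wraps `content` in quotes and the given number of `#` marks.
func rawLiteral(_ content: String, pounds: Int) -> String {
    let marks = String(repeating: "#", count: pounds)
    return marks + "\"" + content + "\"" + marks
}

let sample = "say \"hi\""
print(rawLiteral(sample, pounds: minimalPoundCount(for: sample)))
```

The first function does the two-sequence check and the second is the cheaper single check; the loop always terminates because the candidate sequences eventually become longer than the content itself.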