At this point, we have at least five credible suggestions about how to handle multiline regexes. (NB: All of the below use the #/…/# syntax for any regex literal containing newlines, so this is not a delimiter question!)
It’s probably useful to summarize them, since the discussion has become quite tangled:
1. Use extended mode: all unescaped whitespace is ignored
Advantage: Allows traditional extended regexes, formatted for readability
Disadvantage: May be confusing when users encounter it, requires verbose manual escaping of spaces and/or a flag to disable warnings, and the “hello world” footgun still exists despite the warning
2. Remove leading and trailing whitespace (and comments) from each line
Advantage: Somewhat intuitive behavior
Disadvantage: Harms the ability to add internal whitespace for readability
3. Disallow multiline regex literals
Advantage: Encourages people to use the regex builder DSL, which has numerous readability advantages and requires no confusing new rules about whitespace
Disadvantage: Forces people to use the regex builder DSL, which is more verbose and in some cases clumsier, and (currently) discourages named captures
4. Treat multiline regexes as literal: whitespace, including newlines, is significant
Advantage: There is currently no other proposed facility for preserving literal newlines in a regex, which can be useful for matching large chunks of formatted text
Disadvantage: Interaction with surrounding code gets messy. (How does it handle indentation, for example? Is the rule the same as for multiline strings? What are the rules for a bare leading or trailing newline? Is all this really better than an explicit \n? etc.)
5. Combine 1+4: multiline regexes are literal by default (4), but some extra syntax enables extended mode where all unescaped whitespace is ignored (1)
Advantages: Covers all the bases, more or less
Disadvantages: Maximally confusing, may not actually carry its weight
6. Use #///…///# as a second, separate delimiter to enable extended mode, and either (6a) disallow multiline #/…/# or (6b) allow multiline #/…/# and have it treat whitespace as significant
Advantages: Might mitigate the “hello world” footgun, since it’s slightly less easy to accidentally enable, and the delimiter change could help signify that the meaning of whitespace changes
Disadvantages: May be excessive and unnecessary. Option 6b poses all the problems of Option 4 above.
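
To make the whitespace rules a bit more concrete, here is a rough sketch using Python's re module as a stand-in, since its re.VERBOSE flag is a traditional extended mode. This is only an analogy, not the proposed Swift semantics: the helper strip_line_edges is a hypothetical and deliberately naive approximation of Option 2 (it does not handle an escaped #), and the variable names are made up for illustration.

```python
import re

# A pattern formatted across lines, with a trailing comment.
PATTERN = """
    hello world   # meant to match the phrase "hello world"
"""

# Option 1 analogue: extended mode drops all unescaped whitespace and
# comments, so the pattern silently becomes "helloworld" (the footgun).
extended = re.compile(PATTERN, re.VERBOSE)

# In extended mode the space has to be escaped to stay significant.
escaped = re.compile(r"""
    hello\ world
""", re.VERBOSE)


def strip_line_edges(pattern: str) -> str:
    """Hypothetical Option 2 analogue: drop comments and the leading and
    trailing whitespace of each line, keeping interior whitespace."""
    pieces = []
    for line in pattern.splitlines():
        line = line.split("#", 1)[0]  # naive: ignores escaped '#'
        pieces.append(line.strip())
    return "".join(pieces)


per_line = re.compile(strip_line_edges(PATTERN))

# Option 4 analogue: the newline in the source is part of the pattern.
TWO_LINES = "hello\nworld"
literal = re.compile(TWO_LINES)
ignored = re.compile(TWO_LINES, re.VERBOSE)

print(bool(extended.fullmatch("helloworld")))    # True
print(bool(extended.fullmatch("hello world")))   # False
print(bool(escaped.fullmatch("hello world")))    # True
print(bool(per_line.fullmatch("hello world")))   # True
print(bool(literal.fullmatch("hello\nworld")))   # True
print(bool(ignored.fullmatch("helloworld")))     # True
```

The first two results are the “hello world” footgun in miniature: the pattern looks like it matches the phrase, but under extended rules it only matches the run-together form unless the space is escaped. The per-line variant keeps the interior space, which is exactly why it cannot also be used to add interior whitespace purely for readability.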