SE-0354: Regex Literals

I agree that a terse alternative to the DSL is necessary. I'm only arguing that if a regex is terse enough to write in regex syntax, it must be simple enough that a statically typed capture list is of negligible use, compared with what it offers for a more complex regex. But a more complex regex, as you've mentioned, is not going to be a good candidate for the terse syntax.

My argument is that those examples, where a simple one-liner would be better, would be just as readable with the Regex("...") syntax.
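For readers comparing the two spellings, here is a minimal sketch of the trade-off being argued, assuming a toolchain with regex support (Swift 5.7 or later). The pattern, the input string, and all variable names are made up for illustration; the extended #/.../# delimiter is used because bare /.../ may need to be enabled explicitly depending on the language mode.

```swift
let input = "10-20"

// String-based construction: the pattern is parsed when this line runs,
// so a malformed pattern throws at run time. Captures are type-erased.
let dynamicRegex = try Regex(#"(\d+)-(\d+)"#)   // Regex<AnyRegexOutput>

var dynamicLower: Substring? = nil
if let match = input.firstMatch(of: dynamicRegex) {
    // Captures are looked up by index and come back as optionals.
    dynamicLower = match.output[1].substring
}

// Literal: the pattern is parsed at compile time and the captures
// become a statically typed tuple.
let literalRegex = #/(\d+)-(\d+)/#   // Regex<(Substring, Substring, Substring)>

var literalBounds: (Substring, Substring)? = nil
if let match = input.firstMatch(of: literalRegex) {
    let (_, lower, upper) = match.output
    literalBounds = (lower, upper)
}
```

Whether the typed tuple is worth a dedicated literal for patterns this small is exactly the judgment call under discussion.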

That changes my opinion to a solid +1.

This is great and very encouraging!

It’d be really nice to be able to re-review now with the whole picture laid out containing all of these revisions, with a focus perhaps specifically on iteratively refining the rules to make them as simple-to-understand yet as useful (both for regexes and in preserving existing uses) as possible.

With the change around parentheses, yet another way to disambiguate prefix / becomes:

(/)(x)

where (/) transforms the operator to an unapplied function, and (x) calls that function. This is also backward compatible and should be applicable to even the most contrived examples, though I assume it would very rarely be needed.

This is a huge improvement, thanks. I agree with @xwu that we'd better do a focused re-review.

Just to provide a tiny data point: I do have a few regexes (single digit; not sure how to count, as some are related) that have an unbalanced ) in them.

EDIT: Let me clarify that none of those are in Swift code, and only a couple might end up in Swift code in my case.

Unbalanced unescaped closing parenthesis? The linked PR clarifies that escapes and custom character classes are taken into account.
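To make the distinction concrete, here is a small sketch of the two forms mentioned above in which a lone ) can appear in a pattern without being an unescaped, unbalanced parenthesis. The names and the sample text are hypothetical, and the extended #/.../# delimiter is used so the snippet does not depend on bare /.../ being enabled.

```swift
// An escaped closing parenthesis matches a literal ")".
let escapedParen = #/\)/#

// Inside a custom character class, ")" is also just a literal character.
let classParen = #/[)]/#

let text = "f(x)"
let foundEscaped = text.contains(escapedParen)
let foundInClass = text.contains(classParen)
```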

How would an unbalanced, unescaped ) appear in a regex? I thought it implies any unbalanced ), which would include escaped.

Exactly!

How would unbalanced \) appear outside of a regex or string literal?

This is fantastic! Thank you for putting in so much effort to prevent breaking the use of / in custom operators.

That's not the DSL. You're changing your argument.

My argument has been consistent across my messages. I argued that regex literals aren't worth their complexity, given a combination of a DSL and a string-based Regex parser.

I would like to hear constructive counter-arguments, please.

You wrote in your initial post:

That is not an argument that Regex plus result builders will cover uses of regex literals. That's an argument that not even Regex should be used for regex literals.

If regex literals are to be (type) checked by the compiler, it makes more sense to define a literal type than it does to embed special knowledge of a standard library type and one of its constructors (Regex and its string-literal initializer).

At the end of the day, though, whether regex literals carry their own weight is an opinion. You've stated your opinion. So far, no one agrees with you. There's no point to a drawn-out debate because we already know where the Core Team stands.

Please refer to this post: SE-0354: Regex Literals - #158 by Ben_Cohen

Making baseless assumptions about the popularity of the opinions of others and resorting to aggression due to somebody not agreeing with you is hardly going to help validate the point you're making.

In contrast to that, @Ben_Cohen has taken time to give a well-presented argument for why getting rid of a terse option would be a bad idea, which I agreed with, and I continued the discussion by outlining another solution to the problem of not having a terse alternative. I'm not presuming that replying to my post is on his list of priorities, so I'm not taking the silence as a "yes" or "no".

Changing your opinion during a constructive discussion is a side effect of rational argumentation. Getting aggressive and trying to undermine opposing opinions by insulting people is destructive to a healthy community.

Could /// only form a regex when it occurs (1) inside a function body or (2) in expression position? Doc comments would not normally occur in either of those positions, I think.

I sense impending doom for the dream of making /// parallel """, but I am a foolishly optimistic person by nature and have to ask.

I wouldn't even try to support ///. When I talk about unifying string and regex I'm not chasing the bare syntax rainbow. Regex literals would be raw-string only (prefixed by #, in all its variations), in keeping with the fact that they do not expand \ escapes or interpolations.

Yup, I have edited my post to clarify this.

This changes my review of bare /.../ to +1, since this removes (imo) the biggest costs associated with it, and I think the remaining costs are worth it.

I agree that this may benefit from some form of a new review; with so much of this thread focused on bare /.../ most other discussions get lost. But I don't have anything to add myself so I don't have a strong opinion.

Embedding in the DSL is the only use case that makes sense to me, because there the bare regex literal occupies a whole line. We already elide commas in DSLs, so why not also elide #s? It's contextual, easy to explain, and web searchable. Bare literals everywhere don't have a good rationale for me.
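As a sketch of the use case described here, this is roughly what a literal on its own line inside a builder looks like, assuming Swift 5.7 or later with RegexBuilder available. The pattern and the names are invented for illustration, and the literal is written with the #/.../# delimiter, since eliding the # inside the builder is the suggestion being made rather than something this snippet relies on.

```swift
import RegexBuilder

let idPattern = Regex {
    "id:"
    #/\s*/#            // a terse literal occupying its own line in the builder
    Capture {
        OneOrMore(.digit)
    }
}

let idMatch = "id: 42".firstMatch(of: idPattern)
let idDigits = idMatch?.output.1   // the captured digits, if any
```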

I would not be surprised if there are cases of people using /// instead of regular comments inside function bodies. That being said, at least those cases would be straightforward to fix. The main issue with restricting this to only within function bodies is that multi-line regex literals would then be unusable (or require additional #s) at the top level of a main.swift file or playground. And IMO it doesn't seem unreasonable to want to write documentation comments in those cases.

I don't think it's impossible to get /// working in most cases; we might be able to leverage the parser's isStartOfSwiftDecl heuristic to exclude most or all of the cases where you have a documentation comment. You then wouldn't be able to start a multi-line regex with a decl introducer keyword such as var, func, or class. However, it would add even more complexity to source tooling such as syntax highlighting. Additionally, it's possible the editor may no longer be able to automatically insert /// to continue a documentation comment, in case you are trying to write a regex.

Now, this might be a worthwhile tradeoff if it's felt that the delimiter is worth it, but I'm not entirely sure whether it is. While it does nicely mirror """, it doesn't have the same semantics due to whitespace being non-semantic. In that regard, it could be argued that it's more surprising than multi-line #/. It additionally doesn't have the same term-of-art or conciseness benefits that /.../ has.
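For reference, a minimal sketch of the multi-line #/ form being compared against, assuming Swift 5.7 or later. The pattern and names are illustrative only. Unlike in a """ string, whitespace in the pattern body is non-semantic and # starts an end-of-line comment.

```swift
let keyValue = #/
  (\w+)      # key
  \s* = \s*  # separator; the spaces around "=" in this pattern are ignored
  (\d+)      # value
/#

let keyValueMatch = "width = 42".firstMatch(of: keyValue)
let parsedKey = keyValueMatch?.output.1
let parsedValue = keyValueMatch?.output.2
```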

13 posts were merged into an existing topic: SE-0354 (Second Review): Regex Literals