SE-0355: Regex Syntax and Runtime Construction

You can work around it, poorly, by making sure prefix doesn't contain the subsequence \E and wrapping it in a \Q...\E.
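For illustration, here is a minimal sketch of that workaround, assuming the goal is to match a runtime-supplied prefix literally and capture whatever follows it. The names `quotedPrefixPattern` and `QuoteError` are hypothetical, not part of the proposal.

```swift
// Hypothetical error type for this sketch; not part of any proposed API.
enum QuoteError: Error {
    case containsQuoteTerminator
}

// Builds a pattern string that matches `prefix` literally, then captures the rest.
// Assumption: rejecting any prefix that contains \E is acceptable for the caller.
func quotedPrefixPattern(_ prefix: String) throws -> String {
    let terminator: String = "\\E"
    guard !prefix.contains(terminator) else {
        throw QuoteError.containsQuoteTerminator
    }
    // \Q...\E quotes everything in between, so "." in the prefix is not a wildcard.
    return "\\Q" + prefix + "\\E(.*)"
}

// Run-time construction yields an untyped Regex<AnyRegexOutput>.
let quotedRegex = try Regex(quotedPrefixPattern("a.b"))
if let match = "a.b-rest".wholeMatch(of: quotedRegex) {
    // Index 0 is the whole match; index 1 is the captured suffix.
    print(match.output[1].substring ?? "")
}
```

The "poorly" part is visible in the guard: a prefix that legitimately contains \E has to be rejected rather than escaped.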

I think the better general solution is to support regex interpolations, which is future work.


My understanding (please correct me if I'm wrong, @rxwei @nnnnnnnn) is that this can be written with the soon-to-be-revised DSL's mapOutput:

func buildPrefixExpression(_ prefixStr: String) throws -> Regex<(Substring, suffix: Substring)> {
    Regex {
        prefixStr
        Capture { /.*/ }
    }.mapOutput {
        ($0, suffix: $1)
    }
}

That's right — this kind of control over composition is one of the primary motivations for creating the RegexBuilder approach to building regexes.
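As a rough sketch of that composition, here is a variant that leaves out mapOutput, since its final shape was still being revised at the time of this thread. It assumes only that a String used inside a RegexBuilder block matches literally. The name `prefixThenRest` is hypothetical, and the suffix is named at the call site rather than in the output type.

```swift
import RegexBuilder

// Hypothetical helper: matches `prefix` verbatim, then captures the remainder.
func prefixThenRest(_ prefix: String) -> Regex<(Substring, Substring)> {
    Regex {
        prefix                  // a String component is matched literally, not parsed as regex syntax
        Capture { #/.*/# }      // extended delimiters avoid depending on the bare-slash literal setting
    }
}

if let match = "a.b-rest".wholeMatch(of: prefixThenRest("a.b")) {
    // The output tuple is (whole match, captured suffix).
    let (whole, suffix) = match.output
    print(whole, suffix)
}
```

Because the prefix is a builder component rather than text spliced into a pattern string, no escaping or \Q...\E quoting is needed.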

That's fantastic!

.mapOutput(..) will be a heavy hitter for loads of regex code for sure.


I fully support the work being done here, and it looks really good, but I can't vote on it. I can't provide valid, detailed feedback on the proposal because I have limited exposure to, and no real use case for, many of the advanced and somewhat problematic corners of regex syntax, the unification effort, and full Unicode adoption. A huge amount of work has been done, but I am afraid it might be too soon to commit to this at the standard library level and make it subject to source-break rules.

I didn't get a chance to read the responses, so please accept my apology if this question is a duplicate:

If this proposal is accepted and released, are we going to be locked out of breaking changes to the syntax until Swift 7? Strings with this literal syntax might be stored externally. If we do make a breaking change, will the compiler and Xcode be able to help migrate the existing strings, especially if we create the string at runtime from literal string fragments plus dynamic runtime information (such as a user-provided word to match)?

No. There are several mechanisms available that could assist us in doing a migration if we had to (though I don't think that we will). The first one that came to mind is that rather than migrate existing strings, we would continue to support the existing syntax via a labeled Regex(swift5_7syntax: pattern) or similar, and migrate existing unlabeled inits to that via tooling. I can think of a few other ways to address it as well, so I do not believe we would have painted ourselves into a corner.


Good to hear. How about the ABI?

We'd be able to do a similar thing at the ABI level so that already-compiled code continued to see the same behavior.


Great. In that case I am fully +1 on this.