[Pitch] Regex builder DSL

rxwei · March 18, 2022, 11:54pm

I'd say an array literal is very likely to be confused with concatenation, despite its similarity to textual regexes. I think we could fold it under ChoiceOf instead. But as mentioned earlier, character classes are out of scope for this proposal.
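To make the concatenation-versus-alternation distinction concrete, here is a minimal sketch. It assumes the RegexBuilder module as it later shipped in Swift 5.7, which may differ in detail from the API under discussion in this pitch. The names `concatenated` and `alternated` are purely illustrative.

```swift
import RegexBuilder

// Listing components in a builder block concatenates them:
// this matches "c", then "a", then "t".
let concatenated = Regex {
  "c"
  "a"
  "t"
}

// Wrapping the same components in ChoiceOf makes them alternatives:
// this matches exactly one of "c", "a", or "t".
let alternated = Regex {
  ChoiceOf {
    "c"
    "a"
    "t"
  }
}

let concatMatchesCat = "cat".wholeMatch(of: concatenated) != nil
let choiceMatchesCat = "cat".wholeMatch(of: alternated) != nil
let choiceMatchesA = "a".wholeMatch(of: alternated) != nil
```

The same three components read very differently depending on the enclosing block, which is why an explicit ChoiceOf is arguably harder to misread than a bare list.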

A map would have very different semantics. With map, you would typically expect to transform the entire Output generic parameter. Capture's transform, however, applies only to the captured subpattern. More concretely:

// Non-nested capture
Capture {
  OneOrMore(.word)
} transform: { /* transform the captured content to T */ }
// => Capture<(Substring, T)>

// Nested capture
Capture {
  Capture {
    OneOrMore(.word)
  } transform: { /* transform the inner captured content to U */ }
  OneOrMore(.word)
} transform: { /* transform the outer captured content to T; the inner capture U is left untouched */ }
// => Capture<(Substring, T, U)>

Of course, we could define a mapFirstCapture(_:), overload it 10 times, and have it transform the first capture (T in the example above), but I think that would be less clear than having a transform: parameter.
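For a runnable version of the nested case, the sketch below assumes the RegexBuilder API as it shipped in Swift 5.7, where the resulting type is spelled `Regex<(Substring, T, U)>` rather than the `Capture<...>` shown in the pitch. The pattern and the name `nested` are illustrative and not taken from the proposal.

```swift
import RegexBuilder

// Each transform: rewrites only the value of its own capture.
let nested = Regex {
  Capture {
    Capture {
      OneOrMore(.digit)
    } transform: { (digits: Substring) -> Int in
      // Inner capture: Substring -> Int (falls back to 0 on overflow).
      Int(digits) ?? 0
    }
    "-"
    OneOrMore(.word)
  } transform: { (whole: Substring) -> String in
    // Outer capture: Substring -> String. It receives the full text of
    // the outer capture, but it cannot alter the inner capture's Int.
    whole.uppercased()
  }
}
// Output type: (Substring, String, Int), with the outer capture first.

let nestedMatch = "42-abc".wholeMatch(of: nested)
```

Under these assumptions, `nestedMatch?.output` holds the whole match, the transformed outer capture, and the transformed inner capture as three separate tuple elements. A single map over Output would instead have to handle the whole tuple at once.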

TryCapture causes matching to fail if the transform closure fails, whereas Capture always succeeds. I think it comes down to clarity, and I don't feel too strongly about it either way.
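The difference can be sketched as follows, again assuming the RegexBuilder API as it shipped in Swift 5.7, where a TryCapture transform signals failure by returning nil. The names `lenient` and `strict` are illustrative.

```swift
import RegexBuilder

// Capture: the transform's result is stored as-is, so a nil result
// becomes a captured Int? and the match still succeeds.
let lenient = Regex {
  Capture {
    OneOrMore(.word)
  } transform: { (text: Substring) -> Int? in
    Int(text)
  }
}
// Output type: (Substring, Int?)

// TryCapture: a nil result from the transform makes the match fail,
// so the captured type is the non-optional Int.
let strict = Regex {
  TryCapture {
    OneOrMore(.word)
  } transform: { (text: Substring) -> Int? in
    Int(text)
  }
}
// Output type: (Substring, Int)

let lenientOnLetters = "abc".wholeMatch(of: lenient)  // matches; capture is nil
let strictOnLetters = "abc".wholeMatch(of: strict)    // no match
let strictOnDigits = "123".wholeMatch(of: strict)     // matches; capture is 123
```

With the same pattern and the same closure, only the choice between Capture and TryCapture decides whether a failed conversion rejects the input.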