Declarative Regex using ResultBuilder

Hello, Swift community!

Now that Swift 5.4 is released, I would like to share what I have built with ResultBuilder, both to understand how the SwiftUI DSL works and to ease my pain with regex :grinning:

Most of us Swift developers do not write regular expressions on a day-to-day basis. Each time we have to, we fall back on web searches and old documentation. Then we have to deal with the lack of safety and go through many trial runs before getting the expected result. It feels like a big step back from what we are used to when coding in a modern language such as Swift.

This is where SwiftRegexDSL comes in: a Declarative Structured Language for regular expressions. The DSL provides readable expressions that are far better suited to composition and control flow, and it brings compile-time checks. To summarise, fewer headaches with regex!

struct ThisIsARegex: Regex {
  let shouldMatchLine: Bool

  var body: Regex {
    "Hello"
    WhiteSpace()
    "World,"
    if shouldMatchLine {
      Line()
    }
    AnyCharacter()
      .oneOrMore()
  }
}

let regex = ThisIsARegex(shouldMatchLine: false)
"Hello World, how...".match(regex) // true

I have something similar for JSON in mind. Just out of curiosity, have you explored what the advantages and disadvantages would be if Regex had an associatedtype Body: Regex? :thinking:

Welcome @JeremyMarchand and thank you for sharing!

Having special characters as public structs isn't ideal with regard to scope pollution (SwiftRegexDSL.Form may clash with SwiftUI.Form, for example). Furthermore, you only really need WhiteSpace() or Line() inside the body of a Regex-conforming type. As a suggestion, have you considered exposing them as properties on the protocol instead?

extension Regex {
  var whitespace:   some Regex { UnsafeText(#"\s"#) }
  var newLine:      some Regex { UnsafeText(#"\n"#) }
  var anyCharacter: some Regex { UnsafeText(#"."#) }
}

This way you avoid the pollution (whitespace and newLine are only available in the struct scope) and you still get autocompletion (typing self.w...):

struct ThisIsARegex: Regex {
  let shouldMatchLine: Bool

  var body: some Regex {
    "Hello"
    whitespace
    "World,"
    if shouldMatchLine {
      newLine
    }
    anyCharacter
      .oneOrMore()
  }
}

Nit: all the body properties in your code base have type Regex, which is a protocol type and thus instantiates an existential container. some Regex would be a better choice in terms of performance :slight_smile:
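
To make the difference concrete, here is a minimal sketch. It assumes a hypothetical Pattern protocol and Literal type as stand-ins; neither is part of SwiftRegexDSL. The first function returns an existential (the value is boxed and used through the protocol), the second returns an opaque type (the compiler still knows the concrete type behind it):

```swift
// Hypothetical stand-in for the library's Regex protocol.
protocol Pattern {
    var source: String { get }
}

// Hypothetical concrete type holding a raw regex string.
struct Literal: Pattern {
    let source: String
}

// Existential return type: the result is boxed in an existential container
// and its members are reached through dynamic dispatch.
func existentialWhitespace() -> Pattern {
    Literal(source: #"\s"#)
}

// Opaque return type: callers only see "some Pattern", but the compiler
// knows it is a Literal and can avoid the box.
func opaqueWhitespace() -> some Pattern {
    Literal(source: #"\s"#)
}

print(existentialWhitespace().source) // \s
print(opaqueWhitespace().source)      // \s
```

Both calls produce the same string; only the way the value is stored and dispatched differs.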

Yes, I thought about it and chose the current approach for simplicity of implementation; associated types can lead to a very verbose implementation. But I think it could be interesting, as it would work better for modifiers and extensions: the specific modifiers or implementations attached to a regex would remain accessible when it is wrapped in another Regex. The code for quantifiers and groups could be improved this way, for instance. I will probably make these (breaking) changes.

edit:
The main advantage of not using an associated type is that the builder only needs one buildBlock method and the regex is not limited in size. With associated types, you need to define multiple buildBlock methods, like ViewBuilder does (see all the generic buildBlock overloads they have made).
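
As a rough sketch of the single-buildBlock approach: the Pattern protocol, Fragment type and PatternBuilder below are hypothetical names for illustration, not the library's actual code. Because every child is erased to the protocol type, one variadic buildBlock accepts any number of children:

```swift
// Hypothetical stand-ins; none of these names come from SwiftRegexDSL.
protocol Pattern {
    var source: String { get }
}

struct Fragment: Pattern {
    let source: String
}

@resultBuilder
enum PatternBuilder {
    // Lets bare string literals appear in a builder body.
    static func buildExpression(_ text: String) -> Pattern { Fragment(source: text) }
    static func buildExpression(_ pattern: Pattern) -> Pattern { pattern }

    // One variadic method covers every arity, since all children
    // share the same (existential) type.
    static func buildBlock(_ parts: Pattern...) -> Pattern {
        Fragment(source: parts.map { $0.source }.joined())
    }

    // Supports `if` without `else`: a missing branch contributes nothing.
    static func buildOptional(_ part: Pattern?) -> Pattern {
        part ?? Fragment(source: "")
    }
}

@PatternBuilder
func greeting(withComma: Bool) -> Pattern {
    "Hello"
    Fragment(source: #"\s"#)
    "World"
    if withComma {
        ","
    }
}

print(greeting(withComma: true).source)  // Hello\sWorld,
print(greeting(withComma: false).source) // Hello\sWorld
```

With an associated type, each child would keep its own static type, so the builder could no longer collect them in a single variadic parameter of one type.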

@xAlien95 Thank you.
I defined them as structs to keep the DSL consistent, rather than mixing lowercase variables with regex structs that may or may not take parameters. Mixing them would give something like this, which I feel lacks homogeneity:

struct ARegex: Regex {
    var body: some Regex {
        "Hello"
        whitespace
        OtherRegex()
        Group {
            anyCharacter
            ChildRegex(param: true)
        }
    }
}

Every component should be understood as a Regex.

I understand the point about scope pollution; Text definitely collides. But I made the same choice as SwiftUI by relying on the package/framework namespace. I see a regex as something like a view that you define in its own file, with little mixing of imports between your app layers (not importing SwiftUI in the same file).
It also allows defining a Regex outside a Regex type, as a variable, without needing to specify the namespace.

@ViewBuilder
var someVar: some Regex {
....
}

I am curious about the performance point regarding associated types and existential containers. I will look into it. :+1:

Just one small note. I don't know how you implemented Group, but in SwiftUI it serves several purposes:

  • re-applying an outer modifier to each of its children
  • working around the builder's limit of 10 children per block
  • injecting the Group's children into the Group's parent view

Could you show the console output of print(regex)?
It should be easy to read, right? (Ideally, it should contain a human-readable presentation as well as the regex itself.)

@DevAndArtist Since Group is a fundamental component of the regex language (Group, Capture, ...), my implementation is not the same as the SwiftUI Group at all. But if I switch to associated types (currently WIP/in test), I will definitely need the equivalent of a SwiftUI Group. Maybe introducing something called a VirtualGroup will work. Applying the modifier to each element is nice to have, but I don't feel it is that useful in a regex context. If I implement it, I just need to make sure it does not conflict with the behavior I expect from custom regex structs: the body should be automatically wrapped in a group so that the modifier applies to the whole regex and not to each element.
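
A minimal sketch of why that wrapping matters, assuming a hypothetical Pattern protocol and Fragment type (not the library's actual API). The quantifier wraps the whole source in a non-capturing group first; appending + directly would only quantify the last character:

```swift
import Foundation

// Hypothetical stand-ins for illustration only.
protocol Pattern {
    var source: String { get }
}

struct Fragment: Pattern {
    let source: String
}

extension Pattern {
    // Wrap the whole pattern in a non-capturing group, then quantify it.
    func oneOrMore() -> Fragment {
        Fragment(source: "(?:\(source))+")
    }
}

// Helper: true when the pattern matches the entire text.
func fullyMatches(_ text: String, _ pattern: String) -> Bool {
    text.range(of: "^\(pattern)$", options: .regularExpression) != nil
}

let quantified = Fragment(source: "ab").oneOrMore()
print(quantified.source)                       // (?:ab)+
print(fullyMatches("abab", quantified.source)) // true
print(fullyMatches("abab", "ab+"))             // false: + applies to "b" only
```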

@lolgear This is something I should work on; I will include the type name and the produced regex string in the coming days. For print, I think it is better not to go further than that. But for a full dump, I am considering reproducing the full body structure.
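
One possible shape for that print output, sketched with a hypothetical Pattern protocol rather than the library's real types: a default description that combines the type name with the produced regex string.

```swift
// Hypothetical stand-in protocol; the real library may expose this differently.
protocol Pattern: CustomStringConvertible {
    var source: String { get }
}

extension Pattern {
    // Default description: "<type name>: <regex string>".
    var description: String {
        "\(type(of: self)): \(source)"
    }
}

struct HelloWorld: Pattern {
    let source = #"Hello\sWorld"#
}

print(HelloWorld()) // HelloWorld: Hello\sWorld
```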