Let’s chat first-class regular expressions

Now that Swift is rounding the corner of ABI stability, how does the community feel about beginning the work of adding first-class regular expressions to the language? Is this the right time? What are the implementation challenges that await us? What are some regex APIs that inspire you? What are the APIs that make you cringe?

You might want to check this thread out: String Consumption

Thanks for the link @griotspeak. It’s exciting to discover that a well-framed and thoughtful discussion is already under way.

My dream, beyond what the linked thread is about, is the ability to combine the opposite of StringInterpolation, regexes, and value initialization into one pattern. A simple version, given some type:

struct Coordinate {
    let x: Double
    let y: Double
}

to be able to say:

let string = "(3.422232, 5.23552)"
let coordinate: Coordinate = string.extracting("(\.y, \.x)")

or something along those lines (I haven't actually designed the syntax here). Basically, given an input string, being able to describe how to initialize a value from that string. More advanced uses with regexes could also be explored where you can extract a list of matches from a string using a regex and then easily map those values into other types. But being able to consume a string without having to write any of the parsing code, just writing how the string maps to values, would be very powerful.
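For comparison, here is a minimal sketch of what that extraction takes today with Foundation's NSRegularExpression. The pattern and the failable initializer `init?(parsing:)` are assumptions made up for illustration; they are not a proposed API, and the pattern only accepts plain decimal numbers.

```swift
import Foundation

struct Coordinate {
    let x: Double
    let y: Double
}

extension Coordinate {
    // Hypothetical initializer, for illustration only: parses "(x, y)".
    init?(parsing string: String) {
        // Two capture groups, each an optionally signed decimal number.
        let pattern = #"^\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)$"#
        guard
            let regex = try? NSRegularExpression(pattern: pattern),
            let match = regex.firstMatch(
                in: string,
                range: NSRange(string.startIndex..., in: string)),
            // Convert each NSRange back into a Range<String.Index>.
            let xRange = Range(match.range(at: 1), in: string),
            let yRange = Range(match.range(at: 2), in: string),
            let x = Double(string[xRange]),
            let y = Double(string[yRange])
        else { return nil }
        self.init(x: x, y: y)
    }
}

let string = "(3.422232, 5.23552)"
if let coordinate = Coordinate(parsing: string) {
    print(coordinate.x, coordinate.y)
}
```

Everything between the pattern and `self.init` is plumbing: range conversions and string-to-number parsing. That plumbing is what a declarative "describe the mapping" syntax would presumably hide.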

That would be the PEGs section.

Thanks for sharing that tidbit, @Michael_Ilseman. It may help future readers to know that, as best I understand it, the PEG sections in both of your posts discuss only the theoretical advantages (and disadvantages) of PEGs versus regexes, without any sample syntax. That said, I have enjoyed reading everything you've written on the topic. It's given me a lot to think about! :)

FWIW, I found this JavaScript-based PEG library to be an excellent primer on the subject. For a library in another language, this website can surely point you in the right direction.

I think it would be too early and distracting to start putting in straw-man PEG syntax. Even for regexes, I intentionally avoided that this time around, but you can imagine something similar to what I posted last time.

Some links that were listed in String Consumption, but I'll highlight them again:

Perl 6's Grammars, which demonstrate a language-level integration.

Pegged for D, which demonstrates a library approach with compile-time integration and automatic cycle breaking for left-recursion. I like linking to that library because, as with RE2, the tutorial/guide is excellent.

A paper that your link doesn't mention, but that I think is interesting, is Parsing Expression Grammars Made Practical, which uses annotations on expressions to make left-recursion, associativity, etc., possible and much easier to express. Manually doing sequences, associativity, and precedence in the grammar can be obnoxious and makes it harder to reason through. When you throw in error handling (ideally through explicit error nodes and some parse-state signaling), grammars can grow complex without these simple helpers.
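To make the "manual" part concrete, here is a rough sketch of a hand-written, PEG-style recursive-descent parser for arithmetic. It is not Pegged's API or the paper's annotation scheme; the type `ArithmeticParser` and all of its methods are hypothetical names. Precedence is encoded by layering rules (`Expr` over `Term` over `Factor`), and left-associativity by repetition instead of left recursion.

```swift
// Grammar (PEG), with precedence and associativity encoded by hand:
//   Expr   <- Term   (('+' / '-') Term)*
//   Term   <- Factor (('*' / '/') Factor)*
//   Factor <- Number / '(' Expr ')'
// All names below are hypothetical, for illustration only.
struct ArithmeticParser {
    private let input: [Character]
    private var position = 0

    init(_ text: String) {
        // Drop whitespace up front to keep the sketch short.
        input = Array(text).filter { !$0.isWhitespace }
    }

    // Consumes the next character if it is one of `candidates`.
    private mutating func consume(anyOf candidates: String) -> Character? {
        guard position < input.count, candidates.contains(input[position]) else { return nil }
        defer { position += 1 }
        return input[position]
    }

    // Number: a run of ASCII digits and dots that Double can parse.
    private mutating func number() -> Double? {
        let start = position
        while position < input.count, "0123456789.".contains(input[position]) {
            position += 1
        }
        guard let value = Double(String(input[start..<position])) else {
            position = start  // backtrack on failure
            return nil
        }
        return value
    }

    // Factor <- Number / '(' Expr ')'
    private mutating func factor() -> Double? {
        if let value = number() { return value }
        let start = position
        if consume(anyOf: "(") != nil, let value = expr(), consume(anyOf: ")") != nil {
            return value
        }
        position = start  // backtrack on failure
        return nil
    }

    // Term <- Factor (('*' / '/') Factor)*   (folds left to right)
    private mutating func term() -> Double? {
        guard var result = factor() else { return nil }
        var checkpoint = position
        while let op = consume(anyOf: "*/"), let rhs = factor() {
            result = op == "*" ? result * rhs : result / rhs
            checkpoint = position
        }
        position = checkpoint  // undo a dangling operator, if any
        return result
    }

    // Expr <- Term (('+' / '-') Term)*   (folds left to right)
    private mutating func expr() -> Double? {
        guard var result = term() else { return nil }
        var checkpoint = position
        while let op = consume(anyOf: "+-"), let rhs = term() {
            result = op == "+" ? result + rhs : result - rhs
            checkpoint = position
        }
        position = checkpoint  // undo a dangling operator, if any
        return result
    }

    // Parses the whole input; returns nil if anything is left over.
    mutating func parse() -> Double? {
        guard let value = expr(), position == input.count else { return nil }
        return value
    }
}

var parser = ArithmeticParser("1 + 2 * (3 - 1)")
let result = parser.parse()  // 5.0
```

Every new precedence level means another near-identical rule and another backtracking checkpoint, and there is no error reporting at all yet. That duplication is the kind of thing the annotation approach aims to remove.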

I realize this is all unsatisfactorily abstract and vague at this point. We have fundamental building blocks to do first.

Any progress?
I came here because I think Swift's current regular expression support sucks.
JavaScript and Python have much better regex syntax.
Using NSRegularExpression seems overcomplicated.

I searched for "regular expressions" on this forum
https://forums.swift.org/search?q=regular%20expression
and found 3-5 discussions, which is good.

I hope the next Swift version (Swift 5.1?) makes the regex API much better.

Thanks