Interesting to see regular expressions bubble to the surface again. It’s clearly an itch evolution needs to scratch. The NSRegularExpression class is nowhere near as lightweight as regular expressions deserve to be, and while it could be improved with a few well-chosen extensions, its use of NSString under the covers makes it subject to speed regressions if you try to process large amounts of data. I’d like to throw a few ideas into the mix.
First, for me, it is not regex literals themselves that are interesting, though it would be nice if they could be validated at compile time by flagging them with a special syntax. /regex/ seems a good precedent to pick up, if not particularly Swifty. It would be subject to many of the escaping concerns of strings themselves, so I’d suggest the option of an analogue to raw strings, something like #/regex/#, to use in practice.
What is of interest is how regexes combine for the basic operations of match, iterate and replace when operating on a target string, and I’d like to float, one more time, the idea of subscripting into a string with a regex. Why on earth subscripts, I hear you say? They have the advantage that they are atomic and not subject to operator precedence, but they also have a unique property: while a subscript is effectively a function call, it is one that can also be assigned to through its setter. Bear with me. Certainly it’s an idea that takes some getting used to, but remember there was once a time when subscripts were only for arrays, and accessing dictionary values using subscript syntax was novel and perhaps non-intuitive. It is not currently an idiom in any language I’m aware of, but there’s almost a mangled logic to it: what else would you use to refer to a non-trivial range in a String other than a regular expression?
Fleshing out the idea...
Since generic subscripts became available in Swift, it has been possible to concoct something along the lines of:
let datePattern = #"(\d{4})-(\d{2})-(\d{2})"#
let date = "2018-01-01"
if let (year, month, day): (String, String, String) = date[datePattern] {
    print( year, month, day )
}
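For comparison, here is roughly what the same match looks like written directly against NSRegularExpression. This is only a sketch to show the ceremony involved; the function name `dateComponents(in:)` is made up for the illustration.

```swift
import Foundation

// Hypothetical helper: extract year, month and day using NSRegularExpression directly.
func dateComponents(in text: String) -> (String, String, String)? {
    let pattern = #"(\d{4})-(\d{2})-(\d{2})"#
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    // NSRegularExpression works in NSRange (UTF-16) coordinates,
    // so ranges have to be converted in both directions.
    let fullRange = NSRange(text.startIndex..., in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: fullRange),
          let year = Range(match.range(at: 1), in: text),
          let month = Range(match.range(at: 2), in: text),
          let day = Range(match.range(at: 3), in: text)
    else { return nil }
    return (String(text[year]), String(text[month]), String(text[day]))
}

if let (year, month, day) = dateComponents(in: "2018-01-01") {
    print(year, month, day)
}
```

Most of the body is range bookkeeping rather than anything to do with the pattern, which is the part a subscript can hide.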
Using the symmetry of subscripts, it’s possible to write a tuple back into a string:
var date2 = "0000-00-00"
date2[datePattern] = ("2018", "01", "01")
XCTAssertEqual(date, date2)
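To give a feel for how a subscript with both a getter and a setter could be built, here is a minimal sketch on top of NSRegularExpression. It is not the package's implementation: the `groups:` label is hypothetical, it uses an array of strings rather than a generic tuple, and it assumes the capture groups are not nested.

```swift
import Foundation

extension String {
    // Hypothetical labeled subscript: reads or replaces the capture groups
    // of the first match of `pattern` in the string.
    subscript(groups pattern: String) -> [String]? {
        get {
            guard let regex = try? NSRegularExpression(pattern: pattern),
                  let match = regex.firstMatch(in: self, range: NSRange(startIndex..., in: self))
            else { return nil }
            // Index 0 is the whole match, so the groups start at 1.
            return (1..<match.numberOfRanges).map { index in
                Range(match.range(at: index), in: self).map { String(self[$0]) } ?? ""
            }
        }
        set {
            guard let replacements = newValue,
                  let regex = try? NSRegularExpression(pattern: pattern),
                  let match = regex.firstMatch(in: self, range: NSRange(startIndex..., in: self))
            else { return }
            // Replace from the last group backwards so the offsets of
            // earlier groups stay valid while the string changes.
            for index in (1..<match.numberOfRanges).reversed() where index <= replacements.count {
                if let range = Range(match.range(at: index), in: self) {
                    replaceSubrange(range, with: replacements[index - 1])
                }
            }
        }
    }
}

let datePattern = #"(\d{4})-(\d{2})-(\d{2})"#
var date2 = "0000-00-00"
date2[groups: datePattern] = ["2018", "01", "01"]
print(date2)
```

The getter and setter share the same pattern and the same notion of "the matched groups", which is the symmetry that makes assignment through a subscript read naturally.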
Finally, for iterating, the following also works:
let dates = "2018-01-01 2019-02-02 2020-03-03"
for (year, month, day): (String, String, String) in dates[datePattern] {
    print( year, month, day )
}
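Underneath, iteration comes down to collecting the groups of every match. A rough sketch of that step, assuming a hypothetical free function `allGroups(of:in:)` that returns a plain array rather than the lazy sequence a real implementation might prefer:

```swift
import Foundation

// Hypothetical helper: the capture groups of every match of `pattern`, in order.
func allGroups(of pattern: String, in text: String) -> [[String]] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).map { match -> [String] in
        // Index 0 is the whole match, so the groups start at 1.
        (1..<match.numberOfRanges).map { index in
            Range(match.range(at: index), in: text).map { String(text[$0]) } ?? ""
        }
    }
}

let dates = "2018-01-01 2019-02-02 2020-03-03"
for groups in allGroups(of: #"(\d{4})-(\d{2})-(\d{2})"#, in: dates) {
    print(groups)
}
```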
This isn’t a flight of fantasy. All these constructs already work with Swift as-is using this package. Sure, it would be nice to type check that the number of groups in a pattern matches the number of elements in the tuple being used, and perhaps to have named capture groups or even types other than String, but the two ideas are complementary, and literals with more smarts can be worked into a more ambitious plan later.
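Until the compiler can do that check, the group count can at least be verified at run time, since NSRegularExpression exposes it as `numberOfCaptureGroups`. A small sketch, with a hypothetical helper name:

```swift
import Foundation

// Hypothetical runtime check: does `pattern` declare exactly `expected` capture groups?
// Returns false if the pattern does not compile.
func hasGroupCount(_ pattern: String, _ expected: Int) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return false }
    return regex.numberOfCaptureGroups == expected
}

print(hasGroupCount(#"(\d{4})-(\d{2})-(\d{2})"#, 3))
```

A smarter literal could move this comparison from run time to compile time, which is where the two ideas meet.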