I've been playing with Regex lately, and while I love it, I may be missing something: it seems that named captures cannot be fetched dynamically. Let me give you an example:
Say you're fetching some data from a string - for example, you have lines in a log and you need to extract an ID from various lines.
The issue here is that the line formats have evolved over time, so it's not feasible to fit all possible formats into a single regex. Instead, you have multiple regexes that have one thing in common - there's a capture named "id". Examples:
/particle-(?<id>\d+)/
/uuid: (?<id>[a-f\d]+)/
This works great for small regexes like the above, whose output type is (Substring, id: Substring), so you can put them into an array and iterate.
The issue arises when one of the regexes adds an additional capture:
/(foo|bar)=(?<id>\d+)/
Suddenly, the output type is (Substring, Substring, id: Substring) and this regex can no longer be added to the list.
Yes, it's an easy fix for this particular example:
/(?:foo|bar)=(?<id>\d+)/
... by making the first group non-capturing. But this will not work if I have two groups that I want to capture:
/(?<name>particle)-(?<id>\d+)/
/(?<name>uuid): (?<id>[a-f\d]+)/
Now what about
/(?<id>\d+): (?<name>\w+)/
The reversed order of ID and name disqualifies the regex from being in an array with the others - even though it matches exactly the same-named groups...
I've thought about this and came up with a few possible solutions:
- Dynamic look-up:
let myRegexes: [some RegexComponent] = [...]
for regex in myRegexes {
    guard
        let match = foo.wholeMatch(of: regex),
        let id = match["id"],     // Substring
        let name = match["name"]  // Substring
    else {
        continue
    }
    ...
}
The regex match would allow fetching the value of a named group by name. If the match doesn't contain that group, nil would be returned - or an error would be thrown...
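As far as I can tell, something close to this may already be expressible by erasing each regex to Regex<AnyRegexOutput>, whose matches can be subscripted by capture name. The sketch below assumes Swift 5.7+; the names idRegexes and extractID are made up for illustration, and I use the #/.../# delimiters only so that no bare-slash compiler flag is needed.

```swift
// Erase each regex's tuple output so differently shaped regexes share one type.
let idRegexes: [Regex<AnyRegexOutput>] = [
    Regex(#/particle-(?<id>\d+)/#),
    Regex(#/uuid: (?<id>[a-f\d]+)/#),
    Regex(#/(foo|bar)=(?<id>\d+)/#),      // extra unnamed capture is fine here
    Regex(#/(?<id>\d+): (?<name>\w+)/#),  // so is a different capture order
]

/// Hypothetical helper: returns the "id" capture of the first regex
/// that matches the whole line, or nil if none does.
func extractID(from line: String, using regexes: [Regex<AnyRegexOutput>]) -> Substring? {
    for regex in regexes {
        // match["id"] is nil when the regex has no capture with that name;
        // .substring is nil when the capture did not participate in the match.
        if let match = line.wholeMatch(of: regex),
           let id = match["id"]?.substring {
            return id
        }
    }
    return nil
}
```

The trade-off is that the lookup is by string, so a typo in "id" only shows up at runtime as a nil, not at compile time.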
- Auto-conforming protocols:
@autoconforming
protocol NameAndID {
    var id: Substring { get }
    var name: Substring { get }
}
let myRegexes: [Regex<NameAndID>] = [ ... ]
The idea here is to define an interface that values automatically conform to if they have the required fields. The conformance would be emitted by the compiler when such a value is passed to something that requires it.
The reason for this is that the outputs of a regex are tuples, which cannot (to my knowledge) be extended to conform to protocols. This way, (Substring, id: Substring, name: Substring) would automatically conform to NameAndID, and so would (Substring, name: Substring, id: Substring), even though the id and name fields are in a different order. The same would go for (Substring, name: Substring, Substring, Substring, id: Substring).
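Without such a feature, the closest statically typed approximation I can think of is to do the "conformance" by hand: wrap each regex in a closure that projects its own tuple into one shared type. This is only a sketch under my own assumptions; NameAndID is a plain struct here rather than the protocol above, and extractor and parse are hypothetical helper names.

```swift
/// Hypothetical common shape; plays the role of the NameAndID protocol.
struct NameAndID: Equatable {
    var id: Substring
    var name: Substring
}

/// Wraps a regex of any output shape in a closure with one shared signature.
/// `project` picks the two captures out of that regex's own tuple type.
func extractor<Output>(
    _ regex: Regex<Output>,
    _ project: @escaping (Output) -> NameAndID
) -> (String) -> NameAndID? {
    { line in line.wholeMatch(of: regex).map { project($0.output) } }
}

// Capture order no longer matters: each closure reads its fields by label.
let extractors: [(String) -> NameAndID?] = [
    extractor(#/(?<name>particle)-(?<id>\d+)/#) { NameAndID(id: $0.id, name: $0.name) },
    extractor(#/(?<name>uuid): (?<id>[a-f\d]+)/#) { NameAndID(id: $0.id, name: $0.name) },
    extractor(#/(?<id>\d+): (?<name>\w+)/#) { NameAndID(id: $0.id, name: $0.name) },
]

/// Returns the result of the first extractor that matches the whole line.
func parse(_ line: String, with extractors: [(String) -> NameAndID?]) -> NameAndID? {
    for extract in extractors {
        if let result = extract(line) { return result }
    }
    return nil
}
```

The field names are checked by the compiler, but the projection closure has to be repeated for every regex, which is exactly the boilerplate an auto-conforming protocol would remove.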
Or am I simply missing some very simple solution?