RegexBuilder combine multiple Captures

I've been using the Advent of Code as an excuse to get familiar with RegexBuilders.

In Day 4's example we're given a list of values like 2-4,6-8, which should be treated as two ranges: one from 2 through 4 and the other from 6 through 8.

I have a working solution for both parts, but I'd like to refine my parsing logic if possible.

Here's how I'm currently parsing each set of ranges:

  private static func parse(_ string: String) -> [Pair] {
    string.matches {
      TryCapture(OneOrMore(.digit), transform: { Int($0) })
      "-"
      TryCapture(OneOrMore(.digit), transform: { Int($0) })
      ","
      TryCapture(OneOrMore(.digit), transform: { Int($0) })
      "-"
      TryCapture(OneOrMore(.digit), transform: { Int($0) })
    }
    .map { match in
      Pair(
        first: Set(match.1 ... match.2),
        second: Set(match.3 ... match.4)
      )
    }
  }

Is it possible to reuse the 2-4 into Set parsing to write something like this?

  private static func parse(_ string: String) -> [Pair] {
    let set = Regex {
      TryCapture(OneOrMore(.digit), transform: { Int($0) })
      "-"
      TryCapture(OneOrMore(.digit), transform: { Int($0) })
    }
    .mapOutput { (_, lowerBound, upperBound) in
      Set(lowerBound ... upperBound)
    }

    return string.matches {
      set
      ","
      set
    }
    .map { (_, first, second) in
      Pair(first: first, second: second)
    }
  }

mapOutput is mentioned in SE-0351 (swift-evolution/0351-regex-builder.md in apple/swift-evolution on GitHub), but the second version of parse above won't compile for me.

I'm using Swift 5.7 in Xcode 14.1 on macOS Ventura 13.0.1.


After a quick skim over the Swift standard library (where SE-0351 promises that mapOutput(_:) should be defined) as well as the Experimental String Processing Library, I couldn't find this method anywhere. Maybe it got forgotten in the implementation? @Michael_Ilseman

I also ran into a similar issue while completing the Advent of Code challenge. Based on an example here, embedding captures should work, but in my code it did not.

The following code typechecks but produces a runtime error (Could not cast value of type 'Swift.Substring' to '(Swift.Substring, Swift.Int, Swift.Int)'):

let regex: Regex<(Substring, ClosedRange<Int>)> = Regex {
    Capture {
        TryCapture {
            OneOrMore(.digit)
        } transform: {
            Int($0)
        }
        "-"
        TryCapture {
            OneOrMore(.digit)
        } transform: {
            Int($0)
        }
    } transform: {
        $0.1...$0.2
    }
}

try regex.firstMatch(in: "2-3")
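
One possible workaround for both the missing mapOutput and the cast failure above is to put a single capture around the whole "2-4" span, with no captures nested inside it, and do the conversion in that capture's transform. This is only a sketch: sectionSet(from:), sectionRange() and parsePair(_:) are hypothetical names made up for illustration, and it assumes that a transform on a capture without nested captures receives the matched Substring.

```swift
import RegexBuilder

// Hypothetical helper (not part of RegexBuilder): turns "2-4" into Set([2, 3, 4]).
func sectionSet(from text: Substring) -> Set<Int>? {
    let bounds = text.split(separator: "-").compactMap { Int($0) }
    guard bounds.count == 2, bounds[0] <= bounds[1] else { return nil }
    return Set(bounds[0]...bounds[1])
}

// One capture around the whole "2-4" span; nothing is captured inside it,
// so the transform only ever sees the matched Substring.
func sectionRange() -> TryCapture<(Substring, Set<Int>)> {
    TryCapture {
        OneOrMore(.digit)
        "-"
        OneOrMore(.digit)
    } transform: { sectionSet(from: $0) }
}

// Hypothetical wrapper: parses a line such as "2-4,6-8" into its two sets.
func parsePair(_ line: String) -> (first: Set<Int>, second: Set<Int>)? {
    let range = sectionRange()
    // The same component is used twice; each use adds one Set<Int> to the output.
    let pairRegex = Regex {
        range
        ","
        range
    }
    guard let match = line.wholeMatch(of: pairRegex) else { return nil }
    return (match.output.1, match.output.2)
}
```

The trade-off is that the digits get split a second time inside the helper instead of being captured individually, but the reusable piece stays in one place.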

Oddly enough, I'm also here because of Advent of Code! It was today's challenge (Day 5) that got me to try RegexBuilder; sadly, I don't see any way to get what I'm after in this case. Given this builder code:

let crate = Regex
{
    ChoiceOf
    {
        "   "
        Regex
        {
            "["
            Capture(One(/[A-Z]/))
            "]"
        }
    }
}

let crateLine = Regex
{
    OneOrMore
    {
        crate
        Optionally(" ")
    }
}

...my output looks like this:

("    [D]    ", Optional(Optional("D")), nil, nil, nil)
("[N] [C]    ", Optional(Optional("C")), nil, nil, nil)
("[Z] [M] [P]", Optional(Optional("P")), nil, nil, nil)

The capture output is overwritten with each repetition, and only the last value is preserved. This is typical of regex engines, so I'm not surprised, but I was hopeful that there might just maybe be a way to get a collection back. Alas.

If there's a different way to structure this to get the desired result, feel free to educate me!

--Matt


In case anyone was wondering, the extra nils on the end of the output there are from additional regex not shown in the code above (to capture subsequent "move" information from the challenge's input).
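
One way to get a collection back, sketched here under a few assumptions, is to drop the OneOrMore wrapper and instead run the single-slot regex over the line with matches(of:), which returns every non-overlapping match with its own capture. The function name crates(in:) is made up for illustration, CharacterClass.anyOf stands in for the /[A-Z]/ literal so the snippet does not depend on bare-slash regex syntax being enabled, and the slot layout (three characters wide, separated by single spaces) is inferred from the sample output above.

```swift
import RegexBuilder

// Returns one entry per crate slot: the crate letter, or nil for an empty slot.
// Assumes slots are three characters wide and separated by single spaces.
func crates(in line: String) -> [Character?] {
    let slot = Regex {
        ChoiceOf {
            "   "  // an empty slot
            Regex {
                "["
                // Stands in for /[A-Z]/ from the original builder code.
                Capture(One(CharacterClass.anyOf("ABCDEFGHIJKLMNOPQRSTUVWXYZ")))
                "]"
            }
        }
        Optionally(" ")  // separator between slots
    }
    // Each match is one slot, so no capture is overwritten by a later repetition.
    // Text that fits neither alternative is skipped rather than reported as an error.
    return line.matches(of: slot).map { $0.output.1?.first }
}
```

With this shape, the position in the returned array corresponds to the stack index. The "move" lines from the puzzle input would still need their own regex.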