I'm trying to replace NSRegularExpression with Regex, but there is one situation that I can't figure out how to solve using Regex. Here's what I'm trying to do:
Given the string "aaa", I want the pattern /^a/
to match if scanning starts at position 0,
but not if scanning starts at a position > 0.
A pattern that doesn't match at the start of the scan range should fail immediately (prefix match).
With NSRegularExpression this is possible through a combination of the options and range arguments. A code example:
import Foundation
let str = "aaa"
let nsregex = try! NSRegularExpression(pattern: "^a", options: [])
// case 1
let case1Range = NSRange(location: 0, length: str.utf16.count) // NSRange counts UTF-16 code units
let case1 = nsregex.firstMatch(in: str, options: [.anchored, .withoutAnchoringBounds], range: case1Range)
print("case1", case1?.range ?? "No match")
// case 2
let case2Range = NSRange(location: 1, length: str.utf16.count - 1) // NSRange counts UTF-16 code units
let case2 = nsregex.firstMatch(in: str, options: [.anchored, .withoutAnchoringBounds], range: case2Range)
print("case2", case2?.range ?? "No match")
// The same two cases using Swift Regex
let regex = /^a/.anchorsMatchLineEndings(true)
// case 1
let case1b = try! regex.prefixMatch(in: str)
print("case1b", case1b?.output ?? "No match")
// case 2
let case2bRange = str.index(after: str.startIndex)..<str.endIndex
let case2b = try! regex.prefixMatch(in: str[case2bRange])
print("case2b", case2b?.output ?? "No match") // D'oh
Running the code prints:
case1 {0, 1} // OK
case2 No match // OK
case1b a // OK
case2b a // Not OK
Is there some way to get the desired behaviour from Regex or do I have to stay with NSRegularExpression?
FWIW, the patterns are read from a file (and are beyond my control), so RegexBuilder is not an option.
The withoutAnchoringBounds option that this behavior relies on is an NSRegularExpression-specific feature, and Swift's Regex has no corresponding option. With Regex, a substring is always treated as the full source in which to pattern match, so a start-of-subject anchor like ^ is always going to match at the beginning of the substring.
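To make that concrete, here is a minimal sketch of the situation, assuming the pattern arrives as a string (as it would when read from a file) rather than as a regex literal. The names text, tail and loaded are only for illustration. The substring keeps its position within the base string, but per the explanation above the regex is still expected to treat the substring's start as the start of the subject:

```swift
let text = "aaa"
let tail = text.dropFirst()   // Substring covering the last two characters

// A Substring shares indices with its base string, so its start index
// is not the base string's start index.
print(tail.startIndex == text.startIndex)   // false
print(tail.base == text)                    // true

// A pattern built from a string at run time (illustrative stand-in for
// one read from a file).
let loaded = try! Regex("^a")

// The regex sees only the substring as its subject, so ^ is expected to
// match at the substring's first character.
print(tail.prefixMatch(of: loaded) != nil)
```

The index relationship in the first two prints is what makes it possible to tell, from inside a match, whether the current position is the start of the base string.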
I don't know the full parameters of your use case, but one fix that I can see is to implement a custom regex component that only succeeds when at the start of the base string. (You mention that you can't use RegexBuilder, but you should be able to use a builder regex to compose your custom component with your sourced-from-elsewhere regex.)
import RegexBuilder
// A regex anchor that matches only at the start of a base string.
struct StartOfBaseStringAnchor: CustomConsumingRegexComponent {
    typealias RegexOutput = Substring

    func consuming(
        _ input: String,
        startingAt index: String.Index,
        in bounds: Range<String.Index>
    ) throws -> (upperBound: String.Index, output: Substring)? {
        if input.startIndex == index {
            (index, input[index..<index])
        } else {
            nil
        }
    }
}
// <snip from above>
let customRegex = Regex {
StartOfBaseStringAnchor()
regex
}
// case 1
let case1c = try! customRegex.prefixMatch(in: str)
print("case1c", case1c?.output ?? "No match")
// case 2
let case2c = try! customRegex.prefixMatch(in: str[case2bRange])
print("case2c", case2c?.output ?? "No match")
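Since the patterns in the question come from a file, the same composition could be wrapped in a small helper that takes the pattern as a string. This is only a sketch under a few assumptions: anchoredAtBaseStart is a hypothetical name, BaseStartAnchor is a renamed copy of the component above (repeated so the snippet compiles on its own), and it assumes a run-time Regex can be placed inside a builder block, which I have not checked on every toolchain.

```swift
import RegexBuilder

// Same idea as StartOfBaseStringAnchor above, repeated under a different
// name so this sketch is self-contained.
struct BaseStartAnchor: CustomConsumingRegexComponent {
    typealias RegexOutput = Substring

    func consuming(
        _ input: String,
        startingAt index: String.Index,
        in bounds: Range<String.Index>
    ) throws -> (upperBound: String.Index, output: Substring)? {
        // Succeed with an empty match only at the very start of the base string.
        guard index == input.startIndex else { return nil }
        return (index, input[index..<index])
    }
}

// Hypothetical helper: compiles a pattern string and prefixes it with the anchor.
func anchoredAtBaseStart(_ pattern: String) throws -> some RegexComponent {
    let loaded = try Regex(pattern)   // run-time regex with type-erased output
    return Regex {
        BaseStartAnchor()
        loaded
    }
}

let source = "aaa"
let wrapped = try! anchoredAtBaseStart("^a")

// Scan from the start of the base string, then from offset 1.
let fromStart = source.prefixMatch(of: wrapped)
let fromOffset = source.dropFirst().prefixMatch(of: wrapped)
print("from start:", fromStart != nil)
print("from offset 1:", fromOffset != nil)
```

If the component behaves as described above, the scan from the start should match and the scan from offset 1 should not. The helper returns an opaque component, so callers would read match.range rather than rely on a specific output type.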