When I use word boundaries in a regular expression pattern, words containing a quote are lost.
For example, with the following code in an Xcode playground:
import Foundation

var pattern = "\\b\\w+\\b"
var sentence = "Let's go!"
print("sentence: \(sentence)")
print("pattern: \"\(pattern)\"")
var range = NSRange(sentence.startIndex..., in: sentence)
var regex = try! NSRegularExpression(pattern: pattern, options: [.useUnicodeWordBoundaries])
var matches = regex.matches(in: sentence, options: [], range: range)
matches.forEach({ match in
    guard let subrange = Range(match.range(at: 0), in: sentence) else {
        return
    }
    print(sentence[subrange])
})
pattern = "\\w+"
print("pattern: \"\(pattern)\"")
regex = try! NSRegularExpression(pattern: pattern, options: [.useUnicodeWordBoundaries])
matches = regex.matches(in: sentence, options: [], range: range)
matches.forEach({ match in
    guard let subrange = Range(match.range(at: 0), in: sentence) else {
        return
    }
    print(sentence[subrange])
})
With the pattern \b\w+\b, the "Let's" part of the sentence is lost. That's not the case with the pattern \w+. I don't understand why "Let's" is lost with the first pattern. I have no such problem with Python's regex.
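To narrow this down, here is a minimal sketch that runs the same \b\w+\b pattern twice, once with .useUnicodeWordBoundaries and once with no options, so the two results can be compared side by side. The helper name words(in:pattern:options:) is hypothetical and only for illustration. It assumes, without my having confirmed it, that the option is what changes where \b matches.

```swift
import Foundation

// Hypothetical helper (not part of the code above): returns every match of
// `pattern` in `text` as a String, using the given regex options.
func words(in text: String, pattern: String, options: NSRegularExpression.Options) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: options) else {
        return []
    }
    let range = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, options: [], range: range).compactMap { match in
        // Convert the NSRange back to a String range; skip the match if that fails.
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

let sentence = "Let's go!"
let pattern = "\\b\\w+\\b"

// Same pattern and input; only the word-boundary option differs.
let withUnicodeBoundaries = words(in: sentence, pattern: pattern, options: [.useUnicodeWordBoundaries])
let withDefaultBoundaries = words(in: sentence, pattern: pattern, options: [])

print("with .useUnicodeWordBoundaries:", withUnicodeBoundaries)
print("without the option:", withDefaultBoundaries)
```

If the two arrays differ, the difference would come from how \b is interpreted under that option rather than from \w+ itself.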
(macOS 11.6 20G16, Xcode 12.5.1 12E507, Swift 5.4.2)