I have a table named WordBox that contains a column named menukad. I can filter the wordboxes whose menukad starts with the string $searchText with this simple query:
static func matchingMenukads(_ searchText: String) -> QueryInterfaceRequest<WordBox> {
    // Prefix match: keep the rows whose menukad starts with searchText
    return WordBox
        .filter(Column("menukad").like("\(searchText)%"))
}
Before going further, some context:
In Hebrew, a menukad is a word written with its nekkudot. What are nekkudot?
As some may know, in Hebrew the vowel sounds are written as little dots or bars above or below the actual letters (e.g. ָ ֲ ). These are called nekkudot (which means "dots" in Hebrew).
Example for the verb 'to eat' (pronounced le'ekhol):
לֶאֱכוֹל with nekkudot
לאכול without nekkudot
In everyday writing, Hebrew is almost always written without the nekkudot.
I explain all of this because I am building a search engine that should look for $searchText inside several columns of my database, including the menukad column.
The menukad column contains the words with nekkudot but, as we rarely write the nekkudot in Hebrew, the $searchText will often not contain them. So I need to be able to match $searchText against the menukad values with their nekkudot ignored.
Technically, nekkudot are encoded as Unicode combining marks, so you can write a String method that removes them from a string:
import Foundation

extension String {
    func removeNekkudot() -> String {
        // \p{M} matches Unicode characters classified as "marks", which includes diacritics such as the Hebrew nekkudot
        let pattern = "\\p{M}"
        let regex = try! NSRegularExpression(pattern: pattern, options: [])
        let range = NSRange(location: 0, length: self.utf16.count)
        return regex.stringByReplacingMatches(in: self, options: [], range: range, withTemplate: "")
    }
}
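To make the intended matching concrete, here is a minimal in-memory sketch of the behaviour I am after, outside of any database. The method is repeated so the snippet runs on its own. The names menukad, plain, and searchText are illustrative values only, and the Hebrew strings are written as Unicode scalar escapes for the 'to eat' example.

```swift
import Foundation

extension String {
    // Same method as above, repeated so this snippet is self-contained.
    func removeNekkudot() -> String {
        let regex = try! NSRegularExpression(pattern: "\\p{M}", options: [])
        let range = NSRange(location: 0, length: utf16.count)
        return regex.stringByReplacingMatches(in: self, options: [], range: range, withTemplate: "")
    }
}

// 'to eat' with nekkudot: lamed+segol, alef+hataf segol, kaf, vav+holam, lamed
let menukad = "\u{05DC}\u{05B6}\u{05D0}\u{05B1}\u{05DB}\u{05D5}\u{05B9}\u{05DC}"
// The same word without nekkudot
let plain = "\u{05DC}\u{05D0}\u{05DB}\u{05D5}\u{05DC}"
// Hypothetical user input: the first three letters, typed without nekkudot
let searchText = "\u{05DC}\u{05D0}\u{05DB}"

// Strip the marks from the stored value, then do the prefix comparison
let stripped = menukad.removeNekkudot()
print(stripped == plain)                                // true
print(stripped.hasPrefix(searchText.removeNekkudot()))  // true
```

This is the comparison I would like to express in the SQL query, instead of loading every row into memory and filtering in Swift.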
So I want to match my $searchText against menukad either with or without its nekkudot.
First solution: I could duplicate the menukad column and remove all the nekkudot from the copy. It would work, but I would be duplicating data...
Question
So I would like to know whether I could use removeNekkudot to efficiently match my menukad fields against my searchText directly in SQL. Is that feasible using GRDB?