TL;DR: Swift's various insensitive searches (case-, diacritic-, and width-insensitive) are inadequate for searching lists of items presented to users.
I use an iPad app that manages a library of songs. I was having some trouble with the search feature and an attempt to show the developer how it ought to work led me down this rabbit hole. The simplest version of this problem is:
Currently, someone typing the search string “Bo's So” will never find a song called “Bo’s Song” in their library, because the two apostrophes are different characters: the query uses the straight apostrophe (U+0027) and the title uses the typographic one (U+2019).
I'm not interested in calling either style “wrong”: both occur all the time in practice, and in some cases one or the other may be very difficult to type on an iPad.
This isn't a case, diacritic, or width difference, so none of the standard tools for normalizing strings or localized insensitive search work. The simplest hack I can think of is to strip all punctuation from both strings before doing a containment check.
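To make that hack concrete, here is a minimal sketch. The helper names `strippingPunctuation` and `looselyContains` are mine, not anything from Foundation or the app in question, and I've folded in lowercasing as a stand-in for case insensitivity.

```swift
import Foundation

/// Hypothetical helper: drops every Character that Unicode classifies as punctuation.
func strippingPunctuation(_ text: String) -> String {
    text.filter { !$0.isPunctuation }
}

/// Hypothetical helper: a containment check that ignores punctuation and case.
func looselyContains(_ haystack: String, _ needle: String) -> Bool {
    strippingPunctuation(haystack).lowercased()
        .contains(strippingPunctuation(needle).lowercased())
}

let title = "Bo\u{2019}s Song" // typographic apostrophe, U+2019
let query = "Bo's So"          // straight apostrophe, U+0027

// The standard options fold case, diacritics, and width only, so per the
// behavior described above this is expected to find nothing.
let standardMatch = title.range(
    of: query,
    options: [.caseInsensitive, .diacriticInsensitive, .widthInsensitive])

// With punctuation stripped, both sides reduce to letters and spaces.
let hackMatch = looselyContains(title, query)
```

Both apostrophes are removed before the comparison, so it no longer matters which one the user typed.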
But the problem is knottier than that simple hack can handle: I work with lots of Hawaiian songs, whose names often include ʻokinas (the ʻokina represents a glottal stop and is considered a letter in Hawaiian). Now, the official character for an ʻokina is U+02BB, MODIFIER LETTER TURNED COMMA, which, as its name implies, is classified as a modifier letter, not as punctuation. On an iPad that character appears to be impossible to type (even with a Hawaiian keyboard, unlike on iPhone!), and even if it were merely inconvenient, everyone is going to type some convenient, similar-looking punctuation character instead. Because the ʻokina isn't punctuation, the hack described above won't work: you'll never find "ʻuliʻuli" (with ʻokinas) by searching for "'uli'uli" (with apostrophes), because stripping punctuation reduces the query to "uliuli", which is not a substring of "ʻuliʻuli".
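Here is a small sketch of that failure, using the standard library's Unicode properties to check the classification. `strippingPunctuation` is a hypothetical helper of mine, not an existing API.

```swift
import Foundation

// The ʻokina, U+02BB MODIFIER LETTER TURNED COMMA.
let okina: Unicode.Scalar = "\u{02BB}"

// Unicode puts it in general category Lm (modifier letter), not in any
// punctuation category.
let isModifierLetter = okina.properties.generalCategory == .modifierLetter
let isPunctuation = Character(okina).isPunctuation

/// Hypothetical helper: drops every Character that Unicode classifies as punctuation.
func strippingPunctuation(_ text: String) -> String {
    text.filter { !$0.isPunctuation }
}

let title = "\u{02BB}uli\u{02BB}uli" // with ʻokinas
let query = "'uli'uli"               // with straight apostrophes

let strippedTitle = strippingPunctuation(title) // unchanged: the ʻokinas survive
let strippedQuery = strippingPunctuation(query) // "uliuli"
let found = strippedTitle.contains(strippedQuery)
```

The query loses its apostrophes but the title keeps its ʻokinas, so the containment check comes back false.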
Looking for a general solution, I checked the Unicode category of modifier letters and found that many of them don't look like punctuation, so a quick hack based on the Unicode category isn't viable. I ended up special-casing ʻokinas to treat them as punctuation, but that was pretty unsatisfying. It seems to me that a general solution would normalize each modifier letter into some other, less esoteric character that looks similar.
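For reference, the special case looks roughly like this. It's a sketch under my own naming (`strippingPunctuationAndOkinas`, `looselyContains`), not the code from any shipping app.

```swift
import Foundation

// The ʻokina, U+02BB MODIFIER LETTER TURNED COMMA.
let okina: Character = "\u{02BB}"

/// Hypothetical helper: drops punctuation, plus the one special-cased letter.
func strippingPunctuationAndOkinas(_ text: String) -> String {
    text.filter { !$0.isPunctuation && $0 != okina }
}

/// Hypothetical helper: containment that ignores punctuation, ʻokinas, and case.
func looselyContains(_ haystack: String, _ needle: String) -> Bool {
    strippingPunctuationAndOkinas(haystack).lowercased()
        .contains(strippingPunctuationAndOkinas(needle).lowercased())
}

// Both sides now reduce to "uliuli", so the search succeeds.
let found = looselyContains("\u{02BB}uli\u{02BB}uli", "'uli'uli")
```

It works for this one character, but every other hard-to-type look-alike would need its own entry.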
All that said, the Modifier Letters category is an artificial grouping. I bet there are plenty of other Unicode characters that are hard to type and are commonly represented by other, easy-to-type characters. My general claim is that Swift should have facilities for better handling this use case, and my justification for it being in scope is that exactly the same motivation drives the existence of diacritic- and width-insensitive searching. It shouldn't be up to this song library application to discover which set of hacks will work out in practice, because the author of the book library application will have to discover the same set.
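To give a feel for what I mean, here is a rough sketch of a "look-alike-insensitive" key layered on top of Foundation's existing `folding(options:locale:)`. The `lookAlikes` table is hypothetical and hand-picked by me: it is not defined by Unicode, it is nowhere near complete, and `searchKey` and `looselyContains` are made-up names.

```swift
import Foundation

// Hypothetical, hand-picked table: hard-to-type characters mapped to the
// easy-to-type character people tend to substitute for them.
let lookAlikes: [Character: Character] = [
    "\u{02BB}": "'", // MODIFIER LETTER TURNED COMMA (ʻokina)
    "\u{02BC}": "'", // MODIFIER LETTER APOSTROPHE
    "\u{2018}": "'", // LEFT SINGLE QUOTATION MARK
    "\u{2019}": "'", // RIGHT SINGLE QUOTATION MARK
]

/// Hypothetical helper: replaces look-alikes first, then applies the
/// standard case, diacritic, and width folding.
func searchKey(_ text: String) -> String {
    let replaced = String(text.map { lookAlikes[$0] ?? $0 })
    return replaced.folding(
        options: [.caseInsensitive, .diacriticInsensitive, .widthInsensitive],
        locale: nil)
}

/// Hypothetical helper: containment on the folded keys.
func looselyContains(_ haystack: String, _ needle: String) -> Bool {
    searchKey(haystack).contains(searchKey(needle))
}

let foundSong = looselyContains("Bo\u{2019}s Song", "bo's so")
let foundUli = looselyContains("\u{02BB}uli\u{02BB}uli", "'uli'uli")
```

Unlike stripping, this keeps the ʻokina's position in the string, so "uliuli" without any apostrophes would not match. Whether that is the right behavior is exactly the kind of question I'd rather see answered once, centrally.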
Maybe the semantics of these searches ultimately ought to be specified by Unicode, but Swift can lead the way.
Thoughts?