Does dotless ı match I with Turkish locale in a case-insensitive regular expression?

Given this code:

import Foundation

extension String {
    func matches(_ pattern: String) -> Bool {
        return self.range(of: pattern, options: .regularExpression, locale: Locale(identifier: "tr")) != nil
    }
}

print("f\u{0131}le".matches("(?i)^[a-z]+$")) // false
print("f\u{0131}le".matches("(?i)^[a-z,\u{0131}]+$")) // true
print("f\u{0131}le".matches("(?i)^[fıle]+$")) // true
print("FLE".matches("(?i)^[fıle]+$")) // true
print("FILE".matches("(?i)^[fıle]+$")) // false

I would have expected "FILE" to match "(?i)^[fıle]+$" (that is, the pattern with a dotless ı), given the Turkish locale. What am I missing?


I haven't tested it, but I guess you have to change your options parameter to

[.regularExpression, .caseInsensitive]

At least the documentation says that it should work:

For example, for the Turkish locale, case-insensitive compare matches “I” to “ı” ( U+0131 LATIN SMALL DOTLESS I ), not the normal “i” character.

See: NSString - range(of:options:range:locale:)
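For anyone who wants to check this, here is a minimal sketch of what the documentation describes. It assumes Foundation's locale-aware case mapping (lowercased(with:) and uppercased(with:)) follows the Turkish rules, and then tries the plain, non-regex case-insensitive search with the same locale. I am not stating the result of that last call, since it may differ by platform and Foundation version.

```swift
import Foundation

let turkish = Locale(identifier: "tr")

// Locale-aware case mapping: in Turkish, "I" lowercases to dotless "ı" (U+0131)
// and "i" uppercases to dotted "İ" (U+0130).
let lowerI = "I".lowercased(with: turkish)
let upperI = "i".uppercased(with: turkish)
print(lowerI == "\u{0131}")
print(upperI == "\u{0130}")

// Plain (non-regex) case-insensitive search with the same locale.
// The documentation suggests this should find a match; result not asserted here.
let found = "FILE".range(of: "f\u{0131}le", options: .caseInsensitive, locale: turkish) != nil
print(found)
```

If the case mapping behaves as expected but the regex search still fails, that points at the regular-expression path ignoring the locale rather than at the locale data itself.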

It doesn't work any better with

[.regularExpression, .caseInsensitive]

In any case, (?i) is supposed to do the same thing, and it does for

print("FLE".matches("(?i)^[fıle]+$")) // true

Then the actual behaviour is at odds with the documentation. That is a bug one way or the other. File one for Foundation: bugs.swift.org.

Until it is fixed, you may be able to work around it by handling ı (and, if necessary, the other Turkish-specific I variants) separately.
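One possible shape for such a workaround, as a sketch only: lowercase the input with the Turkish locale first, then match against a lowercase pattern without (?i). The helper name matchesTurkishCaseInsensitive is made up for this example, and it assumes the pattern itself is written entirely in lowercase.

```swift
import Foundation

extension String {
    /// Hypothetical helper (not part of Foundation): folds the receiver to
    /// lowercase using Turkish rules, then runs a case-sensitive regex match.
    /// Assumes `pattern` is already written in lowercase.
    func matchesTurkishCaseInsensitive(_ pattern: String) -> Bool {
        let turkish = Locale(identifier: "tr")
        // With Turkish rules, "I" becomes "ı" and "İ" becomes "i".
        let folded = self.lowercased(with: turkish)
        return folded.range(of: pattern, options: .regularExpression) != nil
    }
}

print("FILE".matchesTurkishCaseInsensitive("^[fıle]+$")) // true
print("file".matchesTurkishCaseInsensitive("^[fıle]+$")) // false: dotted i is not in the class
```

This sidesteps the regex engine's own case folding, so the locale handling happens entirely in the case mapping step. The trade-off is that patterns containing uppercase literals would need to be lowercased the same way.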
