I've been working on this for several hours, and I still don't believe it, but I'm pretty sure of what I've found. It's hard to post a complete example, but I'll distill it down to the relevant parts.
// read in about 6K different file names, all of the form
// /some/long/prefix/<fileNameA>
// /some/long/prefix/<fileNameB> (etc)
let pathNames: [String] = <READ IN A WHOLE BUNCH OF FILE NAMES>
// dirName has the directory prefix of each of the read in files,
// with a trailing slash.
let dirName "/some/long/prefix/"
let slowFileNames = pathNames.map { ($0 as NSString).lastPathComponent }
let fastFileNames = pathNames.map { $0.replacingOccurrencesOf(dirName, "") }
// both of the above now look like [<fileNameA>, <fileNameB>, ...]
for i in 0..<slowFileNames.count {
if slowFileNames[i] != fastFileNames[i] {
fatalError("horrible death")
}
}
// ok, we got here, so i'm not an idiot: the arrays are the same.
// note: both of the arrays are of type [String]. slowFileNames is NOT [NSString].
let x1 = fastFileNames.sorted() // takes 0.02 seconds
let x2 = slowFileNames.sorted() // takes 0.82 seconds. WTF??
The only difference in the arrays fastFileNames and slowFileNames is how they were built, but at any level I can measure, they are identical. However, one sorts 40 times faster than the other.
I can reproduce this all day long, on iOS or macOS with Xcode 11.3.1.
The clear implication is that at a functional level, the two arrays have the same content, but under the hood, the strings in slowFileNames have some awfullness that the strings in fastFileNames do not.
Why is this is happening? Can someone suggest how I could poke at the internals of the strings in the debugger or infer in some other way what the difference is?
This is frightening, that merely messing with the NSString API can hurt you this badly, since there are lots of spots we depend on it.