I have a dictionary containing many (mostly-ASCII) file paths as keys. Repeat accesses to the dictionary are fairly slow, with a lot of time spent hashing and comparing these strings. More precisely, string comparisons hit _StringGutsSlice._slowCompare and string hashing hits specialized _StringGutsSlice._foreignWithNormalizedCodeUnitsImpl(outputBuffer:icuInputBuffer:icuOutputBuffer:_:).
The source code indicates that there is a fast path, which requires the object to be in normal form C and to provide "fast UTF8". I have tried enforcing this by using the .precomposedStringWithCanonicalMapping variant of my Strings before sending them to the dictionary, but that didn't lead to a speedup.
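For reference, the normalization attempt looks roughly like the following. This is only a minimal sketch: rawPaths and table are placeholder names, and the paths are made up for illustration.

```swift
import Foundation

// Placeholder input; the second path contains a decomposed "é" (e + U+0301).
let rawPaths = ["/usr/local/bin/tool", "/tmp/cafe\u{301}/notes.txt"]

// Convert every key to normal form C before it goes into the dictionary.
let normalizedPaths = rawPaths.map { $0.precomposedStringWithCanonicalMapping }

// Build the dictionary from the normalized keys (values are arbitrary here).
var table = [String: Int]()
for (index, path) in normalizedPaths.enumerated() {
    table[path] = index
}

// Lookups then also use the normalized keys.
let found = table[normalizedPaths[1]]
```

The lookups are correct either way, since String equality already treats canonically equivalent strings as equal; the only thing I hoped to gain from this step was the faster code path.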
Looking at the documentation, it actually seems like the fast path would only ever be hit by "immortal" strings, i.e. short or literal strings. Is that really the case? Is there any way to "prepare" my Strings such that they will hit faster code paths for comparison and hashing?
I have even considered wrapping my keys in something like

    struct StringWithHash: Hashable {
        let value: String
        let hash: Int

        init(_ value: String) {
            self.value = value
            self.hash = value.hashValue
        }

        // Compare only the cached hashes, ignore `value`.
        static func == (lhs: StringWithHash, rhs: StringWithHash) -> Bool {
            return lhs.hash == rhs.hash
        }

        // Only feed the cached hash into the Hasher, ignore `value`.
        func hash(into hasher: inout Hasher) {
            hasher.combine(self.hash)
        }
    }

but would rather avoid that if possible, especially because there could theoretically be hash collisions. Pre-computing the hash would work in this special case because the keys are prepared in an earlier step and then re-used over and over again, i.e. I perform many dictionary lookups using a comparatively small set of keys that I can prepare in advance.
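A variant that keeps the cached hash but still falls back to comparing the strings would presumably avoid the collision problem, at the cost of one full string comparison whenever the hashes match. The following is a minimal sketch of that idea, not something I have benchmarked; CachedHashKey, cachedHash and the example paths are hypothetical names chosen for illustration.

```swift
// Wrapper that hashes the string once, up front, and reuses the result.
// Assumption: keys live only in memory within one process, since
// String.hashValue is not stable across program runs.
struct CachedHashKey: Hashable {
    let value: String
    let cachedHash: Int

    init(_ value: String) {
        self.value = value
        self.cachedHash = value.hashValue
    }

    static func == (lhs: CachedHashKey, rhs: CachedHashKey) -> Bool {
        // Cheap integer check first; the string comparison only runs
        // when the cached hashes agree, which rules out false matches.
        return lhs.cachedHash == rhs.cachedHash && lhs.value == rhs.value
    }

    func hash(into hasher: inout Hasher) {
        // Only the precomputed integer is hashed, not the string itself.
        hasher.combine(cachedHash)
    }
}

// Keys are prepared once and then reused for many lookups.
let preparedKeys = ["/usr/bin/swift", "/usr/lib/libc.dylib"].map { CachedHashKey($0) }

var sizes = [CachedHashKey: Int]()
sizes[preparedKeys[0]] = 1
sizes[preparedKeys[1]] = 2

let hit = sizes[preparedKeys[0]]
let miss = sizes[CachedHashKey("/no/such/file")]
```

This removes the repeated hashing, but every successful lookup still pays for one string comparison, so it only addresses half of the problem.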
(The platform used is macOS 10.13 with Swift 4.2 compiled by the Swift 5 compiler.)