String performance: How to ensure fast path for comparison & hashing?

I have a dictionary containing many (mostly-ASCII) file paths as keys. Repeat accesses to the dictionary are fairly slow, with a lot of time spent hashing and comparing these strings. More precisely, string comparisons hit _StringGutsSlice._slowCompare and string hashing hits specialized _StringGutsSlice._foreignWithNormalizedCodeUnitsImpl(outputBuffer:icuInputBuffer:icuOutputBuffer:_:).

The source code indicates that there is a fast path, which requires the object to be in normal form C and to provide "fast UTF8". I have tried enforcing this by using the .precomposedStringWithCanonicalMapping variant of my Strings before sending them to the dictionary, but that didn't lead to a speedup.

Looking at the documentation, it actually seems like the fast path would only ever be hit by "immortal" strings, i.e. short or literal strings. Is that really the case? Is there any way to "prepare" my Strings such that they will hit faster code paths for comparison and hashing?

I have even considered wrapping my keys in something like

struct StringWithHash: Hashable {
  let value: String
  let hash: Int

  init(_ value: String) {
    self.value = value
    self.hash = value.hashValue
  }

  static function == // compare only hashes, ignore `value`
  func hash(into: inout Hasher) // only feed the hash into the Hasher, ignore `value`
}

but would rather avoid that if possible, especially because there could theoretically be hash collisions. Pre-computing the key would work in this special case because the keys are prepared and re-used over and over again in a previous step, i.e. I perform many dictionary lookups using a comparatively small set of keys that I can prepare in advance.

(The platform used is macOS 10.13 with Swift 4.2 compiled by the Swift 5 compiler.)

I wouldn’t do this. What if there’s a hash collision?

That's exactly why I said I'd rather not do it, either...

Would you be able to extract a representative benchmark from this? We'll get you on the fast-paths, but we'd also like to track and improve the slow-path performance as well. There's some (eventually unnecessary) overhead in the machinery here for interfacing with ICU, and we're always looking for more reasons to prioritize Stop using ICU for normalization.

Alternatively or additionally, do you have an instruments trace you can share? bugs.swift.org

The _foreign part means that you have a lazily-bridged NSString that is incapable of providing a pointer+length to UTF-8 contents. In contrast, other strings are fastUTF8 meaning that they can provide such access (small strings spill to the stack to do so).

This specific terminology arose during the implementation, burned into Swift's ABI, albeit in a way that's not exposed to most users. Piercing the String Veil has more details on the internal structure of a String, settling on the terms "opaque" vs "contiguous" instead of "foreign" vs "fastUTF8". SE-0247 Contiguous Strings in 5.1 added API giving you control over this.

As an experiment, try adding in a makeContiguousUTF8() mutating call to your strings. That forces opaque strings to be bridged into a native form, which also captures information such as whether the string is all ASCII (implying NFC).

If you can't use a 5.1 toolchain, you can hack fake it like this, which is a little less optimal than what 5.1 has.

That likely results in a string being bridged to Objective-C and then lazily imported back to Swift, and during the process the information about it being in NFC is lost.

Exposing normalization operations and queries is future work. It's a combination of new Unicode API mentioned in Unicode Enthusiasts Unite, getting off of ICU for normalization Stop using ICU for normalization, and exposing perf-flag operations mentioned further in Piercing the String Veil.

Posts, performance data, benchmarks, and bug reports help us prioritize this higher :-)

No, lifetime is an orthogonal concern and you're looking at the wrong bit. This is the bit for whether it's fast.

The makeContiguousUTF8() API is the best we have for now (keeps you off the really slow paths), until we add the normalize() API as well.

There's also a fair amount of bridging performance improvements in Swift 5.1 that may help some of your related code. CC @David_Smith.

2 Likes

Of particular note, the OS version matters as well as the Swift version: I fixed many of the cases on the Objective-C side that lead to NSStrings that don’t support the fast-ish paths in the current beta versions of the OSs.

1 Like

Thank you for the elaborate response!

Here's a trivial benchmark, but it might be hitting slightly different code paths because keys6 takes exactly as long as keys5. In any case, it seems like the bottleneck is this case is the inability to skip ICU normalization rather than the string being "foreign". In my production code I am using methods like (string as NSString).removingLastPathComponent, which could cause the "foreign" storage. Do you think using the corresponding methods on URL could help, or would those implicitly bridge to NSString as well?


extension String {
	mutating func makeNative() { self += "" }
}

		let keyRange = 0..<1000
		let keys1 = keyRange.map { "\($0)" }  // 0.1297362699988298 s
		let keys2 = keyRange.map { "ä\($0)" }  // 1.216462821001187 s
		let keys3 = keyRange.map { String(repeating: "abc", count: 7) + "\($0)" }  // 0.19855447000009008 s
		let keys4 = keyRange.map { String(repeating: "äabc", count: 7) + "\($0)" }  // 6.444108501003939 s
		let keys5 = keyRange.map { String(repeating: "äabc", count: 7) + "\($0)" }  // 6.4144067209999776 s
			.map { $0.precomposedStringWithCanonicalMapping }
		let keys6 = keys5.map { (key: String) -> String in  // same performance as keys5
			var copy = key
			copy.makeNative()
			return copy
		}
		var dict: [String: Int] = [:]
		dict.reserveCapacity(keyRange.count)
		
		let start = CACurrentMediaTime()
		for i in 0..<2_000 {
			for key in keys5 {  // Replace this with keys1...keys5
				dict[key] = i
			}
		}
		print("done in \(CACurrentMediaTime() - start) s")

Unfortunately, my users are on 10.11 till 10.14. My understanding is that I can't use these methods there because they are part of the Swift 5.1 stdlib? It would be painful to not have a workaround for all my customers on 10.11 till 10.14.

I will try tomorrow if the makeNative() hack makes a difference in my production code — sadly it doesn't for the benchmark above.

Ouch! is there any way to retain the "NFC" status? If the status would not be lost, am I at least using the correct method? :sweat_smile: Would it make a difference if I combined this with the makeNative() hack? (Reading your reply, it sounds like a normalize method that would be "stronger" than precomposedStringWithCanonicalMapping would still be needed, though.)

Alternatively, if A == B holds if and only if A.utf8.precomposedStringWithCanonicalMapping == B.utf8.precomposedStringWithCanonicalMapping, I could export the precomposedStringWithCanonicalMapping.utf8 variants to Data objects and compare these, i.e. a poor man's way of enforcing the knowledge of the string being canonical during the comparison. That would be a pretty terrible workaround and double memory usage (because I need to store a copy of the string in a Data structure), though.

Sorry, I indeed misread that code :sweat_smile: That's good news, though.

Ah, this explains it. That method creates an NSPathStore2, a private subclass of NSString for dealing with path manipulation. In that version of the OS, NSPathStore2 was missing a particular fast bulk accessor that Swift String uses when available.

Is there a way to enforce getting a real String from that? I've tried

path += ""

But that didn't seem to make a difference. Same for telling the system that a particular string is normalized?

Also, does URL use the same methods under the hood, or could that make a difference?

Hm. I would expect appending an empty String to do the trick. I don't recall what the URL implementation looks like.

I expect we're probably around the limits of what we can guess at, and need to gather more data to get further.