hassila
(Joakim Hassila)
February 1, 2023, 7:55am
35
Thanks for the clarification @dgoldsmith (we missed parse()
somehow, it was the missing piece for us - EDIT: in fact, looking at the pitch, I can't find it, perhaps it can be added to the Tree Walking section as an example? It would be clearer than spotlightQuery
at least for us) - we've discussed the pitch with our team internally and overall we think it would be a great addition and definitely would make heavy use of this, looks really promising.
Our only remaining major concern (which obviously is hard to know from a pitch, but please view this as an open question) would as mentioned be performance (of evaluate()
- specifically the use of keypaths, as even though SE-061 have the following note on performance:
The performance of interacting with a property/subscript via KeyPaths should be close to the cost of calling the property directly.
There are a few references so far to (quite significant) performance issues with the current keypath implementation, e.g.:
opened 01:08PM - 22 Nov 18 UTC
bug
performance
compiler
| | |
|------------------|-----------------|…
|Previous ID | SR-9323 |
|Radar | rdar://problem/52529589 |
|Original Reporter | @weissi |
|Type | Bug |
Attachment: [Download](https://user-images.githubusercontent.com/2727770/164963044-daaa6692-c66d-47a3-b822-a02930633670.gz)
<details>
<summary>Additional Detail from JIRA</summary>
| | |
|------------------|-----------------|
|Votes | 14 |
|Component/s | Compiler |
|Labels | Bug, Performance |
|Assignee | None |
|Priority | Medium |
md5: 8a7995ec006b266bd138d8a19ef4ebed
</details>
**is duplicated by**:
* [SR-11983](https://bugs.swift.org/browse/SR-11983) KeyPath performance below expectation compared to alternative (see test)
**Issue Description:**
## Description
Naively I always assumed KeyPaths are quite fast. In my head they were basically a tuple of two function pointers (a getter and a setter, both not capturing so `@convention(thin)` like) that get just handed around and applied. At least I assumed that would be some sort of fast path when it has enough information at compile-time.
To see how fast/slow keypaths are, I made a quick benchmark which just incremented a struct member (`Int`) 100 M times. To have an idea what the theoretical maximum is, I compared that to a version that doesn't use key paths at all and just does \`thing.x += 1\` in a loop. (I checked the assembly and the compiler does spit out every single increment (for overflow checking) it does however unroll the loop five times). Anyway, the result is:
time taken for 100M direct increments (in s)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.02154 0.02355 0.02358 0.02613 0.02450 0.03828
Now I benchmarked that against KeyPaths
``` java
var thing = SomeStruct()
for _ in 0..<100_000_000 {
thing[keyPath: \SomeStruct.x] += 1
}
```
and the result is:
time taken for 100M keypath increments (in s)
Min. 1st Qu. Median Mean 3rd Qu. Max.
4.691 4.698 4.722 4.738 4.778 4.821
which is 200x the runtime of the original one. I used `Apple Swift version 5.0-dev (LLVM cbe8d5e28f, Clang 3452631569, Swift 201dcba300)`, before that it was even slower.
Then I tried to understand why Keypaths are so slow and I created yet another benchmark which goes through a pretty naive approximation:
``` java
public struct SomeStruct {
public var x: Int = 0
public init(){}
}
public struct FakeWritableKeyPath<Thing, Element> {
public let writeIt: (inout Thing, Element) -> Void
public let readIt: (Thing) -> Element
}
extension SomeStruct {
public static let fakeKeyPathsForX: FakeWritableKeyPath<SomeStruct, Int> =
FakeWritableKeyPath(writeIt: { thing, newValue in thing.x = newValue },
readIt: { thing in return thing.x })
}
```
and the loop was
``` java
for _ in 0..<100_000_000 {
let read = SomeStruct.fakeKeyPathsForX.readIt
let write = SomeStruct.fakeKeyPathsForX.writeIt
write(&thing, read(thing) + 1)
}
```
to my absolute surprise, that yielded better performance ("only" 47x slower):
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.073 1.091 1.103 1.116 1.131 1.217
To finish off, I benchmarked against what I thought would kind of approximate the implementation at least in a fast path (just handing two function pointers around):
``` java
public struct FakeCheatedWritableKeyPath {
public let writeIt: @convention(thin) (inout SomeStruct, Int) -> Void
public let readIt: @convention(thin) (SomeStruct) -> Int
}
extension SomeStruct {
public static let fakeCheatedKeyPathsForX: FakeCheatedWritableKeyPath =
FakeCheatedWritableKeyPath(writeIt: { thing, newValue in thing.x = newValue },
readIt: { thing in return thing.x })
}
```
with the loop just like above. That started to yield reasonable performance
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.2298 0.2329 0.2351 0.2362 0.2401 0.2440
which is only about 10x slower than the direct additions and I think that's reasonable because `INC` is a processor instruction which naturally is a bit faster than 'function call to read the value, increment, function call to write the value'. Also loop unrolling etc...
## Notes
### Compiler
Apple Swift version 5.0-dev (LLVM cbe8d5e28f, Clang 3452631569, Swift 201dcba300)
Target: x86_64-apple-darwin18.2.0
### OS
macOS 10.14 on
Model Identifier: MacBookPro15,1
Processor Name: Intel Core i9
Processor Speed: 2.9 GHz
### Observations
I found {{ %5 = keypath $WritableKeyPath\<SomeStruct, Int\>, (root $SomeStruct; stored_property \#SomeStruct.x : $Int) // users: %20, %8}} in the SIL which looks like the compiler has actually quite some understanding of key paths, so maybe there's hope they will soon be faster? 😉
### Code
the structs/fake key paths were defined in a module `Foo` and all the calls were always from another module `TestApp` in order not to get any inlining effects. But even with everything in one module, the slow versions didn't get faster at all.
the whole code is attached (the .tar.gz and can be run with just `swift run -c release`), but is also here (note that everything below `// MODULE: Foo` is in another module.
``` java
import Foundation
import Foo
public func measure(_ fn: () throws -> Int) rethrows -> [TimeInterval] {
func measureOne(_ fn: () throws -> Int) rethrows -> (TimeInterval, Int) {
let start = Date()
let v = try fn()
let end = Date()
return (end.timeIntervalSince(start), v)
}
let firstRes = try measureOne(fn).1 /* pre-heat and throw away */
var measurements = Array(repeating: 0.0, count: 10)
for i in 0..<10 {
let timeAndRes = try measureOne(fn)
measurements[i] = timeAndRes.0
precondition(firstRes == timeAndRes.1)
}
print(firstRes)
return measurements
}
public func measureAndPrint(desc: String, fn: () throws -> Int) rethrows -> Void {
print("measuring: \(desc): ", terminator: "")
let measurements = try measure(fn)
print(measurements.reduce("") { $0 + "\($1), " })
}
measureAndPrint(desc: "direct") {
var thing = SomeStruct()
for _ in 0..<100_000_000 {
thing.x += 1
}
return thing.x
}
measureAndPrint(desc: "fake key paths") {
var thing = SomeStruct()
for _ in 0..<100_000_000 {
let read = SomeStruct.fakeKeyPathsForX.readIt
let write = SomeStruct.fakeKeyPathsForX.writeIt
write(&thing, read(thing) + 1)
}
return thing.x
}
measureAndPrint(desc: "totally cheated fake key paths") {
var thing = SomeStruct()
for _ in 0..<100_000_000 {
let read = SomeStruct.fakeCheatedKeyPathsForX.readIt
let write = SomeStruct.fakeCheatedKeyPathsForX.writeIt
write(&thing, read(thing) + 1)
}
return thing.x
}
measureAndPrint(desc: "real key paths") {
var thing = SomeStruct()
for _ in 0..<100_000_000 {
thing[keyPath: \SomeStruct.x] += 1
}
return thing.x
}
// MODULE: Foo
public struct SomeStruct {
public var x: Int = 0
public init(){}
}
// compiler generated
// fake
public struct FakeWritableKeyPath<Thing, Element> {
public let writeIt: (inout Thing, Element) -> Void
public let readIt: (Thing) -> Element
}
extension SomeStruct {
public static let fakeKeyPathsForX: FakeWritableKeyPath<SomeStruct, Int> =
FakeWritableKeyPath(writeIt: { thing, newValue in thing.x = newValue },
readIt: { thing in return thing.x })
}
// cheat
public struct FakeCheatedWritableKeyPath {
public let writeIt: @convention(thin) (inout SomeStruct, Int) -> Void
public let readIt: @convention(thin) (SomeStruct) -> Int
}
extension SomeStruct {
public static let fakeCheatedKeyPathsForX: FakeCheatedWritableKeyPath =
FakeCheatedWritableKeyPath(writeIt: { thing, newValue in thing.x = newValue },
readIt: { thing in return thing.x })
}
```
and
Questions
Is there an expectation that keypaths significantly slower than functions or straight access with a closure?
If so, is there anyway to mitigate these discrepancies while still using keypaths?
Context
I'm noticing anywhere between an increased factor of 2-10+ when using keypaths as functions or using dynamic member lookup.
Program 1: Key paths as functions
import AppKit
func arrayClosureTest(_ size: UInt) {
let xs = (0 ... size).map(String.init)
let t0 = CACurrentMediaTi…
I understand the optimization of keypath handling is handled by a different set of priorities, I just wanted to point it out if there is anything that can be done to minimize that possible impact.
I guess it's only tangential to the pitch, but wanted to at least call it out as performance of evaluate()
is critical to the usability of Predicates (at least for us) - and the pitch overall really looks great and we'd be super happy to use it (as long as performance is ok).
So, anyway - big +1 for the pitch overall.
3 Likes