I've found a workaround. It's super ugly...
@inlinable @inline(__always)
internal var schemeKind: Optional<WebURL.SchemeKind> {
withUnsafeBytes(of: self) {
$0.load(fromByteOffset: MemoryLayout.offset(of: \Self._schemeKind)!, as: Optional<WebURL.SchemeKind>.self)
}
}
Basically I renamed the stored properties and wrote my own accessors for them. I did this for like 3 or 4 stored properties, all optionals. I really, really wish I didn't have to do this, but now my benchmarks are showing the best results I have ever seen from this project, with common cases improving by 17-24%(!!!).
Summary
benchmark column results/path_9 results/path_10 %
--------------------------------------------------------------------------------------------------------
Constructor.SpecialNonFile.AverageURLs std 26.08 63.28 -142.65
Constructor.SpecialNonFile.AverageURLs warmup 3829983.00 3466338.00 9.49
Constructor.SpecialNonFile.AverageURLs iterations 37796.00 44360.00 -17.37
Constructor.SpecialNonFile.AverageURLs time 33922.00 28055.00 17.30
Constructor.SpecialNonFile.AverageURLs filtered std 35.28 33.86 4.03
Constructor.SpecialNonFile.AverageURLs filtered warmup 4742413.00 5451983.00 -14.96
Constructor.SpecialNonFile.AverageURLs filtered iterations 28011.00 28872.00 -3.07
Constructor.SpecialNonFile.AverageURLs filtered time 45356.00 43094.50 4.99
Constructor.SpecialNonFile.IPv4 host std 35.11 34.60 1.45
Constructor.SpecialNonFile.IPv4 host warmup 3577949.00 2827990.00 20.96
Constructor.SpecialNonFile.IPv4 host iterations 36648.00 42892.00 -17.04
Constructor.SpecialNonFile.IPv4 host time 35112.00 27630.00 21.31
Constructor.SpecialNonFile.IPv4 host filtered std 50.86 36.97 27.31
Constructor.SpecialNonFile.IPv4 host filtered warmup 4221503.00 3779503.00 10.47
Constructor.SpecialNonFile.IPv4 host filtered iterations 32093.00 33531.00 -4.48
Constructor.SpecialNonFile.IPv4 host filtered time 39854.00 36601.00 8.16
Constructor.SpecialNonFile.IPv6 host std 32.64 37.21 -14.00
Constructor.SpecialNonFile.IPv6 host warmup 4345729.00 3213953.00 26.04
Constructor.SpecialNonFile.IPv6 host iterations 32983.00 44068.00 -33.61
Constructor.SpecialNonFile.IPv6 host time 37398.00 28309.50 24.30
Constructor.SpecialNonFile.IPv6 host filtered std 24.86 38.78 -56.02
Constructor.SpecialNonFile.IPv6 host filtered warmup 5475037.00 3436541.00 37.23
Constructor.SpecialNonFile.IPv6 host filtered iterations 30296.00 37909.00 -25.13
Constructor.SpecialNonFile.IPv6 host filtered time 41760.00 34307.00 17.85
Constructor.SpecialNonFile.Percent-encoding components std 37.57 32.74 12.86
Constructor.SpecialNonFile.Percent-encoding components warmup 1278889.00 1141999.00 10.70
Constructor.SpecialNonFile.Percent-encoding components iterations 104477.00 119079.00 -13.98
Constructor.SpecialNonFile.Percent-encoding components time 12318.00 11108.00 9.82
Constructor.SpecialNonFile.Percent-encoded hostnames std 104.19 37.64 63.87
Constructor.SpecialNonFile.Percent-encoded hostnames warmup 1214436.00 1039024.00 14.44
Constructor.SpecialNonFile.Percent-encoded hostnames iterations 109287.00 131662.00 -20.47
Constructor.SpecialNonFile.Percent-encoded hostnames time 11629.00 9915.00 14.74
Constructor.SpecialNonFile.Long paths std 24.11 25.35 -5.14
Constructor.SpecialNonFile.Long paths warmup 10246773.00 9702339.00 5.31
Constructor.SpecialNonFile.Long paths iterations 12921.00 13470.00 -4.25
Constructor.SpecialNonFile.Long paths time 94723.00 93464.50 1.33
Constructor.SpecialNonFile.Complex paths 1 std 47.32 28.06 40.70
Constructor.SpecialNonFile.Complex paths 1 warmup 2991180.00 2786803.00 6.83
Constructor.SpecialNonFile.Complex paths 1 iterations 47141.00 48935.00 -3.81
Constructor.SpecialNonFile.Complex paths 1 time 26821.00 26165.00 2.45
Constructor.SpecialNonFile.Complex paths 2 std 35.28 26.18 25.79
Constructor.SpecialNonFile.Complex paths 2 warmup 3751616.00 3253327.00 13.28
Constructor.SpecialNonFile.Complex paths 2 iterations 37345.00 41286.00 -10.55
Constructor.SpecialNonFile.Complex paths 2 time 32966.00 32142.00 2.50
Constructor.SpecialNonFile.Long query 1 std 194.64 70.05 64.01
Constructor.SpecialNonFile.Long query 1 warmup 362005.00 328641.00 9.22
Constructor.SpecialNonFile.Long query 1 iterations 407376.00 476298.00 -16.92
Constructor.SpecialNonFile.Long query 1 time 3089.00 2738.00 11.36
Constructor.SpecialNonFile.Long query 2 std 34.20 30.88 9.71
Constructor.SpecialNonFile.Long query 2 warmup 2504260.00 2733403.00 -9.15
Constructor.SpecialNonFile.Long query 2 iterations 52335.00 54263.00 -3.68
Constructor.SpecialNonFile.Long query 2 time 24316.00 23348.00 3.98
--------------------------------------------------------------------------------------------------------
std 14.02
warmup 11.72
iterations -13.05
time 11.09
Wow. I'm ecstatic. I can't begin to describe the countless hours I've spent, trying to figure out which algorithmic changes I could make, so I could bring these benchmarks anywhere close to this level. In particular the first one ("AverageURLs") - it's the most important, the most representative of real URLs on the web, and I'd set myself the goal of getting it under 30ms. But no matter how much I tried, I just couldn't get it reliably under that level.
With one change, it has steam-rolled past that. As you can imagine, I've ran them all again multiple times to confirm. The best I've seen is 27.6, and the worst is 28.8. It's an arbitrary benchmark and there's random variance and all that, but still - it's huge. Even now I can't really believe I've seen those numbers.
We need this fixed, badly. I hate to think how many other developers are out there, wondering why their code isn't meeting their performance targets, spending who-knows-how-much time trying to think of clever tricks or novel algorithms, when there are things like this lurking in the compiler.