Here's the setup: I have a basic `Pack` type that holds a parameter pack of arbitrary values. I also have a `Packs` type that holds a parameter pack of other `Pack`s. Everything is static:
```swift
protocol ValueHolding {
    static func forEachValueTypeSize(body: (Int) -> Void)
    static func forEachOptionalValueTypeSize(body: (Int) -> Void)
}

struct Pack<each Value>: ValueHolding {
    static func size<T>(of type: T.Type) -> Int {
        MemoryLayout<T>.size
    }

    static func forEachValueTypeSize(body: (Int) -> Void) {
        for valueType in repeat (each Value).self {
            body(size(of: valueType))
        }
    }

    static func forEachOptionalValueTypeSize(body: (Int) -> Void) {
        for optionalValueType in repeat Optional<each Value>.self {
            body(size(of: optionalValueType))
        }
    }
}

struct Packs<each Value: ValueHolding>: ValueHolding {
    static func forEachValueTypeSize(body: (Int) -> Void) {
        for valueType in repeat (each Value).self {
            valueType.forEachValueTypeSize(body: body)
        }
    }

    static func forEachOptionalValueTypeSize(body: (Int) -> Void) {
        for valueType in repeat (each Value).self {
            valueType.forEachOptionalValueTypeSize(body: body)
        }
    }
}
```
Generally speaking, there seem to be two factors that influence the performance of these loops:

- The bodies of those `forEach` functions: some of them loop over the pack types as-is, and others wrap them in an `Optional`.
- The more nested a `Pack` is in generics, the worse the loop's performance is.
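As a concrete illustration of the first factor: `Optional<T>` is a distinct generic type with its own metadata and, often, its own layout, so `Optional<each Value>.self` can't simply reuse the metadata already on hand for `each Value`:

```swift
// Optional<T> is a separate generic instantiation with its own metadata
// and, often, its own layout. The sizes noted below are what I'd expect
// on 64-bit Apple platforms.
print(MemoryLayout<Int>.size)              // 8
print(MemoryLayout<Optional<Int>>.size)    // 9: payload plus a tag byte
print(MemoryLayout<Optional<String>>.size) // 16: the tag hides in String's spare bits
```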
Some test code and benchmarks from my M3 MacBook Pro:
```swift
import os

let log = OSLog(subsystem: "com.example.packs", category: .pointsOfInterest)
let signposter = OSSignposter(logHandle: log)

func profile(_ name: StaticString, body: () -> Void) {
    let staticSignpostID = signposter.makeSignpostID()
    let state = signposter.beginInterval(name, id: staticSignpostID)
    for _ in 0..<500_000 {
        body()
    }
    signposter.endInterval(name, state)
}

var totalSize = 0

// Fast, but slower than it should be: 324.33 µs
profile("Pack.forEachOptionalValueTypeSize x 2") {
    totalSize = 0
    let pack = Pack<Int, String, Int, String>.self
    pack.forEachOptionalValueTypeSize {
        totalSize += $0
    }
    pack.forEachOptionalValueTypeSize {
        totalSize += $0
    }
}

// Extremely slow, relatively: 128.44 ms
profile("Packs.forEachOptionalValueTypeSize") {
    totalSize = 0
    let packs = Packs<
        Pack<Int, String, Int, String>,
        Pack<Int, String, Int, String>
    >.self
    packs.forEachOptionalValueTypeSize {
        totalSize += $0
    }
}

// Fast! 11.29 µs
profile("Pack.forEachValueTypeSize x 2") {
    totalSize = 0
    let pack = Pack<Int, String, Int, String>.self
    pack.forEachValueTypeSize {
        totalSize += $0
    }
    pack.forEachValueTypeSize {
        totalSize += $0
    }
}

// Pretty fast, but still slower than it should be: 8.52 ms
profile("Packs.forEachValueTypeSize") {
    totalSize = 0
    let packs = Packs<
        Pack<Int, String, Int, String>,
        Pack<Int, String, Int, String>
    >.self
    packs.forEachValueTypeSize {
        totalSize += $0
    }
}

print(totalSize)
```
Ranked, fastest to slowest:

| Test | Duration |
| --- | --- |
| `Pack.forEachValueTypeSize` x 2 | 11.29 µs |
| `Pack.forEachOptionalValueTypeSize` x 2 | 324.33 µs |
| `Packs.forEachValueTypeSize` | 8.52 ms |
| `Packs.forEachOptionalValueTypeSize` | 128.44 ms |
Looping over the unmodified types is fastest, but generally the "further" we get from the pack, the slower the loop gets: relative to the fastest case, the `Optional`-wrapped flat pack is roughly 29x slower, the nested pack roughly 750x slower, and the combination of both roughly 11,000x slower.

Looking at the profiles, it seems that at a certain point Swift stops being able to fully statically generate and specialize the types and calls involved, and starts leaning on runtime lookups. This gets pretty terrible when you combine nesting of parameter-pack types with wrapping of the original element types.
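For comparison, the workaround I'd reach for (a sketch, assuming the type list is fixed per call site; the `optionalSizes(of:)` helper is hypothetical, not part of the code above) is to expand the pack into a plain `[Int]` once, so the hot loop never touches pack iteration or `Optional` metadata:

```swift
// Sketch: expand the pack into a plain array once, outside the hot loop.
// The hypothetical optionalSizes(of:) helper pays the pack-iteration and
// Optional-metadata cost a single time; the benchmark loop then only
// reads the array.
func optionalSizes<each Value>(of _: (repeat each Value).Type) -> [Int] {
    func size<T>(of _: T.Type) -> Int { MemoryLayout<T>.size }
    var result: [Int] = []
    for optionalValueType in repeat Optional<each Value>.self {
        result.append(size(of: optionalValueType))
    }
    return result
}

let cachedSizes = optionalSizes(of: (Int, String, Int, String).self)
var cachedTotal = 0
for _ in 0..<500_000 {
    for s in cachedSizes { cachedTotal += s } // no pack iteration, no metadata requests
}
```

This only dodges the symptom, of course: the pack expansion itself still costs the same the one time it runs.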
Anyway, I consider this a bug (or at least an opportunity for improvement), so I plan to file one, but I'm curious in a more general sense, to refine my mental model: is there any principle that should cause this code to progressively slow down the way it does?