I was working on a function and realized it would be more readable if I moved some repeated code into a local helper function. I expected the helper function to be inlined and optimized away, but testing showed that it made things 10% slower.
Here is a simplified example, where foo and bar do the same thing but bar uses a local helper function and foo does not:
func foo() -> [Int] {
    var x = Array(repeating: 0, count: 4)
    for _ in 0 ..< 100 {
        let i = Int.random(in: x.indices)
        x[i] += 1
    }
    return x
}

func bar() -> [Int] {
    var x = Array(repeating: 0, count: 4)
    // Local helper that captures and mutates `x`
    func update(_ i: Int) { x[i] += 1 }
    for _ in 0 ..< 100 {
        let i = Int.random(in: x.indices)
        update(i)
    }
    return x
}
Looking on Godbolt, we see that compiling with -O optimization produces 93 lines of assembly for foo, and 230 lines for bar. That’s more than double the assembly code for something that I thought would be transparent to the compiler.
Is this a known issue, and is there any way to fix it?
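For anyone who wants to reproduce the timing difference rather than just read the assembly, here is a rough sketch of a timing harness. The `measure` helper and its parameters are hypothetical (they are not from the post above), and the measured difference will depend on the compiler version and flags, so treat any numbers it produces as indicative only.

```swift
import Dispatch

// Hypothetical helper, not from the original post: calls `body` `iterations`
// times and returns the elapsed wall-clock time plus a checksum of the results.
// Returning the checksum keeps the optimizer from discarding the calls.
func measure(iterations: Int, _ body: () -> [Int]) -> (seconds: Double, checksum: Int) {
    var checksum = 0
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0 ..< iterations {
        // Wrapping add so a very long run cannot trap on overflow
        checksum &+= body().reduce(0, +)
    }
    let end = DispatchTime.now().uptimeNanoseconds
    return (Double(end - start) / 1e9, checksum)
}
```

Assuming foo and bar are defined as above, one would compile with -O and compare something like `measure(iterations: 1_000_000) { foo() }` against the same call with `bar()`. Both checksums should equal 100 times the iteration count, since each call performs exactly 100 increments.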
Yeah, I was just about to post this. I agree with @Nevin, though: it makes sense that the compiler should be able to inline update() into bar in a way that's identical to the version without the helper function.
On another note: one thing I notice is that even though the inout version generates assembly basically identical to the non-inout version, the compiler doesn't merge and reuse those implementations.
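The inout version isn't reproduced in this thread, so here is a minimal sketch of what it presumably looks like, assuming the helper takes the array as an `inout` parameter instead of capturing it. The name `baz` and the parameter label-free signature are my own placeholders, not the original poster's code.

```swift
// Hypothetical reconstruction: same logic as bar, but the local helper
// receives the array via `inout` rather than capturing `x`.
func baz() -> [Int] {
    var x = Array(repeating: 0, count: 4)
    // No captures here: everything the helper touches is passed in explicitly
    func update(_ a: inout [Int], _ i: Int) { a[i] += 1 }
    for _ in 0 ..< 100 {
        // The index is computed before the call, so there is no
        // overlapping access to `x` while it is passed inout
        let i = Int.random(in: x.indices)
        update(&x, i)
    }
    return x
}
```

Because the helper no longer captures anything, there is no closure context for the optimizer to see through, which may be why this form ends up matching the version without a helper.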
Yeah, the compiler doesn't seem to inline non-escaping closures with captures very well.
Yeah, the compiled code is exactly the same, so it's really odd that it isn't deduplicated. I would have expected one of them to just jmp to the other if both are defined.