i’ve noticed some pretty strange performance behavior with the @inline(_:) attributes. namely, i have a function which accounts for a majority of the benchmark run time, with the following signature:
public func unpack<Color>(as _:Color.Type) -> [Color] where Color:PNG.Color
this function is just a wrapper function that calls a non-generic member function required by the
i have a copy of the benchmarking code inside the module (PNG) that this function lives in, and a copy in a separate target which calls this function from across the module boundary.
as expected, the benchmarks run much, much slower when they call this function from outside the PNG module:
from outside PNG module: 67.768 ms
from inside PNG module: 3.968 ms
so i tried to fix this by adding an @inlinable attribute:
@inlinable public func unpack<Color>(as _:Color.Type) -> [Color] where Color:PNG.Color
but that didn’t help at all:
from outside PNG module: 71.931 ms
from inside PNG module: 3.960 ms
so i tried to isolate the issue by adding an @inline(never) attribute on top of the @inlinable attribute:
@inlinable @inline(never) public func unpack<Color>(as _:Color.Type) -> [Color] where Color:PNG.Color
incredibly, this solved the issue:
from outside PNG module: 4.114 ms
from inside PNG module: 4.241 ms
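for anyone who wants to poke at this, here is a minimal single-file sketch of the shape of the code. the names PixelColor, Gray, Image and convert(_:) are hypothetical stand-ins, not the real PNG API, and the attributes only matter once the types live in a separate module from the caller:

```swift
// Hypothetical stand-ins for the real PNG types; not the library's actual API.
public protocol PixelColor {
    // Non-generic requirement that does the actual per-sample work.
    static func convert(_ samples: [UInt8]) -> [Self]
}

public struct Gray: PixelColor, Equatable {
    public var value: UInt8

    public init(value: UInt8) {
        self.value = value
    }

    public static func convert(_ samples: [UInt8]) -> [Gray] {
        return samples.map { Gray(value: $0) }
    }
}

public struct Image {
    public var samples: [UInt8]

    public init(samples: [UInt8]) {
        self.samples = samples
    }

    // @inlinable exposes the body to client modules, so they can specialize it.
    // @inline(never) asks the optimizer not to inline the call into the caller.
    @inlinable @inline(never)
    public func unpack<Color>(as _: Color.Type) -> [Color] where Color: PixelColor {
        return Color.convert(self.samples)
    }
}

// Stand-in for the benchmark's call site.
let image = Image(samples: [0, 127, 255])
let pixels = image.unpack(as: Gray.self)
```

to reproduce the cross-module numbers, the types would have to go in a library target and the last two lines in a separate executable target.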
now i’m wondering why @inline(never) even has such a drastic impact on performance. if the function body and specializations are already available to the compiler, shouldn’t the compiler be able to make the decision to inline (into the benchmarking code) on its own?
i suspect the compiler’s default behavior is to inline the body of the unpack(as:) function (which contains another generic function call) into the benchmarking code, which undoes the effect of the @inlinable attribute, since it just turns back into another generic call across a module boundary. if that’s the case, why would the compiler inline the body without replacing the inner generic call with a specialized function call?
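to make that guess concrete, here is a rough sketch of the situation i have in mind. this only illustrates the hypothesis and is not a confirmed explanation; Sample, Level, Decoder and decode(as:) are made-up names:

```swift
// Hypothetical names throughout; this only illustrates the suspected scenario.
public protocol Sample {
    init(_ byte: UInt8)
}

public struct Level: Sample, Equatable {
    public var raw: UInt8

    public init(_ byte: UInt8) {
        self.raw = byte
    }
}

public struct Decoder {
    public var bytes: [UInt8]

    public init(bytes: [UInt8]) {
        self.bytes = bytes
    }

    // Not @inlinable: another module sees only this signature, so by default
    // it cannot generate its own specialization of the body.
    public func decode<T>(as _: T.Type) -> [T] where T: Sample {
        return self.bytes.map { T($0) }
    }

    // @inlinable wrapper: if its body is inlined into a client module, what
    // is left at the call site is the inner generic call to decode(as:).
    @inlinable
    public func unpack<T>(as type: T.Type) -> [T] where T: Sample {
        return self.decode(as: type)
    }
}

let decoded = Decoder(bytes: [0, 9]).unpack(as: Level.self)
```

in a single file everything is visible to the optimizer anyway, so this compiles and runs but won’t show the slowdown by itself.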