around 4 or 5 years ago i discovered that underspecialization was responsible for about a 10x slowdown in swift-png, because its pixel types are generic over `FixedWidthInteger & UnsignedInteger` to handle varying color depth.
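to make the shape of the problem concrete, here is a minimal sketch of a pixel type that is generic over its component type. the names `RGBA` and `inverted` are made up for illustration and are not swift-png's actual declarations.

```swift
// hypothetical pixel type, generic over the channel depth (UInt8, UInt16, ...)
public struct RGBA<T> where T: FixedWidthInteger & UnsignedInteger {
    public var r: T, g: T, b: T, a: T

    public init(r: T, g: T, b: T, a: T) {
        self.r = r
        self.g = g
        self.b = b
        self.a = a
    }

    /// inverts the color channels and leaves alpha untouched.
    /// if this is compiled without a specialization for `T`, `T.max` and `-`
    /// are looked up through type metadata and witness tables at run time.
    /// specialized for `UInt8`, each channel is a single integer subtraction.
    public var inverted: RGBA<T> {
        RGBA(r: T.max - r, g: T.max - g, b: T.max - b, a: a)
    }
}
```

the code is the same either way. what changes is whether the optimizer gets to see a concrete `T` at the call site.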
about a year ago i discovered a very similar issue in swift-json, because its parser is generic over `RandomAccessCollection<UInt8>`.
you can also diagnose this issue directly with profiling tools like perf, by looking for clumps of "type metadata"-related samples.
but more often i discover it because i add a public convenience API somewhere else that uses some concrete type like `[UInt8]`, and all of a sudden the generic benchmarks run 10x faster. the compiler generates a specialization for the convenience API to use, and the benchmarks all start calling the specialized function instead of the unspecialized one.
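a rough sketch of that pattern, assuming a made-up `checksum` routine (neither function is taken from swift-json or swift-png): the generic entry point accepts any random-access collection of bytes, and the convenience overload pins the type to `[UInt8]`.

```swift
// hypothetical generic entry point. the constraint is written as a `where`
// clause here, which means the same thing as `RandomAccessCollection<UInt8>`.
public func checksum<Bytes>(_ bytes: Bytes) -> UInt32
    where Bytes: RandomAccessCollection, Bytes.Element == UInt8
{
    var sum: UInt32 = 0
    for byte in bytes {
        // wrapping arithmetic, so large inputs cannot trap on overflow
        sum = sum &* 31 &+ UInt32(byte)
    }
    return sum
}

// hypothetical convenience API on a concrete type. calling the generic
// function with a statically known `[UInt8]` gives the optimizer a reason
// to emit a `[UInt8]` specialization of `checksum(_:)` in this module.
public func checksum(bytes: [UInt8]) -> UInt32 {
    checksum(bytes)
}
```

whether other callers end up sharing that specialization depends on the optimizer and the build settings, so treat a speedup like this as something to confirm with a benchmark rather than something to rely on.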
as a side note, even `@inlinable` isn't enough sometimes to fix this problem. sometimes even the mere presence of generics from outer scopes can cause major performance changes when nesting and unnesting types from namespaces, see "Un-nesting a type from an enum namespace results in a 6x slowdown".
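a small sketch of what "generics from outer scopes" means, using hypothetical names (`Decoder`, `Cursor`, `FlatCursor`). it only illustrates how a nested type picks up generic context, and does not reproduce the slowdown from the linked thread.

```swift
// hypothetical namespace enum that happens to be generic
public enum Decoder<Source>
    where Source: RandomAccessCollection, Source.Element == UInt8
{
    // `Cursor` never mentions `Source`, but its full type is
    // `Decoder<Source>.Cursor`, so it is implicitly generic over `Source`.
    public struct Cursor {
        public var offset: Int

        // `@inlinable` exposes the body to client modules so they can
        // inline and specialize it, but the generic context is still there.
        @inlinable public init(offset: Int = 0) {
            self.offset = offset
        }

        @inlinable public mutating func advance(by count: Int) {
            self.offset += count
        }
    }
}

// the un-nested equivalent has no generic context at all
public struct FlatCursor {
    public var offset: Int

    @inlinable public init(offset: Int = 0) {
        self.offset = offset
    }

    @inlinable public mutating func advance(by count: Int) {
        self.offset += count
    }
}
```

the two cursors behave identically, but only `FlatCursor` is guaranteed to be free of generic parameters, which is why moving a type in or out of a namespace can change what the optimizer does with it.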