Generic specialization when satisfying a concrete protocol requirement with generics

@Karl's response is the right one here. While I understand the instinct to ask this question, many of the answerers will not have the answer off the top of their head and will have to investigate it anyway.

For posterity's sake, the answer appears to be "yes", which was a fairly surprising result to me. I produced a simple Compiler Explorer example, and in the compiled output you can observe that the protocol witness (protocol witness for output.P.foo(x: [Swift.UInt8]) -> Swift.Int in conformance output.S : output.P in output) is not simply calling the generic function, but instead has specialised for the specific case. This implies that we shouldn't have a performance issue: the protocol witness is already specialised for the concrete case.

However, this optimisation can be defeated by the optimiser. I generalised your example to this:

Module B:

public protocol P {
    func foo(x: [UInt8]) -> Int
}

public struct S: P {
    public init() { }

    public func foo(x: some RandomAccessCollection<UInt8>) -> Int {
        var value = 0
        for element in x {
            value += Int(element)
        }
        return value
    }
}

Main:

import B

@main
struct Main {
    static func main() {
        let s = S()
        print(Self.computeValue(s, x: [1, 2, 3, 4]))
    }

    static func computeValue<F: P>(_ f: F, x: [UInt8]) -> Int {
        return f.foo(x: x)
    }
}

Much to my surprise, the result was that we specialised computeValue for S and removed the call to the protocol witness, instead calling the underlying function directly:

add        x0, x0, #0x258               ; 0xe1258@PAGEOFF, argument #1 for method __swift_instantiateConcreteTypeFromMangledName, $sSays5UInt8VGMD
bl         __swift_instantiateConcreteTypeFromMangledName ; __swift_instantiateConcreteTypeFromMangledName
mov        x20, x0
bl         $sSays5UInt8VGSayxGSksWl     ; lazy protocol witness table accessor for type [Swift.UInt8] and conformance [A] : Swift.RandomAccessCollection in Swift
mov        x2, x0                       ; argument #3 for method $s7Library1SV3foo1xSix_tSkRzs5UInt8V7ElementRtzlF
add        x0, sp, #0x8                 ; argument #1 for method $s7Library1SV3foo1xSix_tSkRzs5UInt8V7ElementRtzlF
mov        x1, x20                      ; argument #2 for method $s7Library1SV3foo1xSix_tSkRzs5UInt8V7ElementRtzlF
bl         $s7Library1SV3foo1xSix_tSkRzs5UInt8V7ElementRtzlF ; Library.S.foo<A where A: Swift.RandomAccessCollection, A.Element == Swift.UInt8>(x: A) -> Swift.Int

Note the call to the unspecialised generic function here.

This produced an unusual outcome: the specialised generic function performed worse than the unspecialised one. Adding a non-inlinable generic function to B (or to a new module, A) that is exactly the same as the one in Main produces a call to the protocol witness instead, which performs vastly better.
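For anyone wanting to reproduce that comparison, here is a minimal sketch of what I mean by the non-inlinable generic function. The name computeValueInLibrary is hypothetical (anything will do), and P and S are repeated only so the snippet compiles on its own; in the real experiment they live in module B and the call comes from Main.

```swift
// Repeated from module B so this snippet is self-contained.
public protocol P {
    func foo(x: [UInt8]) -> Int
}

public struct S: P {
    public init() { }

    public func foo(x: some RandomAccessCollection<UInt8>) -> Int {
        var value = 0
        for element in x {
            value += Int(element)
        }
        return value
    }
}

// Hypothetical helper: same body as Main.computeValue, but declared in the
// library module and deliberately NOT marked @inlinable, so a client module
// cannot see its body and has to call it as an opaque function.
public func computeValueInLibrary<F: P>(_ f: F, x: [UInt8]) -> Int {
    return f.foo(x: x)
}
```

Main would then call computeValueInLibrary(s, x: [1, 2, 3, 4]) instead of its own computeValue. Compiled as a single file there is no module boundary, so this only shows the shape of the code, not the performance difference.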

I've filed this as apple/swift issue #62229, "Generic specialization can replace calls to specialised functions with calls to unspecialised ones".

TL;DR: Make the witness inlinable, to be safe.
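As a rough sketch of that TL;DR applied to the example above (assuming the "witness" here means S.foo, the generic method that satisfies the protocol requirement), it would look something like this. I haven't shown disassembly for this variant, so treat it as the suggested shape rather than a measured result.

```swift
public protocol P {
    func foo(x: [UInt8]) -> Int
}

public struct S: P {
    public init() { }

    // @inlinable exposes the body to client modules, so the optimiser there
    // has the option of specialising foo for [UInt8] rather than being
    // limited to calling the unspecialised generic entry point.
    @inlinable
    public func foo(x: some RandomAccessCollection<UInt8>) -> Int {
        var value = 0
        for element in x {
            value += Int(element)
        }
        return value
    }
}
```

The trade-off is that an @inlinable body becomes visible to clients of the module, so it's worth doing deliberately rather than everywhere.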
