Generic specialization when satisfying a concrete protocol requirement with generics

are there any performance pitfalls that come with satisfying a concrete protocol requirement such as

public
protocol P
{
    func foo(x:ByteBufferView)
}

with a generic that is not @inlinable?

import PModule

extension S:P
{
    public
    func foo(x:some RandomAccessCollection<UInt8>)
    {
    }
}

Did you benchmark it?


@Karl's response is the right one here. While I understand the instinct to ask this question, many of the answerers will not have the answer off the top of their head and will have to investigate it anyway.

For posterity's sake, the answer appears to be "yes", which was a fairly surprising result to me. I produced a simple Compiler Explorer example, and in the compiled output you can observe that the protocol witness (protocol witness for output.P.foo(x: [Swift.UInt8]) -> Swift.Int in conformance output.S : output.P in output) is not simply calling the generic function, but instead has specialised for the specific case. This implies that we shouldn't have a performance issue: the protocol witness is already specialised for the concrete case.

However, this optimisation can be defeated by the optimiser. I generalised your example to this:

Module B:

public protocol P {
    func foo(x: [UInt8]) -> Int
}

public struct S: P {
    public init() { }

    public func foo(x: some RandomAccessCollection<UInt8>) -> Int {
        var value = 0
        for element in x {
            value += Int(element)
        }
        return value
    }
}

Main:

import B

@main
struct Main {
    static func main() {
        let s = S()
        print(Self.computeValue(s, x: [1, 2, 3, 4]))
    }

    static func computeValue<F: P>(_ f: F, x: [UInt8]) -> Int {
        return f.foo(x: x)
    }
}

Much to my surprise, the result was that we specialised computeValue for S and removed the call to the protocol witness, instead calling the underlying function directly:

add        x0, x0, #0x258               ; 0xe1258@PAGEOFF, argument #1 for method __swift_instantiateConcreteTypeFromMangledName, $sSays5UInt8VGMD
bl         __swift_instantiateConcreteTypeFromMangledName ; __swift_instantiateConcreteTypeFromMangledName
mov        x20, x0
bl         $sSays5UInt8VGSayxGSksWl     ; lazy protocol witness table accessor for type [Swift.UInt8] and conformance [A] : Swift.RandomAccessCollection in Swift
mov        x2, x0                       ; argument #3 for method $s7Library1SV3foo1xSix_tSkRzs5UInt8V7ElementRtzlF
add        x0, sp, #0x8                 ; argument #1 for method $s7Library1SV3foo1xSix_tSkRzs5UInt8V7ElementRtzlF
mov        x1, x20                      ; argument #2 for method $s7Library1SV3foo1xSix_tSkRzs5UInt8V7ElementRtzlF
bl         $s7Library1SV3foo1xSix_tSkRzs5UInt8V7ElementRtzlF ; Library.S.foo<A where A: Swift.RandomAccessCollection, A.Element == Swift.UInt8>(x: A) -> Swift.Int

Note the call to the unspecialised generic function here.

This produced an unusual outcome: the specialised generic function performed worse than the unspecialised one. Adding a non-inlinable generic function to B (or to a new module, A) that is exactly the same as computeValue in Main produces a call to the protocol witness instead, which performs vastly better.

I've filed this as apple/swift issue #62229, "Generic specialization can replace calls to specialised functions with calls to unspecialised ones".

TL;DR: Make the witness inlinable, to be safe.
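As a minimal sketch of that advice, assuming the same P and S as in the example above, the only change is the @inlinable attribute on the generic witness. This makes the body visible to client modules, so the optimiser can specialise it there; whether it actually does so in a given build is something to confirm by inspecting the output.

```swift
// Module B, with the generic witness marked @inlinable.
public protocol P {
    func foo(x: [UInt8]) -> Int
}

public struct S: P {
    public init() { }

    // @inlinable exposes the body to other modules, so a client that
    // specialises its own generic code for S can also specialise foo
    // instead of falling back to the unspecialised generic entry point.
    @inlinable
    public func foo(x: some RandomAccessCollection<UInt8>) -> Int {
        var value = 0
        for element in x {
            value += Int(element)
        }
        return value
    }
}
```

The trade-off is that an @inlinable body becomes part of the module's public interface, so it can only refer to public or @usableFromInline declarations.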


thanks for the investigation, this is a lot to keep in mind.

how did you godbolt a two-module use case? one of the reasons i find it so hard to apply @inlinable, @frozen etc across multi-module projects is i don’t know how to inspect the optimizations the compiler is able to make the way i can with a single-file example.

i’m confused by this sentence, do you mean re-creating Main in module B?

Module B:

public
struct Main
{
    public static 
    func computeValue(_ f:some P, x:[UInt8]) -> Int
    {
        return f.foo(x: x)
    }
}

why would this impact the compiler decision-making?

I didn't. I created a Swift package, built the binary, and disassembled it. This is the single most effective way to answer the question "what did the compiler do": look at what it did.
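For anyone wanting to reproduce that setup, a two-target package manifest might look like the sketch below. The package name and tools version are assumptions made for illustration and are not taken from the thread. The target names B and Main mirror the listings above, although the symbol names in the disassembly suggest the library module was actually called Library.

```swift
// swift-tools-version:5.7
// Package.swift: a hypothetical layout with one library target and one
// executable target that imports it.
import PackageDescription

let package = Package(
    name: "WitnessSpecialization",  // assumed name, for illustration only
    targets: [
        // Sources/B/: the protocol P and the struct S with the generic witness.
        .target(name: "B"),
        // Sources/Main/: the @main struct. Because it uses @main, its source
        // file must not be named main.swift.
        .executableTarget(name: "Main", dependencies: ["B"]),
    ]
)
```

Building with `swift build -c release` should leave the optimised binary at `.build/release/Main`, which can then be opened in whatever disassembler is to hand.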

Because the compiler cannot specialise the function when it compiles the main executable. This forces us to call the unspecialised generic, which calls the protocol witness, which as we saw is specialised.
