Generics inlining

(Not an expert, but I'll try to answer your question! There may be parts of this where I'm slightly off the mark.)

There's a quote about @inline(__always) that I think is relevant here:

I mention that quote because for @inline(__always) to do anything meaningful here, the compiler would first have to decide on its own "nah, I don't want to inline this". The compiler is pretty smart most of the time and usually inlines where it makes sense. That is to say: you may not need @inline(__always) at all to begin with.

But to answer your question: when you call

foo.start(with: AVCaptureSession(), on: Pipe())

the compiler will:

  1. inline a version of foo.start(with:on:) that skips your dynamic cast to AVCaptureSession, because in this inlined and specialized version it knows P will always be AVCaptureSession, so no casting is needed, and
  2. assign nil to sessionPreset, because it knows that your second dynamic cast will always fail.
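
Conceptually, the specialized copy behaves as if you had written the second function below by hand. This is only a sketch of the idea, not compiler output: startSpecialized is a made-up name, the AVCaptureSession and Pipe types are minimal stand-ins assumed for illustration (not the real AVFoundation class), and both functions return the session purely so the result can be inspected.

```swift
// Stand-in types, assumed for illustration only.
final class AVCaptureSession {
    enum Preset { case high }

    var sessionPreset: Preset? = .high
    private(set) var didBeginConfiguration = false

    func beginConfiguration() {
        didBeginConfiguration = true
    }
}

final class Pipe {
    func get() -> AVCaptureSession {
        return AVCaptureSession()
    }
}

// Generic version: when P is unknown, both casts are checked at run time.
func start<P>(with piped: P, on pipe: Pipe) -> AVCaptureSession {
    let session = piped as? AVCaptureSession ?? pipe.get()
    session.beginConfiguration()
    session.sessionPreset = piped as? AVCaptureSession.Preset
    return session
}

// Hand-written equivalent for P == AVCaptureSession (hypothetical name):
// the first cast always succeeds and the second always fails,
// so neither needs to be performed.
func startSpecialized(with piped: AVCaptureSession) -> AVCaptureSession {
    let session = piped
    session.beginConfiguration()
    session.sessionPreset = nil
    return session
}
```

Calling either one with an AVCaptureSession should leave the session in the same state (configuration begun, sessionPreset set to nil); the second simply never has to ask the runtime about types.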

We can verify this by looking at the generated assembly:

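class AVCaptureSession {

    // Assumed for this example: the snippet uses Preset and sessionPreset
    // without showing their declarations.
    enum Preset {
        case high
    }

    var sessionPreset: Preset?

    func beginConfiguration() {
        print("beginning configuration")
    }

}

// Also assumed: a minimal Pipe whose get() supplies a fallback session.
class Pipe {
    func get() -> AVCaptureSession {
        return AVCaptureSession()
    }
}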

@inline(__always)
func start<P>(with piped: P, on pipe: Pipe) {
    let session = piped as? AVCaptureSession ?? pipe.get()
    session.beginConfiguration()

    let preset = piped as? AVCaptureSession.Preset  
    session.sessionPreset = preset
}

func testCallWithConcreteType() {
    start(with: AVCaptureSession(), on: Pipe())
}

output.testCallWithConcreteType() -> ():
        push    rbx
        lea     rdi, [rip + (demangling cache variable for type metadata for Swift._ContiguousArrayStorage<Any>)]
        call    __swift_instantiateConcreteTypeFromMangledName
        mov     esi, 64
        mov     edx, 7
        mov     rdi, rax
        call    swift_allocObject@PLT
        mov     rbx, rax
        mov     qword ptr [rax + 16], 1
        mov     qword ptr [rax + 24], 2
        mov     rax, qword ptr [rip + ($sSSN)@GOTPCREL]
        mov     qword ptr [rbx + 56], rax
        movabs  rax, -3458764513820540905
        mov     qword ptr [rbx + 32], rax
        lea     rax, [rip + ".L.str.23.beginning configuration"-32]
        movabs  rcx, -9223372036854775808
        or      rcx, rax
        mov     qword ptr [rbx + 40], rcx
        movabs  rdx, -2233785415175766016
        mov     esi, 32
        mov     ecx, 10
        mov     rdi, rbx
        mov     r8, rdx
        call    ($ss5print_9separator10terminatoryypd_S2StF)@PLT
        mov     rdi, rbx
        pop     rbx
        jmp     swift_release@PLT

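Note that the compiler doesn't waste any time casting: there is no call to swift_dynamicCast anywhere in this output. What's left is essentially the inlined print("beginning configuration") call (as far as I can tell, the stores around it are just building print's argument array). The compiler can see the types involved and knows it doesn't need to do any casting.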

Compare that with the non-inlined and non-specialized implementation, which contains both casts and jumps to blocks depending on whether your original casts succeed or not (reduced for brevity):

output.start<A>(with: A, on: output.Pipe) -> ():
        // ...
        call    swift_dynamicCast@PLT
        test    al, al
        je      .LBB16_2
        mov     r13, qword ptr [rbp - 56]
        jmp     .LBB16_3
.LBB16_2:
        lea     rdi, [rip + (full type metadata for output.AVCaptureSession)+24]
        mov     esi, 17
        mov     edx, 7
        call    swift_allocObject@PLT
        mov     r13, rax
        mov     byte ptr [rax + 16], 1
.LBB16_3:
        // ...
        call    swift_dynamicCast@PLT
        xor     al, 1
        mov     rcx, qword ptr [r13]
        movzx   edi, al
        call    qword ptr [rcx + 72]
        mov     rdi, r13
        call    swift_release@PLT
        // ...

To summarize:

  1. If you tell the compiler to always inline and it can, it will, but you may not need to tell it to unless that's really what you want it to do.
  2. The compiler won't just inline the function; it'll usually also specialize it, generating a version for the known, concrete types the function is called with that skips any dynamic as? or as! casting you were doing.
  3. We didn't touch on this here, but there are also rules around visibility that determine whether or not the compiler can inline and specialize something. That usually comes into play when calls are made across modules.
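
As a rough sketch of the visibility point: for a caller in another module to inline or specialize a generic function, the function's body generally has to be available to that module, which is what the @inlinable attribute is for. The names below (describe, typeName, describeOpaque) are made up for illustration. The sketch compiles as a single file, but the attributes only make a difference when the code lives in a separate library module, and build settings such as cross-module optimization can change the picture.

```swift
// Imagine these declarations live in a library module.

// @inlinable exposes the body to client modules, so they may inline it
// and specialize it for their own concrete types.
@inlinable
public func describe<T>(_ value: T) -> String {
    if let text = value as? String {
        return "string of length \(text.count)"
    }
    return "a \(typeName(of: value))"
}

// Internal declarations referenced from @inlinable code must be
// marked @usableFromInline.
@usableFromInline
internal func typeName<T>(of value: T) -> String {
    return String(describing: T.self)
}

// Not @inlinable: by default a client module sees only the signature,
// so it typically calls the unspecialized generic entry point.
public func describeOpaque<T>(_ value: T) -> String {
    return describe(value)
}
```

Within a single module none of this is needed, because the optimizer can already see every function body.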

At the end of the day, though, the only way to be really sure is to test your given case and see what the compiler does.
