What is `___chkstk_darwin`?

I need to optimize my application. While profiling it with Instruments, and after "Inverting the Call Tree", I discovered that the most heavily used symbol is `___chkstk_darwin`. Searching didn't turn up anything that explained what it is. Can anyone explain to me what `___chkstk_darwin` is?

Stack checking, to catch stack overflows. You can disable it by passing -fno-stack-check if necessary.

Further reading: Security Technologies: Stack Smashing Protection (StackGuard)

If this is an iOS application compiled by Xcode, where would I place the -fno-stack-check flag?

Build Settings > Other C Flags.

After setting the flag, it's still showing up in the call stack in Time Profiler. Do you know if the flag also needs to be set in the Build Settings of each module used, and somehow for each SPM package as well?

"Other C Flags" will only affect C/C++/ObjC compilation. There isn't any equivalent flag (that I know of) for the Swift compiler. -Xclang -fno-stack-check might do the trick in "Other Swift Flags", but no guarantees. Stack checks in Swift often arise from the dynamic allocations that we emit for unspecialized generic code, so if you're seeing it creep into your hot paths, there might be something inadvertently blocking specialization?

This function is using ___chkstk_darwin the most:

extension Array where Element == UInt {

    ///    Get bit i.
    func getBit(at i: Int) -> Bool {
        let limbIndex = Int(UInt(i) >> 6)
        
        if limbIndex >= self.count { return false }
        
        let bitIndex = UInt(i) & 0b111_111
        
        return (self[limbIndex] & (1 << bitIndex)) != 0
    }
}

Does anyone know what about it is causing it to be used?

Some of those operations are generic, so if you're building with optimization disabled, we may be calling out to their unspecialized generic implementations. If I build with optimization, I don't see any __chkstk calls in the optimized assembly on macOS x86_64:

_$sSa3fooSuRszlE6getBit2atSbSi_tF:
        pushq   %rbp
        movq    %rsp, %rbp
        testq   %rdi, %rdi
        js      LBB1_5
        movq    %rdi, %rax
        shrq    $6, %rax
        cmpq    16(%rsi), %rax
        jae     LBB1_3
        movq    32(%rsi,%rax,8), %rax
        btq     %rdi, %rax
        setb    %al
        popq    %rbp
        retq
LBB1_3:
        xorl    %eax, %eax
        popq    %rbp
        retq
LBB1_5:
        ## InlineAsm Start
        ## InlineAsm End
        ud2

The iOS app has -Os (Fastest, Smallest) set for the Apple Clang - Code Generation Optimization Level setting and -O (Optimize for Speed) set for the Swift Compiler - Code Generation Optimization Level setting, yet the profile still shows the __chkstk calls.

The library that's making the calls is actually an SPM dependency, though. Do optimization settings need to be set for it specifically in order for optimizations to apply?

That could be it, since we still don't currently do cross-module optimization by default. You could try making the function in the SPM dependency @inlinable so that it can be inlined and specialized into the client project.
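As a rough sketch of that suggestion, here is what the annotated function might look like inside the package. The `public` modifier is an assumption (the snippet posted in this thread shows no access level), and `@inlinable` only applies to declarations that are `public` or `@usableFromInline`:

```swift
// Sketch only: assumes getBit(at:) is part of the package's public API.
extension Array where Element == UInt {

    /// Get bit i.
    /// @inlinable exposes the body to client modules, so an optimized
    /// client build can inline it instead of calling into the package.
    @inlinable
    public func getBit(at i: Int) -> Bool {
        // Each limb holds 64 bits, so the limb index is i / 64.
        let limbIndex = Int(UInt(i) >> 6)

        if limbIndex >= self.count { return false }

        // The low six bits of i select the bit within the limb.
        let bitIndex = UInt(i) & 0b111_111

        return (self[limbIndex] & (1 << bitIndex)) != 0
    }
}
```

Note that this only helps if the client module itself is built with optimization. In an unoptimized build the call is not inlined either way.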

Does this mean only that the optimization settings in the iOS app's build settings will not apply to the dependency, or does it also mean that optimization settings set in the SPM dependency itself will not be applied?

Your optimization settings should apply to all packages independently. Without cross-module optimization or @inlinable, though, we won't do optimizations across package boundaries, such as inlining a call to a small helper function in a package, or specializing a generic call in another package.
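To illustrate the kind of generic call this affects, here is a minimal sketch of a hypothetical package helper (`isBitSet` and its signature are made up for illustration, not taken from the package discussed here). Without `@inlinable`, a client in another module would generally have to call the unspecialized generic entry point. With it, an optimized client build has the body available and can specialize it for a concrete type such as `UInt8`:

```swift
// Hypothetical generic helper living in a package module.
// @inlinable makes the body visible to clients so it can be
// specialized for the concrete Limb type used at the call site.
@inlinable
public func isBitSet<Limb: FixedWidthInteger & UnsignedInteger>(
    _ limbs: [Limb], at i: Int
) -> Bool {
    let width = Limb.bitWidth
    let limbIndex = i / width

    // Out-of-range indices are treated as unset bits.
    guard i >= 0, limbIndex < limbs.count else { return false }

    let bitIndex = i % width
    return (limbs[limbIndex] >> bitIndex) & 1 == 1
}

// Client-side call with a concrete element type.
let bytes: [UInt8] = [0b0000_0101, 0b0000_0001]
let thirdBit = isBitSet(bytes, at: 2)
let ninthBit = isBitSet(bytes, at: 8)
```

Whether the optimizer actually specializes a given call is still up to the compiler. The annotation only makes it possible across the module boundary.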

Are you referring to self.count and self[limbIndex]?

Hm. Those both come from the standard library, so they should be inlinable and specializable when used from any package. It's possible that the package containing the getBit definition is in fact not being built with optimization, though I'm not sure why that would be the case.

I found some evidence that compiling in debug mode might result in optimizations being off for SPM dependencies. After changing the run scheme for the iOS app to run in release mode, all performance issues disappeared.
