What is `___chkstk_darwin`?

chrisbia · August 8, 2022, 3:42pm

I need to optimize my application. While profiling it with Instruments, after inverting the call tree, I found that the most heavily sampled symbol is ___chkstk_darwin. Searching didn't turn up anything that explained it. Can anyone explain what ___chkstk_darwin is?

suyashsrijan · August 8, 2022, 3:49pm

Stack checking, to catch stack overflows. You can disable it by passing -fno-stack-check if necessary.

AlexanderM · August 8, 2022, 3:54pm

Further reading: Security Technologies: Stack Smashing Protection (StackGuard)

chrisbia · August 8, 2022, 3:55pm

If this is an iOS application compiled by Xcode, where would I place the -fno-stack-check flag?

suyashsrijan · August 8, 2022, 4:02pm

Build Settings > Other C Flags.

chrisbia · August 8, 2022, 4:16pm

After setting the flag, it's still showing up in the call stack in Time Profiler. Do you know if the flag also needs to be set in the Build Settings of each module used? And does it somehow need to be set for each SPM dependency as well?

Joe_Groff · August 8, 2022, 4:18pm

"Other C Flags" will only affect C/C++/ObjC compilation. There isn't any equivalent flag (that I know of) for the Swift compiler. -Xclang -fno-stack-check might do the trick in "Other Swift Flags", but no guarantees. Stack checks in Swift often arise from the dynamic allocations that we emit for unspecialized generic code, so if you're seeing it creep into your hot paths, there might be something inadvertently blocking specialization?
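
To make the point about unspecialized generics concrete, here is a minimal sketch. The function names are hypothetical and not from the thread, and whether a stack check is actually emitted depends on the compiler version, target, and optimization level.

```swift
// Hypothetical names, for illustration only.

// Generic over any fixed-width integer. If the compiler does not specialize
// this for a concrete T, the size of T is only known at run time, so
// temporaries of type T may need dynamically sized stack space. That kind of
// allocation is where a stack check can appear.
func isBitSetGeneric<T: FixedWidthInteger>(_ value: T, at index: Int) -> Bool {
    let mask: T = 1 << index   // value with only bit `index` set
    return (value & mask) != 0
}

// Concrete counterpart: all types and sizes are known at compile time.
func isBitSetConcrete(_ value: UInt, at index: Int) -> Bool {
    return (value & (1 << UInt(index))) != 0
}

print(isBitSetGeneric(UInt8(0b0000_0100), at: 2))
print(isBitSetConcrete(5, at: 1))
```

With optimization enabled and the generic function visible to the caller, the compiler can usually specialize the generic version so that it behaves like the concrete one.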

chrisbia · August 8, 2022, 4:26pm

This function is using ___chkstk_darwin the most:

extension Array where Element == UInt {

    ///    Get bit i.
    func getBit(at i: Int) -> Bool {
        let limbIndex = Int(UInt(i) >> 6)
        
        if limbIndex >= self.count { return false }
        
        let bitIndex = UInt(i) & 0b111_111
        
        return (self[limbIndex] & (1 << bitIndex)) != 0
    }
}

Does anyone know what about it is causing it to be used?

Joe_Groff · August 8, 2022, 4:40pm

Some of those operations are generic, so if you're building with optimization disabled, we may be calling out to their unspecialized generic implementations. If I build with optimization, I don't see any __chkstk calls in the optimized assembly on macOS x86_64:

_$sSa3fooSuRszlE6getBit2atSbSi_tF:
        pushq   %rbp
        movq    %rsp, %rbp
        testq   %rdi, %rdi
        js      LBB1_5
        movq    %rdi, %rax
        shrq    $6, %rax
        cmpq    16(%rsi), %rax
        jae     LBB1_3
        movq    32(%rsi,%rax,8), %rax
        btq     %rdi, %rax
        setb    %al
        popq    %rbp
        retq
LBB1_3:
        xorl    %eax, %eax
        popq    %rbp
        retq
LBB1_5:
        ## InlineAsm Start
        ## InlineAsm End
        ud2

chrisbia · August 8, 2022, 6:12pm

The iOS app has -Os (Fastest, Smallest) set for the Apple Clang - Code Generation Optimization Level setting and -O (Optimize for Speed) set for the Swift Compiler - Code Generation Optimization Level setting, yet the __chkstk calls still show up.

The library that's making the calls is actually an SPM dependency, though. Do optimization settings need to be set for it specifically in order for optimizations to apply?

Joe_Groff · August 8, 2022, 6:13pm

That could be it, since we still don't currently do cross-module optimization by default. You could try making the function in the SPM dependency @inlinable so that it can be inlined and specialized into the client project.
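
As a sketch of that suggestion, this is how the getBit function from earlier in the thread might look inside the package once it is marked @inlinable. The `public` modifier is an assumption here: @inlinable only applies to declarations that are public or @usableFromInline, and the original access level was not shown.

```swift
// Sketch only: the thread's getBit with `public` and `@inlinable` added.
extension Array where Element == UInt {

    /// Get bit i, treating the array as a sequence of 64-bit limbs.
    @inlinable
    public func getBit(at i: Int) -> Bool {
        // Each limb holds 64 bits, so the limb index is i / 64.
        let limbIndex = Int(UInt(i) >> 6)

        // Bits beyond the end of the array read as zero.
        if limbIndex >= self.count { return false }

        // Position of the bit inside its limb: i % 64.
        let bitIndex = UInt(i) & 0b111_111

        return (self[limbIndex] & (1 << bitIndex)) != 0
    }
}
```

The trade-off is that an @inlinable body becomes part of the module's interface, so everything it references must itself be public or @usableFromInline.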

chrisbia · August 8, 2022, 6:29pm

Does this mean only that the optimization settings in the iOS app's build settings will not apply to the dependency? Or does it also mean that optimization settings set in the SPM dependency itself will not be applied?

Joe_Groff · August 8, 2022, 6:31pm

Your optimization settings should apply to all packages independently. Without cross-module optimization or @inlinable, though, we won't do optimizations across package boundaries, such as inlining a call to a small helper function in a package, or specializing a generic call in another package.

chrisbia · August 8, 2022, 6:36pm

Are you referring to self.count and self[limbIndex]?

For reference:

extension Array where Element == UInt {

    ///    Get bit i.
    func getBit(at i: Int) -> Bool {
        let limbIndex = Int(UInt(i) >> 6)
        
        if limbIndex >= self.count { return false }
        
        let bitIndex = UInt(i) & 0b111_111
        
        return (self[limbIndex] & (1 << bitIndex)) != 0
    }
}

Joe_Groff · August 8, 2022, 6:42pm

Hm. Those both come from the standard library, so they should be inlinable and specializable when used from any package. It might be possible that the package containing the getBit definition is in fact not being built with optimization, though I'm not sure why that would be the case.

chrisbia · August 8, 2022, 7:48pm

Found some evidence that compiling in debug mode might result in optimizations being off for SPM dependencies. After changing the run scheme for the iOS app to run in release mode, all performance issues disappeared.
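
One way to check from inside a module whether it was compiled without optimization is the common `assert` idiom sketched below. The function name is hypothetical. It relies on `assert` conditions being evaluated in unoptimized (-Onone) builds and skipped in -O builds, so it reports the setting of whichever module the function is compiled into.

```swift
/// Returns true when `assert` conditions are evaluated in this module,
/// which is the case for -Onone (debug) builds and not for -O builds.
func assertionsAreEnabled() -> Bool {
    var enabled = false
    // The closure only runs if the assert condition is actually evaluated.
    assert({ enabled = true; return true }())
    return enabled
}

print(assertionsAreEnabled()
    ? "assertions on: likely an unoptimized (debug) build"
    : "assertions off: likely an optimized (release) build")
```

Placing a check like this inside the SPM dependency, rather than in the app target, would show how the dependency itself is being built.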

evnik · July 25, 2023, 7:12pm

If Google pointed you here (as it did me) and you have a similar issue with ___chkstk_darwin in release mode, try the -no-stack-check Swift compiler flag. Note that there is no f in the flag name, and the -Xclang prefix is not needed.
The flag can be set in Xcode under "Other Swift Flags" (OTHER_SWIFT_FLAGS) or via SwiftSetting.unsafeFlags for an SPM target, for example:

.executableTarget(name: "MyTarget", swiftSettings: [.unsafeFlags(["-no-stack-check"])])
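
For context, a fuller manifest around that line might look like the sketch below. The package name and tools version are assumptions, and restricting the flag to release builds with `.when(configuration: .release)` is an illustrative choice rather than something stated in the thread.

```swift
// swift-tools-version:5.8
import PackageDescription

let package = Package(
    name: "MyPackage", // hypothetical package name
    targets: [
        .executableTarget(
            name: "MyTarget",
            swiftSettings: [
                // Pass -no-stack-check to the Swift compiler, release builds only.
                .unsafeFlags(["-no-stack-check"], .when(configuration: .release))
            ]
        )
    ]
)
```

Keep in mind that SwiftPM restricts targets using unsafeFlags when the package is consumed as a versioned dependency, so this tends to fit root packages or local dependencies best.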

I've verified that the flag helps with Xcode 14.3 (14E222b) and Swift compiler v5.8 (swiftlang-5.8.0.124.2 clang-1403.0.22.11.100).
BTW, @Joe_Groff in my case the ___chkstk_darwin appears in a hot path under the UnsafeRawBufferPointer.loadUnaligned<T>(fromByteOffset:as:) call. Please let me know if I should report that issue separately.
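
For readers unfamiliar with the call mentioned above, here is a minimal sketch of UnsafeRawBufferPointer.loadUnaligned(fromByteOffset:as:), available since Swift 5.7. The input bytes are hypothetical, and this is not a reproduction of the reported issue; whether a stack check shows up around such a call depends on the surrounding code and the compiler version.

```swift
// Hypothetical input: five bytes, so a 4-byte read at offset 1 is not
// guaranteed to be aligned for UInt32.
let bytes: [UInt8] = [0x01, 0x02, 0x03, 0x04, 0x05]

let value: UInt32 = bytes.withUnsafeBytes { buffer in
    // Read 4 bytes starting at offset 1, with no alignment requirement.
    let raw = buffer.loadUnaligned(fromByteOffset: 1, as: UInt32.self)
    // Interpret the bytes as little-endian so the result is host-independent.
    return UInt32(littleEndian: raw)
}

// Bytes 02 03 04 05 read as little-endian give 0x05040302.
print(String(value, radix: 16))
```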

scanon · July 25, 2023, 7:36pm

In debug or in release?

evnik · July 25, 2023, 7:58pm

In release. I tried both -O and -Ounchecked and ___chkstk_darwin is still there. -no-stack-check blows it away.

scanon · July 25, 2023, 8:23pm

Definitely worth a bug report if you can attach a full example to reproduce it.