What is `___chkstk_darwin`?

I need to optimize my application. While profiling it with Instruments, with "Invert Call Tree" enabled, I found that the most heavily used symbol is ___chkstk_darwin. Searching didn't turn up anything that explained it. Can anyone explain what ___chkstk_darwin is?

Stack checking, to catch stack overflows. You can disable it by passing -fno-stack-check if necessary.

Further reading: Security Technologies: Stack Smashing Protection (StackGuard)

If this is an iOS application compiled by Xcode, where would I place the -fno-stack-check flag?

Build Settings > Other C Flags.

After setting the flag, it's still showing up in the call stack in Time Profiler. Do you know if the flag also needs to be set in the Build Settings of each module used, and somehow for each SPM package used?

"Other C Flags" will only affect C/C++/ObjC compilation. There isn't any equivalent flag (that I know of) for the Swift compiler. -Xclang -fno-stack-check might do the trick in "Other Swift Flags", but no guarantees. Stack checks in Swift often arise from the dynamic allocations that we emit for unspecialized generic code, so if you're seeing it creep into your hot paths, there might be something inadvertently blocking specialization?
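
To illustrate the specialization point, here is a minimal sketch with hypothetical function names (isBitSetGeneric and isBitSetUInt are not from the thread). The assumption is that when the generic version is left unspecialized, for example in an unoptimized build, the size of T is only known at run time, so temporaries may need dynamically sized stack space, which is where a stack check can come in. Whether the check actually appears depends on the compiler version and build settings.

```swift
// Generic version: if the optimizer does not specialize this for a concrete
// type, values of type T are handled through run-time type metadata.
func isBitSetGeneric<T: FixedWidthInteger>(_ value: T, at index: Int) -> Bool {
    let mask: T = 1 << index
    return (value & mask) != 0
}

// Concrete version: every size is known at compile time, so no dynamically
// sized temporaries are needed.
func isBitSetUInt(_ value: UInt, at index: Int) -> Bool {
    let mask: UInt = 1 << UInt(index)
    return (value & mask) != 0
}

let sample: UInt = 0b101
print(isBitSetGeneric(sample, at: 2), isBitSetUInt(sample, at: 2))
```

Both functions return the same results; the difference is only in what the compiler has to emit when it cannot see the concrete type.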

This function is using ___chkstk_darwin the most:

extension Array where Element == UInt {

    ///    Get bit i.
    func getBit(at i: Int) -> Bool {
        let limbIndex = Int(UInt(i) >> 6)
        
        if limbIndex >= self.count { return false }
        
        let bitIndex = UInt(i) & 0b111_111
        
        return (self[limbIndex] & (1 << bitIndex)) != 0
    }
}

Does anyone know what about it is causing it to be used?

Some of those operations are generic, so if you're building with optimization disabled, we may be calling out to their unspecialized generic implementations. If I build with optimization, I don't see any __chkstk calls in the optimized assembly on macOS x86_64:

_$sSa3fooSuRszlE6getBit2atSbSi_tF:
        pushq   %rbp
        movq    %rsp, %rbp
        testq   %rdi, %rdi
        js      LBB1_5
        movq    %rdi, %rax
        shrq    $6, %rax
        cmpq    16(%rsi), %rax
        jae     LBB1_3
        movq    32(%rsi,%rax,8), %rax
        btq     %rdi, %rax
        setb    %al
        popq    %rbp
        retq
LBB1_3:
        xorl    %eax, %eax
        popq    %rbp
        retq
LBB1_5:
        ## InlineAsm Start
        ## InlineAsm End
        ud2

The iOS app has the Apple Clang - Code Generation Optimization Level setting at -Os (Fastest, Smallest) and the Swift Compiler - Code Generation Optimization Level setting at -O (Optimize for Speed), and it's still showing the __chkstk calls.

The library that's making the calls is actually an SPM dependency, though. Do optimization settings need to be set for it specifically in order for optimizations to apply?

That could be it, since we still don't currently do cross-module optimization by default. You could try making the function in the SPM dependency @inlinable so that it can be inlined and specialized into the client project.

Does this mean only that the optimization settings in the iOS app's build settings won't apply to the dependency, or also that any optimization settings set in the SPM dependency itself won't be applied?

Your optimization settings should apply to all packages independently. Without cross-module optimization or @inlinable, though, we won't do optimizations across package boundaries, such as inlining a call to a small helper function in a package, or specializing a generic call in another package.
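
As a minimal sketch of the @inlinable suggestion, this is the getBit function from earlier in the thread with the attribute added. It assumes the function lives in a package and is called from the app; @inlinable requires the declaration to be public (or @usableFromInline), so public is added here as well. The sample values below are made up for illustration.

```swift
extension Array where Element == UInt {

    /// Get bit i. Assumes 64-bit limbs, as in the original post.
    /// @inlinable exposes the body to client modules so the optimizer
    /// can inline and specialize it across the package boundary.
    @inlinable
    public func getBit(at i: Int) -> Bool {
        let limbIndex = Int(UInt(i) >> 6)

        if limbIndex >= self.count { return false }

        let bitIndex = UInt(i) & 0b111_111

        return (self[limbIndex] & (1 << bitIndex)) != 0
    }
}

// Hypothetical sample data: bits 1 and 3 set in limb 0, bit 0 set in limb 1.
let limbs: [UInt] = [0b1010, 0b1]
print(limbs.getBit(at: 1), limbs.getBit(at: 64), limbs.getBit(at: 200))
```

Note that marking a function @inlinable makes its body part of the module's public interface, so it is a trade-off rather than a free speedup.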

Are you referring to self.count and self[limbIndex]?

Hm. Those both come from the standard library, so they should be inlinable and specializable when used from any package. It's possible that the package containing the getBit definition is in fact not being built with optimization, though I'm not sure why that would be the case.

Found some evidence that compiling in debug mode might result in optimizations being off for SPM dependencies. After changing the run scheme for the iOS app to run in release mode, all performance issues disappeared.

If Google pointed you here (as it did me) and you have a similar issue with ___chkstk_darwin in release mode, try the -no-stack-check Swift compiler flag. Note that there is no f in the flag name, and that the -Xclang flag is not needed.
The flag can be set in Xcode under "Other Swift Flags" (OTHER_SWIFT_FLAGS) or via SwiftSetting.unsafeFlags for an SPM target, for example:

.executableTarget(name: "MyTarget", swiftSettings: [.unsafeFlags(["-no-stack-check"])])
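
For context, a minimal sketch of a complete Package.swift using that setting. The package name "MyPackage" and the tools version are assumptions for illustration; the condition argument limiting the flag to release builds is optional and is an addition not in the one-liner above. Also be aware that SwiftPM generally refuses targets with unsafe flags when the package is consumed as a versioned dependency, so this is mainly useful for root or local packages.

```swift
// swift-tools-version:5.5
import PackageDescription

let package = Package(
    name: "MyPackage", // hypothetical package name
    targets: [
        .executableTarget(
            name: "MyTarget",
            swiftSettings: [
                // Pass -no-stack-check to the Swift compiler,
                // here only when building in the release configuration.
                .unsafeFlags(["-no-stack-check"], .when(configuration: .release))
            ]
        )
    ]
)
```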

I've verified that the flag helps with Xcode 14.3 (14E222b) and Swift compiler 5.8 (swiftlang-5.8.0.124.2 clang-1403.0.22.11.100).
BTW, @Joe_Groff in my case the ___chkstk_darwin appears in a hot path under the UnsafeRawBufferPointer.loadUnaligned<T>(fromByteOffset:as:) call. Please let me know if I should report that issue separately.
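
For anyone putting together a reproduction, here is a minimal sketch of the kind of hot-path code that calls loadUnaligned. It is a hypothetical example, not the code from this report (readWords and the sample bytes are made up), and it is not confirmed to emit ___chkstk_darwin; it only shows the call shape being discussed.

```swift
// Reads `count` little-endian UInt32 values starting at an arbitrary
// (possibly unaligned) byte offset. The caller must ensure the range
// offset ..< offset + count * 4 lies within `bytes`.
func readWords(from bytes: [UInt8], startingAt offset: Int, count: Int) -> [UInt32] {
    return bytes.withUnsafeBytes { raw in
        (0..<count).map { i in
            UInt32(littleEndian: raw.loadUnaligned(fromByteOffset: offset + i * 4,
                                                   as: UInt32.self))
        }
    }
}

// One padding byte, then the values 1 and 2 encoded as little-endian UInt32.
let packet: [UInt8] = [0xFF, 0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00]
let words = readWords(from: packet, startingAt: 1, count: 2)
print(words)
```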

In debug or in release?

In release. I tried both -O and -Ounchecked and ___chkstk_darwin is still there. -no-stack-check blows it away.

Definitely worth a bug report if you can attach a full example to reproduce it.
