We recently saw an EXC_BAD_ACCESS crash in our app that was very difficult to assess from the stack trace. The top several frames referenced the Swift demangler:
#0 0x0000000185874a14 in swift::Demangle::__runtime::Demangler::DemangleInitRAII::DemangleInitRAII(swift::Demangle::__runtime::Demangler&, __swift::__runtime::llvm::StringRef, std::__1::function<swift::Demangle::__runtime::Node* (swift::Demangle::__runtime::SymbolicReferenceKind, swift::Demangle::__runtime::Directness, int, void const*)>) ()
#1 0x0000000185873d0c in swift::Demangle::__runtime::Demangler::demangleType(__swift::__runtime::llvm::StringRef, std::__1::function<swift::Demangle::__runtime::Node* (swift::Demangle::__runtime::SymbolicReferenceKind, swift::Demangle::__runtime::Directness, int, void const*)>) ()
#2 0x0000000185853d10 in _findExtendedTypeContextDescriptor(swift::TargetContextDescriptor<swift::InProcess> const*, swift::Demangle::__runtime::Demangler&, swift::Demangle::__runtime::Node**) ()
#3 0x000000018585588c in swift::SubstGenericParametersFromMetadata::buildDescriptorPath(swift::TargetContextDescriptor<swift::InProcess> const*, swift::Demangle::__runtime::Demangler&) const ()
#4 0x00000001858558f4 in swift::SubstGenericParametersFromMetadata::buildDescriptorPath(swift::TargetContextDescriptor<swift::InProcess> const*, swift::Demangle::__runtime::Demangler&) const ()
#5 0x00000001858558f4 in swift::SubstGenericParametersFromMetadata::buildDescriptorPath(swift::TargetContextDescriptor<swift::InProcess> const*, swift::Demangle::__runtime::Demangler&) const ()
#6 0x0000000185856028 in swift::SubstGenericParametersFromMetadata::setup() const ()
#7 0x000000018586237c in std::__1::__function::__func<(anonymous
But after much trial and error, we eventually realized this was stemming from a stack overflow. We verified this by increasing the stack size: setting Other Linker Flags to -Wl,-stack_size,0x10000000 gave us a larger stack and avoided the bad access at runtime. We also found the offending struct, changed it to a class, and could no longer reproduce the crash after that change.
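To confirm that a linker flag like the one above actually changed the stack, one option is to log the calling thread's stack size and approximate usage at launch. Here is a minimal sketch, assuming the Darwin-only calls pthread_get_stackaddr_np and pthread_get_stacksize_np; the names StackSnapshot and currentStackSnapshot are hypothetical, and the "used" figure is only an approximation based on the address of a local variable.

```swift
import Foundation

/// Hypothetical helper type: a snapshot of the calling thread's stack, in bytes.
struct StackSnapshot {
    let total: Int
    let used: Int
    var remaining: Int { total - used }
}

/// Returns nil on platforms where the Darwin-only pthread calls are unavailable.
func currentStackSnapshot() -> StackSnapshot? {
    #if canImport(Darwin)
    let thread = pthread_self()
    // On Darwin the stack grows downward; this call returns its highest address.
    let top = UInt(bitPattern: pthread_get_stackaddr_np(thread))
    let total = pthread_get_stacksize_np(thread)
    // The address of a local variable approximates the current stack pointer.
    var marker: UInt8 = 0
    let here = withUnsafeMutablePointer(to: &marker) { UInt(bitPattern: $0) }
    guard top >= here else { return nil }
    return StackSnapshot(total: total, used: Int(top - here))
    #else
    return nil
    #endif
}

if let snapshot = currentStackSnapshot() {
    print("stack total: \(snapshot.total) bytes, used: ~\(snapshot.used) bytes")
}
```

Calling this from the main thread before and after changing the flag should show whether the total moved. It only samples one point in time, so it says nothing about the deepest point the stack reaches later.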
The stack trace also turned out to be a red herring, especially as repeated runs would surface subtly different traces. We also never saw a typical stack overflow crash that would reference chkstk_darwin in the stack trace, so I'm not sure what to make of that.
The offending struct is rather large, yet it has existed in our app for years. The difference now is that we are adopting SwiftUI, and this struct is being used in @Environment objects. It's unclear to me whether @Environment is somehow copying the data onto the stack and inflating it, or whether simply nesting a large struct type within another SwiftUI.View struct is causing excess stack memory growth.
What's also alarming to me is that some devices saw this crash consistently while other devices never saw it. A colleague's iPhone 13 mini (running iOS 16.5) saw the crash 100% of the time, while my iPhone 13 mini (running iOS 16.1.1) never saw the crash. I was eventually able to force the crash by artificially inflating the struct with many unused properties.
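The effect of adding stored properties on a struct's inline size can be observed with MemoryLayout. Below is a minimal sketch; Small, Padded, and exceedsBudget are hypothetical names, and the 128-byte budget is an arbitrary number chosen for illustration, not a known safe threshold.

```swift
/// Hypothetical small value type: one 8-byte stored property.
struct Small {
    var id: Int64 = 0
}

/// Hypothetical inflated type, mimicking "many unused properties".
struct Padded {
    var id: Int64 = 0
    // Each tuple of eight Int64 values adds 64 bytes of inline storage.
    var pad1: (Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64) = (0, 0, 0, 0, 0, 0, 0, 0)
    var pad2: (Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64) = (0, 0, 0, 0, 0, 0, 0, 0)
}

/// Returns true when the inline size of `type` is larger than `budget` bytes.
func exceedsBudget<T>(_ type: T.Type, budget: Int) -> Bool {
    MemoryLayout<T>.size > budget
}

print(MemoryLayout<Small>.size)   // 8
print(MemoryLayout<Padded>.size)  // 136
print(exceedsBudget(Padded.self, budget: 128))  // true
```

A check like this could run in a unit test to flag a type that grows past a chosen budget. It measures only inline storage: class references and collection buffers count for their inline handles, not for the heap memory they point to.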
I'm looking for a few ideas here: 1) on how we can mitigate this in the future, 2) how we can currently measure this and warn ourselves if our stack baseline gets too high, and 3) why we're seeing the stack size inconsistencies across devices.
- For my first question, we haven't yet been forced to adopt Copy-on-Write semantics, but we realize this would be one solution to our problem. I don't want to outright ban large structs, thus throwing out the baby with the bathwater, but I'm curious if there's any guidance or community wisdom we can lean on here. There's already at least one great forum post that describes similar pains and strategies, but it's 2 years old, so it's possible ideas and solutions have evolved since then.
I'd also be interested in seeing examples of Copy-on-Write wrappers, macros (for when we can move to Xcode 15), or any other fancy solutions here that are more ergonomic than some of the solutions I've seen.
- In terms of measuring the stack, is this something that Instruments can already tell us? We came across another post that has code to calculate the existing stack size. It doesn't seem sufficient to be a trustworthy measuring stick during long-running instrumentation, but it might be another data point in tracking down future mysterious crashes.
- What would be the reason that only some of our devices are seeing the stack overflow crash (it being 100% consistent for those devices)? Does anyone have documentation or anecdotal evidence that suggests the stack size for apps can vary under different circumstances? What's really bad here is that none of us saw this issue until we went to TestFlight, which then surfaced the crash for 100% of our testers, and we eventually had to locate test devices that also experienced the crash.
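For reference on the Copy-on-Write idea in the first bullet, here is a minimal sketch of a generic box that moves a value's storage to the heap and copies it only on mutation when shared. The names Storage, CoWBox, and sharesStorage are hypothetical, and this is one common pattern built on isKnownUniquelyReferenced, not a recommendation from the linked posts.

```swift
/// Heap storage for the wrapped value; a class, so copies of the box share one allocation.
final class Storage<Value> {
    var value: Value
    init(_ value: Value) { self.value = value }
}

/// Hypothetical Copy-on-Write wrapper: its inline size is a single reference.
struct CoWBox<Value> {
    private var storage: Storage<Value>

    init(_ value: Value) {
        storage = Storage(value)
    }

    var value: Value {
        get { storage.value }
        set {
            if isKnownUniquelyReferenced(&storage) {
                // No other box shares this storage, so mutate in place.
                storage.value = newValue
            } else {
                // Storage is shared: allocate a fresh copy for this box only.
                storage = Storage(newValue)
            }
        }
    }

    /// Exposed for demonstration only.
    func sharesStorage(with other: CoWBox<Value>) -> Bool {
        storage === other.storage
    }
}

var a = CoWBox([1, 2, 3])
let b = a                          // copies one reference, not the payload
a.value.append(4)                  // triggers the copy, because b shares the storage
print(a.value, b.value)            // [1, 2, 3, 4] [1, 2, 3]
```

The containing struct then carries one pointer instead of the full payload. Note that reading `value` still produces a full copy of the payload at the call site, so this reduces the size of enclosing types rather than eliminating every large temporary.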