Crash backtraces

My two cents on this:

  • While I’m happy to see in-process ideas, any such improvement should be designed with the understanding that accurate crash postmortems are not a server feature; they are a Swift development feature, full stop. I would hope that any improvement is made with an eye to pulling whatever makes sense to share into the base language or libraries, so that the basic experience of diagnosing issues outside of Darwin improves in general.

  • Foundation is written largely assuming that backtraces will be available, and already has some facilities such as Thread’s .callStackReturnAddresses and .callStackSymbols properties. I will be looking very closely at anything we can use to improve these facilities, and to make sure users who hit e.g. fatalError()s get the information they need even outside Darwin.

  • The out-of-process approach has been discussed, but I would like to underline its usefulness: it may be that an optimal solution to this issue is less an SPM package each app has to opt into and more about the ability to improve and better control the crash reporting story in the OSes to which the app deploys. For example, Ubuntu, our official target, has a custom crash reporting facility tailored to the needs of that project. Could the same hooks be exploited to make the solution more useful to a local developer or to server reporting? Is there anything portable to other distributions there?

Having an in-process solution is a lower-complexity and pragmatic approach, but I’d love ideas and possibly a roadmap to get from there to more general cross-platform tooling that improves quality of life for everyone.
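As a rough illustration of the Foundation facility mentioned above, the sketch below captures the calling thread's stack through Thread.callStackSymbols. The helper name captureBacktrace is made up for this example, and how well the frames are symbolicated outside Darwin depends on the platform and on how the binary was built.

```swift
import Foundation

/// Hypothetical helper (not part of Foundation): returns the calling thread's
/// stack frames as strings, dropping the first `framesToSkip` innermost entries.
func captureBacktrace(skipping framesToSkip: Int = 1) -> [String] {
    return Array(Thread.callStackSymbols.dropFirst(framesToSkip))
}

func doWork() {
    // Print one frame per line, innermost first.
    for frame in captureBacktrace() {
        print(frame)
    }
}

doWork()
```

This only helps while the program is still able to run ordinary Swift code; it says nothing about what can safely be done once a trap has already fired.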


I have a draft PR up at swift-server/swift-backtrace (Pull Request #8, "Switch to vendored libbacktrace") which switches that package to use a vendored libbacktrace.

Please do try it out with your projects and let me know your experience.

I would also like to point out that Foundation's callStackReturnAddresses and callStackSymbols are designed so that multiple platforms can be supported. Windows does not have backtrace, nor anything that works similarly. Designing a generic, portable solution is complicated, and Foundation already has such an approach. I would really prefer that we improve and generalize it so that it can serve the needs of the other projects.


We appreciate your effort and have started testing your branch. I believe this package's output (once passed through addr2line) gives me results identical to those of the Python script merged in https://github.com/apple/swift/pull/4479.

Unfortunately, swift demangle does not seem able to demangle either output. I haven't yet tested the output of the @_silgen_name internal method, but I can't imagine it's different.

What is the missing piece to getting fully symbolicated function names?

How are you invoking swift-demangle? Because my package writes the stack trace to stderr, you'll need to make sure that stderr is piped to the demangler. Something like: ./YourApp 2>&1 | swift-demangle.


I just retested. I must've been looking at the wrong window, because this package works great.

From: $ss17_NativeDictionaryV16_unsafeInsertNew3key5valueyxn_q_ntFSS_20KituraTemplateEngine0iJ0_pTg5

To:
generic specialization <Swift.String, KituraTemplateEngine.TemplateEngine> of Swift._NativeDictionary._unsafeInsertNew(key: __owned A, value: __owned B) -> ()

Great to hear! I plan to implement built-in demangling soon too.
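For anyone curious what in-process demangling might look like, here is a minimal sketch that calls the runtime's swift_demangle entry point through @_silgen_name. That symbol is a runtime implementation detail rather than a supported API, the wrapper names runtimeDemangle and demangle are invented for this example, and this is not necessarily how the library does or will do it.

```swift
import Foundation

// Binds to the Swift runtime's demangler. This relies on an implementation
// detail of the runtime and may change between toolchain versions.
@_silgen_name("swift_demangle")
func runtimeDemangle(
    _ mangledName: UnsafePointer<CChar>?,
    _ mangledNameLength: UInt,
    _ outputBuffer: UnsafeMutablePointer<CChar>?,
    _ outputBufferSize: UnsafeMutablePointer<UInt>?,
    _ flags: UInt32
) -> UnsafeMutablePointer<CChar>?

/// Hypothetical wrapper: returns the demangled form of `symbol`, or `symbol`
/// unchanged when the runtime does not recognise it as a Swift symbol.
func demangle(_ symbol: String) -> String {
    return symbol.withCString { cString -> String in
        // With no output buffer supplied, the runtime returns a malloc'd string.
        guard let demangled = runtimeDemangle(cString, UInt(strlen(cString)), nil, nil, 0) else {
            return symbol
        }
        defer { free(demangled) }
        return String(cString: demangled)
    }
}

// The mangled name quoted earlier in the thread.
print(demangle("$ss17_NativeDictionaryV16_unsafeInsertNew3key5valueyxn_q_ntFSS_20KituraTemplateEngine0iJ0_pTg5"))
```

Falling back to the original string keeps C frames and already-readable names intact when a mixed backtrace is passed through the wrapper line by line.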

Doesn't Ubuntu's crash reporting require the user to have an Ubuntu One account?

@IanPartridge Did you see the thread "[stdlib] Cleanup callback for fatal Swift errors"? I think that describes exactly what we would need to install the custom backtrace printer logic.


Thanks Dario, yes a "soft fault" feature sounds very useful. We will also need to hook hard faults but signals are OK I think.


I've added support for demangling to my backtrace library, swift-server/swift-backtrace on GitHub (Backtraces for Swift on Linux and Windows).

If you depend on the master branch you will now get fully symbolicated backtraces on soft and hard faults when you build in release mode with debug enabled (-g) - no post-processing required.

Please try it out and let me know your feedback.


One question I have... Currently my library only hooks SIGILL because this is what the runtime raises on fatalError() etc.

We should expand it to hook other signals too, such as SIGSEGV and I am wondering which other POSIX signals are a good idea. Does the Swift runtime reserve any signals for itself that Swift programs definitely should not hook? CC: @Joe_Groff

The Swift compiler doesn't "officially" promise anything, but in practice, SIGILL, SIGTRAP, and SIGABRT ought to cover the gamut of intentional traps.
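To make the signal list concrete, here is a minimal sketch of hooking those three signals with signal(2). The names trapSignals and installTrapHandlers are invented for this example and are not the library's API. The handler body deliberately sticks to write, signal and raise, and whether Swift code in a handler is truly async-signal-safe is an assumption, not a guarantee.

```swift
import Foundation

/// The signals suggested above as covering intentional Swift traps in practice.
let trapSignals: [Int32] = [SIGILL, SIGTRAP, SIGABRT]

/// Hypothetical helper: installs `handler` for every signal in `signals`.
func installTrapHandlers(_ signals: [Int32] = trapSignals,
                         handler: @escaping @convention(c) (Int32) -> Void) {
    for sig in signals {
        _ = signal(sig, handler)
    }
}

installTrapHandlers { sig in
    // Emit a fixed message with write(2); no allocation, no formatting.
    let message: StaticString = "fatal signal received\n"
    _ = write(STDERR_FILENO, message.utf8Start, message.utf8CodeUnitCount)
    // Restore the default disposition and re-raise, so the process still
    // terminates with the original signal.
    _ = signal(sig, SIG_DFL)
    _ = raise(sig)
}
```

Re-raising after restoring SIG_DFL means the parent process, a core dumper or an OS crash reporter still sees the original signal instead of a clean exit.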


@IanPartridge The problem I see with doing it in a signal handler is that these APIs are not signal-safe (see signal-safety(7) in the Linux manual pages). They most likely allocate and may do other non-signal-safe things. There's also no way to differentiate between Swift traps and any arbitrary signal.

Adding the soft fault handler to Swift directly would allow us to safely retrieve and print the backtrace.

Feel free to use the following "crasher" app to check the various failures @IanPartridge:

I think it covers the majority of situations. Most of them are SIGILL; in our experiments we also handle SIGABRT. Very good to know that SIGTRAP is also used, thanks @Joe_Groff.

import Dispatch
import Foundation

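// Keeps `value` alive and opaque to the optimizer without doing any work.
@inline(never)
func consumeAny<T>(_ value: T) {
    withExtendedLifetime(value) {}
}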

func returnTrue() -> Bool {
    return true
}

class Foo {
}


func crashIntegerOverflow() {
    let x: Int8 = 127
    consumeAny(x + (returnTrue() ? 1 : 0))
}

func crashNil() {
    let x: Foo? = returnTrue() ? nil : Foo()
    consumeAny(x!)
}

func crashFatalError() {
    fatalError("deliberately crashing in fatalError")
}

func crashDivBy0() {
    consumeAny(1 / (returnTrue() ? 0 : 1))
}

func crashViaCDanglingPointer() {
    let x: Int = UnsafeMutableRawPointer(bitPattern: 0x8)!.load(fromByteOffset: 0, as: Int.self)
    consumeAny(x + 1)
}

func crashArrayOutOfBounds() {
    consumeAny(["nothing"][1])
}

func crashObjCException() {
    #if os(macOS)
    NSException(name: NSExceptionName("crash"),
        reason: "you asked for it",
        userInfo: nil).raise()
    #endif
    fatalError("objc exceptions only supported on macOS")
}

func crashStackOverflow() {
    func recurse(accumulator: Int) -> Int {
        return 1 + recurse(accumulator: accumulator + 1)
    }

    consumeAny(recurse(accumulator: 0))
}

func crashOOM() {
    #if os(macOS)
    var datas: [Data] = []
    var i: UInt8 = 1
    while true == returnTrue() {
        datas.append(Data(repeating: i, count: 1024 * 1024 * 1024))
        i += 1
    }
    consumeAny(datas)
    #endif
    fatalError("OOM currently only supported on macOS")
}

func crashRangeFromUpperBoundWhichLessThanLowerBound() {
    // suffix(from: 3) on a two-element array forms the invalid range 3..<2.
    consumeAny(["one", "two"].suffix(from: 3))
}

struct FooExclusivityViolation {
    var x = 0

    mutating func addAndCall(_ body: () -> Void) {
        self.x += 1
        body()
    }
}
class BarExclusivityViolation {
    var foo = FooExclusivityViolation(x: 0)

    func doIt() {
        self.foo.addAndCall {
            self.foo.addAndCall {}
        }
    }
}
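func crashExclusiveAccessViolation() {
    // Overlapping mutating accesses to `foo` should trip the runtime exclusivity check.
    BarExclusivityViolation().doIt()
}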

let crashTests = [

    "integer-overflow": crashIntegerOverflow
    , "force-unwrap-nil": crashNil
    , "fatal-error": crashFatalError
    , "div-by-0": crashDivBy0
    , "via-C-dangling-pointer": crashViaCDanglingPointer
    , "array-out-of-bounds": crashArrayOutOfBounds
    , "objc-exception": crashObjCException
    , "stack-overflow": crashStackOverflow
    , "out-of-memory": crashOOM
    , "range-upperBound-lt-lowerBound": crashRangeFromUpperBoundWhichLessThanLowerBound
    , "exclusive-access-violation": crashExclusiveAccessViolation
]

func help() {
    let program = CommandLine.arguments[0]
    print("Choose one of the following options:")
    for key in crashTests.keys {
        print("  \(program) \(key)")
    }
}

func main() {
    let crasher = (crashTests[CommandLine.arguments.suffix(from: 1).first ?? "help"] ?? help)
    crasher() // invoke crasher
}

main()

Could you try out your handler with all of those and check that the outputs all look nice? They likely will, though it's good to sanity check. If there are more failure situations we should add here, let's do so. Maybe you can include this crasher as an example in your library? Want me to PR it there as a sample, @IanPartridge?
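One way to automate that sanity check is to run each crash test in a child process and record how it terminated. This is only a sketch: Outcome and runAndClassify are invented names, and ./crasher is a placeholder for wherever the built crasher binary ends up.

```swift
import Foundation

/// How a child process ended.
enum Outcome: Equatable {
    case exited(Int32)    // normal exit with this status code
    case signaled(Int32)  // terminated by this signal number
}

/// Hypothetical helper: runs `executable` with `arguments`, waits for it to
/// finish, and reports how it terminated.
func runAndClassify(_ executable: String, _ arguments: [String]) throws -> Outcome {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: executable)
    process.arguments = arguments
    try process.run()
    process.waitUntilExit()
    switch process.terminationReason {
    case .uncaughtSignal:
        return .signaled(process.terminationStatus)
    default:
        return .exited(process.terminationStatus)
    }
}

// Placeholder path; adjust to the location of the built crasher.
let crasherPath = "./crasher"
if FileManager.default.isExecutableFile(atPath: crasherPath) {
    for test in ["fatal-error", "force-unwrap-nil", "array-out-of-bounds"] {
        if let outcome = try? runAndClassify(crasherPath, [test]) {
            print("\(test): \(outcome)")
        }
    }
}
```

The child's stderr is inherited, so any backtrace the handler prints appears next to the reported outcome.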

Having said that... As @drexin mentions, these handlers are somewhat "scary" since we don't know whether it is Swift faulting or some random C code sending those signals, and anything we do in those signal handlers is unsafe. I hope to some day be able to distinguish Swift "soft faults" (e.g. fatalError()) from memory corrupting "hard faults", but for now let's indeed work with what we have... so, capturing the signals.


On a separate note, @IanPartridge, would you be able to also expose the "print the backtrace nicely" part separately as a static func, so that someone who installs their own signal handlers could use it directly (i.e. I don't want to use the install() call; I have my own way of doing that)? I think I'd basically want to call Backtrace.backtrace_full (using current naming). WDYT?

Having a way to “install” your own signal capturing and still make use of the library is a great idea, IMO.
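A rough sketch of what such a split could look like follows. BacktraceSketch and its members are invented names, not the swift-backtrace API, and Thread.callStackSymbols stands in for whatever the library actually uses to walk the stack. The signal-safety caveats raised earlier in the thread apply to the install convenience as much as to any other handler.

```swift
import Foundation

/// Hypothetical API shape that separates "render and print" from "install".
enum BacktraceSketch {
    /// Formats frames one per line with an index. Usable without installing anything.
    static func render(_ frames: [String] = Thread.callStackSymbols) -> String {
        return frames.enumerated()
            .map { index, frame in "#\(index) \(frame)" }
            .joined(separator: "\n")
    }

    /// Writes the current thread's rendered stack to stderr.
    static func printCurrent() {
        let text = render() + "\n"
        FileHandle.standardError.write(Data(text.utf8))
    }

    /// Optional convenience for callers who do not manage their own handlers.
    /// Not async-signal-safe: printCurrent() allocates.
    static func install(for signals: [Int32] = [SIGILL]) {
        for sig in signals {
            _ = signal(sig) { received in
                BacktraceSketch.printCurrent()
                _ = signal(received, SIG_DFL)
                _ = raise(received)
            }
        }
    }
}

// A caller with its own signal handling would call only the printing part.
BacktraceSketch.printCurrent()
```

Keeping install as a thin wrapper over the printing function means both entry points produce the same output format.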


When using swift-backtrace, is there a way to get debug symbols when running under swift test? The flag

-c release

doesn't work with swift test and

-Xswiftc -g

by itself doesn't seem to do the trick.

^^^ @Aciid ?

I just received a second stack trace after a test crash (from a different test), and this one does include a description for some of the stack elements instead of just a stack of addresses like the previous one. Some of the stack elements are still just memory addresses, though, so maybe

-Xswiftc -g

does work with swift test, and the first time I was just unlucky in that the stack trace happened to contain only elements with bare addresses.

That initial crash happens occasionally (but in the same place) but appears to occur between the tests. With an essentially empty stack trace I don't know how to proceed with debugging it.

Could you share reproducers of the faults you are looking at? If it’s in a shallow test in release mode, it could be that most of the offending code got inlined, which would explain the less satisfying trace.

-g works correctly in tests and should indeed just work and keep the debug info.

Would love to get some more information about what you are looking at. Without that it’s hard to say really what’s going on.