How to opt out of Swift 5.9 “interactive exit” behavior?

it used to be that if you throw an error from main, then swift would print the error and exit with a non-zero status code that could be picked up by CI, etc.

Swift/ErrorType.swift:200: Fatal error: Error raised at top level: Some tests failed (1 of 3 test(s), 1 of 6 assertions(s))
Current stack trace:
0    libswiftCore.so                    0x00007f5465507750 _swift_stdlib_reportFatalErrorInFile + 113
1    libswiftCore.so                    0x00007f54651dd512 <unavailable> + 1459474
2    libswiftCore.so                    0x00007f54651dd32c <unavailable> + 1458988
3    libswiftCore.so                    0x00007f54651dc260 _assertionFailure(_:_:file:line:flags:) + 372
4    libswiftCore.so                    0x00007f54652418c9 <unavailable> + 1870025
5    MarkdownPluginSwiftTests           0x00005584e5b032f7 <unavailable> + 1139447
6    libc.so.6                          0x00007f546426e050 __libc_start_main + 234
7    MarkdownPluginSwiftTests           0x00005584e5a8c0fa <unavailable> + 651514

but now it doesn’t exit, it just transitions to some kind of interactive debugging console:


💣 Program crashed: Illegal instruction at 0x00007f2bf9e673e0

Thread 0 "MarkdownPluginS" crashed:

0 0x00007f2bf9e673e0 _assertionFailure(_:_:file:line:flags:) + 384 in libswiftCore.so

Press space to interact, D to debug, or any other key to quit (26s) 

this is really annoying and not at all helpful! not only does it affect the swift run command, it is apparently enabled even if you run the compiled binaries directly.

99% of the time when an error gets thrown from main, it is because the program failed (invalid input, resource does not exist, etc. etc.) and you do not want to start a debugging session.

is there a way to opt out of this behavior?

There are a number of things to say here. The first is that throwing an error from the top level crashes your program. It is not a clean exit. For that, you should return or call the C library’s exit function.

If you crash your program, you will trigger the new crash handler that is built into the runtime, which is what you’re seeing here.

The second thing to say is that the crash handler is (by default) only interactive if run from a terminal. If you run your program from a CI system, or from some kind of orchestration system or pipeline where its output is redirected, it will not be interactive. You will also note that if left alone, it will time out. When it does so, your program will exit in exactly the way it did before, with the same return code you would have got previously. This is by design.
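
To make the terminal-versus-redirected distinction concrete, here is a minimal sketch of how a program can ask whether a file descriptor is attached to a terminal. The helper name `isTerminal` is made up for illustration, and which descriptors the runtime's crash handler actually inspects is not stated in this thread, so treat this as an approximation of the idea rather than a description of the runtime's logic.

```swift
import Foundation

/// Reports whether the given file descriptor refers to a terminal
/// (a pty counts as a terminal; a pipe or a regular file does not).
/// `isTerminal` is a hypothetical helper, not part of the Swift runtime.
func isTerminal(_ fileDescriptor: Int32) -> Bool {
    isatty(fileDescriptor) != 0
}

// A pipe is what output redirection typically hands to a child process.
let pipe = Pipe()
print("pipe is a terminal:", isTerminal(pipe.fileHandleForReading.fileDescriptor))

// These depend on how the program was launched (shell, CI runner, docker -t, ...).
print("stdin is a terminal:", isTerminal(STDIN_FILENO))
print("stdout is a terminal:", isTerminal(STDOUT_FILENO))
```

Running this once from a shell and once with its output piped through another command should show the difference between the two situations described above.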

Thirdly, it is possible to control this behaviour through the SWIFT_BACKTRACE environment variable. See Backtracing.rst in the Swift repo for details of the options you can set there.

We do try to encourage people to minimize the distinction between "crash" and "exit with failure", though, by steering people away from designs that depend on something happening before a process exits, since that's generally impossible to guarantee. It ought to be reasonable for an internal tool that has no need for end-of-process cleanup, and no UX reason to try to present a nicer error message, to rely on runtime traps to end its process with a fatal error.

Perhaps, but right now that fatal error causes an actual crash, rather than a normal exit with a non-zero result code. The parent process and operating system will be aware of the difference and may behave differently as a result also.

If a program wants to do a “quick exit” without clearing up, calling exit is the way to do that (currently), and lets the caller specify a result code.
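
As a rough sketch of that approach, the helper below runs some throwing work, prints any error to standard error, and hands back a status code that can be passed to exit. The names `ToolFailure` and `exitStatus(running:)` are invented for this example; only `exit`, `EXIT_SUCCESS` and `EXIT_FAILURE` come from the C library.

```swift
import Foundation

/// Hypothetical error type standing in for whatever the tool throws.
struct ToolFailure: Error, CustomStringConvertible {
    let description: String
}

/// Runs `body`, reports any thrown error on standard error, and returns a
/// process exit status. Nothing is rethrown, so the error never reaches the
/// top level and the runtime has no reason to trap.
func exitStatus(running body: () throws -> Void) -> Int32 {
    do {
        try body()
        return EXIT_SUCCESS
    } catch {
        FileHandle.standardError.write(Data("error: \(error)\n".utf8))
        return EXIT_FAILURE
    }
}
```

A non-throwing `static func main()` could then end with `exit(exitStatus(running: run))`, where `run` is the program's own throwing entry point. The parent process sees an ordinary non-zero exit rather than a signal, and the crash handler is never involved.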

Having read through the thread linked by the OP, it seems that some thought should be put into the runtime behaviour where, for instance, an Error is thrown at top level, and it would be good to have a way to trigger an exit without reaching down to the C library also.

I also believe it’s how we could communicate exit errors from a swiftpm command tool plugin, it’s what we did here: https://github.com/ordo-one/package-benchmark/blob/070dee37abe72c0a3d7e0feff5dca3effd66b541/Plugins/BenchmarkCommandPlugin/BenchmarkCommandPlugin.swift#L338

I believe just using exit there didn’t work as expected, so we had to throw to get the error to propagate out. This is an issue with the swiftpm plugins though, but just thought I’d mention it as it was related (we’ll check with exit again, but pretty sure it didn’t work in that context).

Edit: maybe my memory had a parity error and it was to get a nicer output, will check it again.

this… doesn’t seem to be what’s actually happening

💣 Program crashed: Illegal instruction at 0x00007f777f52d470

Thread 4 crashed:

0 0x00007f777f52d470 _assertionFailure(_:_:file:line:flags:) + 384 in libswiftCore.so


Press space to interact, D to debug, or any other key to quit (30s) 
Press space to interact, D to debug, or any other key to quit (29s) 
Press space to interact, D to debug, or any other key to quit (28s) 
Press space to interact, D to debug, or any other key to quit (27s) 
Press space to interact, D to debug, or any other key to quit (26s) 
Press space to interact, D to debug, or any other key to quit (25s) 
Press space to interact, D to debug, or any other key to quit (24s) 
Press space to interact, D to debug, or any other key to quit (23s) 
Press space to interact, D to debug, or any other key to quit (22s) 
Press space to interact, D to debug, or any other key to quit (21s) 
Press space to interact, D to debug, or any other key to quit (20s) 
Press space to interact, D to debug, or any other key to quit (19s) 
Press space to interact, D to debug, or any other key to quit (18s) 
Press space to interact, D to debug, or any other key to quit (17s) 
Press space to interact, D to debug, or any other key to quit (16s) 
Press space to interact, D to debug, or any other key to quit (15s) 
Press space to interact, D to debug, or any other key to quit (14s) 
Press space to interact, D to debug, or any other key to quit (13s) 
Press space to interact, D to debug, or any other key to quit (12s) 
Press space to interact, D to debug, or any other key to quit (11s) 
Press space to interact, D to debug, or any other key to quit (10s) 
Press space to interact, D to debug, or any other key to quit (9s) 
Press space to interact, D to debug, or any other key to quit (8s) 
Press space to interact, D to debug, or any other key to quit (7s) 
Press space to interact, D to debug, or any other key to quit (6s) 
Press space to interact, D to debug, or any other key to quit (5s) 
Press space to interact, D to debug, or any other key to quit (4s) 
Press space to interact, D to debug, or any other key to quit (3s) 
Press space to interact, D to debug, or any other key to quit (2s) 
Press space to interact, D to debug, or any other key to quit (1s) 
.github/pipeline: line 6: 10866 Illegal instruction     (core dumped) $f

this pipeline takes about 18 minutes to complete, so an extra 30-second delay isn’t the end of the world. but it’s obviously not ideal.

here is the CI pipeline script: https://github.com/tayloraswift/swift-unidoc/blob/13bf17794c7277a9f92f0c3b80d7e538005265d8/.github/pipeline

To clarify, if your CI system runs the program in a pty as opposed to a pipe, it will still be interactive by default. You can explicitly turn that off with SWIFT_BACKTRACE=interactive=no if you're in that situation.
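
If the crashing program is launched from another Swift process (a test runner, say), the same setting can be applied to the child's environment in code. This is only a sketch: `disablingInteractiveBacktrace(in:)` and `makeProcess(executablePath:)` are hypothetical helpers, and the assumption that SWIFT_BACKTRACE holds a comma-separated list of key=value options is taken from Backtracing.rst, so check it against the version of the runtime you are using.

```swift
import Foundation

/// Returns a copy of `environment` in which SWIFT_BACKTRACE asks for a
/// non-interactive crash handler. Options that were already set are kept
/// (assuming a comma-separated key=value list); any earlier `interactive=`
/// entry is replaced.
func disablingInteractiveBacktrace(in environment: [String: String]) -> [String: String] {
    var result = environment
    let existing = environment["SWIFT_BACKTRACE"] ?? ""
    let kept = existing
        .split(separator: ",")
        .map(String.init)
        .filter { !$0.hasPrefix("interactive=") }
    result["SWIFT_BACKTRACE"] = (kept + ["interactive=no"]).joined(separator: ",")
    return result
}

/// Hypothetical usage: prepare a child process that inherits the current
/// environment plus the adjusted SWIFT_BACKTRACE value.
func makeProcess(executablePath: String) -> Process {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: executablePath)
    process.environment = disablingInteractiveBacktrace(in: ProcessInfo.processInfo.environment)
    return process
}
```

In a shell script or a Dockerfile, setting the variable directly is simpler; the helper is only useful when the launcher is itself written in Swift.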

(The timeout, FWIW, was explicitly intended to prevent exactly this kind of situation from turning from a minor inconvenience into a total disaster.)

not to come back to this for no reason, but the UX when mixed with interactive docker containers is just awful. it doesn’t even give you a chance to Ctrl-C out from the interactive mode, it just hangs for 30 seconds until it times out.

$ docker run -it --entrypoint=/bin/bash --rm ...

...

Current stack trace:
0    libswiftCore.so                    0x00007f9dfcc1a6d0 _swift_stdlib_reportFatalErrorInFile + 109
1    libswiftCore.so                    0x00007f9dfc8e6d85 <unavailable> + 1461637
2    libswiftCore.so                    0x00007f9dfc8e6ba7 <unavailable> + 1461159
3    libswiftCore.so                    0x00007f9dfc8e5b10 _assertionFailure(_:_:file:line:flags:) + 342
4    libswiftCore.so                    0x00007f9dfc94cc5c <unavailable> + 1879132
5                                       0x0000556687358d86 <unavailable> + 13573510
6    libswift_Concurrency.so            0x00007f9dfc75c54d <unavailable> + 300365
7    libswift_Concurrency.so            0x00007f9dfc75cd20 swift_job_run + 92
8    libdispatch.so                     0x00007f9dfc4e88a8 <unavailable> + 190632
9    libdispatch.so                     0x00007f9dfc4e9489 <unavailable> + 193673
10   libdispatch.so                     0x00007f9dfc4f02b6 <unavailable> + 221878
11   libc.so.6                          0x00007f9dfb712ac3 <unavailable> + 608963
12   libc.so.6                          0x00007f9dfb7a3bb0 clone + 68
^C
^C





^CIllegal instruction (core dumped)
root@8fe8fbf7b99d:/#

i really wish there were a way to disable this in the compiled binary itself instead of fiddling around with environment variables in every container the binary runs in.

I don't know why you aren't seeing the output from the new backtracer there (you're only seeing the output from the old code that fatalError() uses, at least looking at the output you provided?)

Would you be able to share some more details about this? e.g. which container you're using as a base image here, whether you've done any redirection yourself, whether the container directly runs the Swift program or whether something else is wrapping it somehow?

To add to that, when I build

@main
struct DockerCrashTest {
  static func main() {
    fatalError("I am the walrus!")
  }
}

with

FROM swift:5.9.1-jammy

RUN mkdir /src

COPY dockerCrashTest.swift /src/dockerCrashTest.swift

RUN swiftc -g -Xcc -fno-omit-frame-pointer \
    -parse-as-library /src/dockerCrashTest.swift \
    -o /usr/bin/dockerCrashTest

CMD /usr/bin/dockerCrashTest

then do e.g.

$ docker run -it --rm dockercrashtest

I get the expected behaviour — that is, the program crashes, and since we're attached to a terminal, the interactive backtracer kicks in, so I see:

dockerCrashTest/dockerCrashTest.swift:4: Fatal error: I am the walrus!
Current stack trace:
0    libswiftCore.so                    0x0000ffff87beaaa8 _swift_stdlib_reportFatalErrorInFile + 128
1    libswiftCore.so                    0x0000ffff878feac4 <unavailable> + 1436356
2    libswiftCore.so                    0x0000ffff878fdc00 _assertionFailure(_:_:file:line:flags:) + 248
3    dockerCrashTest                    0x0000aaaabc440c0c <unavailable> + 3084
4    dockerCrashTest                    0x0000aaaabc440c18 <unavailable> + 3096
5    dockerCrashTest                    0x0000aaaabc440c30 <unavailable> + 3120
6    libc.so.6                          0x0000ffff873973fc <unavailable> + 160764
7    libc.so.6                          0x0000ffff87397434 __libc_start_main + 152
8    dockerCrashTest                    0x0000aaaabc440930 <unavailable> + 2352

💣 Program crashed: System trap at 0x0000ffff878fdd00

Thread 0 "dockerCrashTest" crashed:

0 0x0000ffff878fdd00 _assertionFailure(_:_:file:line:flags:) + 256 in libswiftCore.so
1 static DockerCrashTest.main() + 135 in dockerCrashTest at /src/dockerCrashTest.swift:4:5

     2│ struct DockerCrashTest {
     3│   static func main() {
     4│     fatalError("I am the walrus!")
      │     ▲
     5│   }
     6│ }

2 main + 11 in dockerCrashTest at /src/dockerCrashTest.swift:2:8

     1│ @main
     2│ struct DockerCrashTest {
      │        ▲
     3│   static func main() {
     4│     fatalError("I am the walrus!")

3 0x0000ffff873973fc <unknown> in libc.so.6
4 0x0000ffff873974cc <unknown> in libc.so.6

Press space to interact, D to debug, or any other key to quit (30s) 

and can press any key to terminate immediately.

I'm not sure what's going wrong for you specifically, but I'm very interested to find out and, if necessary, fix things so that it works the way it's supposed to.

i can confirm this doesn’t hang in a container with the same configuration, it must be something specific to the larger NIO application.

in a container launched from the real image, nothing appears for thirty seconds after fatalError prints its message, then the backtrace output shows up.

$ docker run -it --rm --entrypoint=/bin/bash tayloraswift/unidoc:latest
root@8294c61b0452:/# unidoc-build 
UnidocBuild/Main.Options.swift:43: Fatal error: Usage: unidoc-build <package>




... (30 seconds) ...




💣 Program crashed: Illegal instruction at 0x00007f49b1692c72

Thread 0 "unidoc-build" crashed:

0 0x00007f49b1692c72 _assertionFailure(_:_:file:line:flags:) + 354 in libswiftCore.so
1 0x00007f49b1509d7c swift_job_run + 91 in libswift_Concurrency.so

Illegal instruction (core dumped)

the base image is just a FROM swift:5.9.1. i have not done any redirection and the container doesn’t run the swift program directly, i am launching containers manually and then invoking the swift program through shell commands.

here’s what the dockerfile looks like, it copies a prebuilt swift program into the container.

FROM swift:5.9.1

RUN apt update && apt install -y \
    sqlite3 libsqlite3-dev \
    libcurl4-openssl-dev  \
    libgtk-3-dev clang \
    libjemalloc-dev \
    libcap2-bin

COPY .build/x86_64-unknown-linux-gnu/release/UnidocServer /bin/unidoc-server
COPY .build/x86_64-unknown-linux-gnu/release/UnidocBuild /bin/unidoc-build

RUN chmod +x /bin/unidoc-server
RUN chmod +x /bin/unidoc-build

RUN setcap CAP_NET_BIND_SERVICE=+eip /bin/unidoc-server

CMD swift --version

i’m not sure if it matters, but the swift program is compiled in an Amazon Linux 2 container, which might be slightly different from the generic swift image it runs in.


i’m sure there’s an interesting reason why this is getting stuck, but in the meantime it would be really great if i could just turn off this feature at compilation time.

There isn't a way to do that at present, and adding such a thing wouldn't be a trivial exercise either, plus there is already a way to disable it using an environment variable (which is very easy to set in a Dockerfile), so it's better IMO to focus on why it's behaving the way it is so that we can fix the problem.

Your latest message suggests to me that maybe the problem here isn't the interactivity delay that you think it is; my guess is that if you set SWIFT_BACKTRACE=interactive=no in that container, it will still take 30s or so to display the backtrace. Is UnidocServer Open Source? Can I build it and take a look at what's going on?

yes, it is from the swift-unidoc project (guide).

i wouldn’t try to run the UnidocServer, it is comparatively a lot of effort to set up (but if you want, you can follow the steps in the guide.)

probably if i were you, i’d try running the UnidocBuild tool instead, which is easier to test and exhibits the same problem. it comes with a docker image.

the easiest way to trigger a fatal error is to just invoke the command without any of its required arguments.

$ docker run -it --rm --entrypoint=/bin/bash tayloraswift/unidoc:latest
root@bbce10ca8dc8:/# unidoc-build
UnidocBuild/Main.Options.swift:43: Fatal error: Usage: unidoc-build <package>





OK, I think what we're seeing here is the same performance problem that I've had reported elsewhere, where the backtracer is just taking a long time parsing its way through all the debug information. I'll give this particular image a test with the fixes I'm planning to make sure that it improves things.

Just with the code I'm testing, timings on my test machine go down from 16s total to just under 4s total.

i am coming around to the idea that it is really important to handle an Error thrown from @main differently from an actual fatalError, preconditionFailure, etc.

i was investigating a cascade failure today and discovered a really nasty intersection between Error and backtrace collection.

  1. a host machine comes under memory pressure, causing network components to throw errors at the top level, due to inability to communicate with other processes on the same machine.

  2. the network error gets propagated from Main.main through the swift runtime, which prints the error and then calls fatalError to exit the program

  3. backtrace collection kicks in, which consumes even more memory and CPU, pressuring the other processes even further

  4. when systemctl tries to restart the application, it experiences another network error, because the other processes were competing with backtrace collection and themselves failed

  5. repeat until somebody notices

the root cause of this problem was step 2; exiting with the thrown Error should not have triggered backtrace collection - all the relevant information about the failure is described by the thrown Error, and the backtracer should never have run in the first place.

an immediate fix would be to just swallow all errors from main and return “successfully” to bypass backtrace collection. but it would be better if the following just worked:

@main
enum Main
{
    public static
    func main() async throws
    {
        // application logic; any error thrown here reaches the top level
    }
}

because as it stands, throws is actively dangerous.

Yes, this is the problem with the “let it crash” philosophy. Crashing can itself cause positive feedback loops.