Illegal instruction… but where?

In non-interactive contexts, would it be too noisy to have the backtracer first run without attempting any symbolication, emitting the full raw backtrace with only addresses, and then do the work of generating the pretty, human-readable backtrace?

We can't really do that, sadly. On Linux, at least, we need the symbols so that we can tell whether or not we're in an async frame.

Or, put another way, we could do that, but we'd then give a totally different backtrace the second time around. We already know this is confusing, because fatalError() already does something like this in some cases and people complain that it's failing to look up symbols.

1 Like

Are the pthread TLS APIs well-behaved enough in a broken process to check for a set task pointer in its TLS slot without first looking at symbols?

When I try it here, I absolutely can abort the program with ^C during the backtracing process. Based on some of the previous comments, I had thought that wouldn't work (I assumed I'd left the signal mask in the wrong state), but it does seem to work.

1 Like

pthreads are definitely not async-signal-safe (that's fairly explicit in the POSIX spec), so even if a pthread call happens to work today, there's no reason that Glibc or Musl couldn't turn around and break us.

A tangent question: we don't have a built-in diagnostic ability to log every function call with its parameter and result values, or do we?

Given this example:

func foo(_ a: Int, b: String) -> Double {
    // ideally no syntax "overhead" at all here
    // if that's not possible, a one liner like `@trace`
    ...
    bar(a)
    ...
    return 3
}

func main() {
    // ditto
    ...
    foo(1, b: "2")
}

To get this result:

2023-11-06T10:20:30+726351 MainThread    main() entered
2023-11-06T10:20:30+726373 MainThread        foo(1, b: "2") entered
2023-11-06T10:20:30+726391 thread:42     something else here
2023-11-06T10:20:30+726440 MainThread            bar(1) entered
...
2023-11-06T10:20:30+726471 MainThread        foo(...) returns 3
2023-11-06T10:20:30+726374 thread:24                           something else here
2023-11-06T10:20:30+726382 MainThread    main() returns

I've done this manually on many occasions, but it's quite error-prone to add and a lot of boilerplate to maintain. Is it possible to automate it somehow? Ideally without introducing any boilerplate into the source code itself (though I guess that would require a significant change to the compiler?)
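Short of compiler support, a small helper can at least shrink this to one line per function. A sketch (`traced` is a hypothetical name I made up for illustration, not anything built in; it logs entry immediately and hands back a closure to run via `defer` on exit):

```swift
// Hypothetical `traced` helper: logs entry now, returns a closure that
// logs the exit, so each function pays exactly one line of boilerplate.
func traced(_ args: Any..., function: String = #function) -> () -> Void {
    let rendered = args.map { "\($0)" }.joined(separator: ", ")
    print("\(function) entered (\(rendered))")
    return { print("\(function) returns") }
}

func bar(_ a: Int) {
    let exit = traced(a); defer { exit() }
    _ = a &* 2  // stand-in for real work
}

func foo(_ a: Int, b: String) -> Double {
    let exit = traced(a, b); defer { exit() }
    bar(a)
    return 3
}

_ = foo(1, b: "2")
```

Timestamps, thread names, and return values would need more plumbing, and this still touches every function body, which is exactly the boilerplate being complained about here.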

okay, so i spent some time trying to figure out what is different between a docker environment and the server environment, so i used a small test program and ran it inside a docker container.

func outer()
{
    inner()
}

func inner()
{
    1 + { Int.max }()
}

outer()

it generates the expected backtrace.

$ ./crash 

💣 Program crashed: Illegal instruction at 0x000055f7680b1a25

Thread 0 "crash" crashed:

0 0x000055f7680b1a25 inner() + 21 in crash
1 0x000055f7680b1a09 outer() + 8 in crash
2 0x000055f7680b19f9 main + 8 in crash

Illegal instruction (core dumped)

then i copied it to a real EC2 instance, and tried running it there. no backtrace.

[ec2-user@ip-172-31-90-47 ~]$ bin/crash 
Illegal instruction (core dumped)

so i tried to brainstorm possible reasons why it works in a docker container but not in a real cloud instance.

  1. the docker container runs Amazon Linux 2, but the real EC2 machines today use AL2023, which is not officially supported by swift.

  2. i install the swift runtime manually on the EC2 machines by sudo-copying the usr/lib directory of the toolchain to /usr.

i had always assumed that was sufficient to run swift programs on a server, but then i took a second look at a modern (5.9) toolchain distribution and then it dawned on me:

$ ls toolchain/
usr
$ ls toolchain/usr
bin  include  lib  libexec  local  share
$ ls toolchain/usr/libexec/
swift
$ ls toolchain/usr/libexec/swift/
linux
$ ls toolchain/usr/libexec/swift/linux/
swift-backtrace

everything makes sense now! (famous last words)

$ sudo cp -r toolchain/usr/libexec/swift /usr/libexec
$ bin/crash 

💣 Program crashed: Illegal instruction at 0x000055556ccf2a25

Thread 0 "crash" crashed:

0 0x000055556ccf2a25 inner() + 21 in crash
1 0x000055556ccf2a09 outer() + 8 in crash
2 0x000055556ccf29f9 main + 8 in crash


then i packaged the test program into a daemon and activated the daemon to see if anything is different. the daemon also gets backtraces now, but for some reason, they look slightly different.

*** Program crashed: Illegal instruction at 0x00005581d997ba25 ***
Thread 0 "crash" crashed:
0      0x00005581d997ba25 inner() + 21 in crash
1 [ra] 0x00005581d997ba09 outer() + 8 in crash
2 [ra] 0x00005581d997b9f9 main + 8 in crash
Registers:
rax 0x8000000000000001  9223372036854775809
rdx 0x0000000000000000  0
rcx 0x0000000000000000  0
rbx 0x00007ffc84f648c8  c2 4e f6 84 fc 7f 00 00 d2 4e f6 84 fc 7f 00 00  ÂNö·ü···ÒNö·ü···
rsi 0x0000000000000001  1
rdi 0x00007ffc84f648b8  a9 4e f6 84 fc 7f 00 00 00 00 00 00 00 00 00 00  ©Nö·ü···········
rbp 0x00007ffc84f64780  90 47 f6 84 fc 7f 00 00 09 ba 97 d9 81 55 00 00  ·Gö·ü····º·Ù·U··
rsp 0x00007ffc84f64780  90 47 f6 84 fc 7f 00 00 09 ba 97 d9 81 55 00 00  ·Gö·ü····º·Ù·U··
 r8 0x0000000000000000  0
 r9 0x0000000000000000  0
r10 0x00007fd43a0101a8  3f 17 00 00 12 03 0b 00 90 b7 3f 00 00 00 00 00  ?·········?·····
r11 0x00007fd43a3fb790  41 56 53 50 49 89 fe 48 89 3c 24 48 8b 5f 08 48  AVSPI·þH·<$H·_·H
r12 0x00007ffc84f648b8  a9 4e f6 84 fc 7f 00 00 00 00 00 00 00 00 00 00  ©Nö·ü···········
r13 0x00005581d997b9f0  55 48 89 e5 e8 07 00 00 00 31 c0 5d c3 0f 1f 00  UH·åè····1À]÷··
r14 0x0000000000000000  0
r15 0x00007fd43a882000  20 32 88 3a d4 7f 00 00 12 00 00 00 00 00 00 00   2·:Ô···········
rip 0x00005581d997ba25  0f 0b 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 48  ··f········UH·åH
rflags 0x0000000000010202
cs 0x0033  fs 0x0000  gs 0x0000
Images (17 omitted):
0x00005581d997b000–0x00005581d997bca8 <no build ID> crash /home/ec2-user/bin/crash

so far, so good.

next i tried triggering the restart endpoint i implemented in the real swiftinit server. no backtrace.

Nov 07 01:58:51 ip-172-31-90-47.ec2.internal launch-server[760366]: UnidocServer/Server.Endpoint.Admin.swift:63: Fatal error: Restarting server...
Nov 07 01:58:51 ip-172-31-90-47.ec2.internal launch-server[760365]: /home/ec2-user/bin/launch-server: line 6: 760366 Illegal instruction     /home/ec2-user/bin/UnidocServer --mongo localhos>

i tried this multiple times (YOLO) in case the process needed to rediscover the /usr/libexec/swift/linux/swift-backtrace binary, but still did not get any backtraces.

so it seems that installing the missing swift-backtrace binary was necessary, but not sufficient, to obtain backtraces.

one thing i noticed is that systemctl does not wait the usual few seconds it takes to collect a backtrace. so i thought this might be related to how the binary is being invoked. but launching the server directly, without any intermediate scripting layers, didn’t produce backtraces either; it also prints

UnidocServer/Server.Endpoint.Admin.swift:63: Fatal error: Restarting server...
Illegal instruction

and exits immediately.

the more things i try, the less i understand about this…

2 Likes

The reason for the different appearance is that the backtracer changes its behaviour (by default) depending on whether or not various things are attached to a terminal. In particular, it will turn off Unicode and ANSI escape sequences and will default to more verbose output when it's executed in a pipe or with its output redirected. All of this is controllable via the SWIFT_BACKTRACE environment variable, should you decide that you don't like the automatic choices.
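For example, assuming the option names from the toolchain's backtracing documentation (the variable takes comma-separated key=value pairs, and the exact option set may vary by Swift version), you could force the compact coloured output even when piping:

```
$ SWIFT_BACKTRACE=enable=yes,color=yes ./crash
```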

As regards it not working with your server, did you link the server dynamically or statically? swift-backtrace is located by finding the directory containing the runtime and looking relative to that, but when statically linked we instead use the directory containing the executable, so we may end up looking in a different set of locations, depending on where you chose to install your server executable.

i never fully understood how static linking works in swift, because there is a lot of contradictory information out there. according to SE-0342, the default SPM compilation mode uses static linking, and i am using the default SPM compilation mode, so one could assume the server binary is statically linked.

but this doesn’t make too much sense, because the binary never worked unless i installed the runtime in /usr/lib, and when i inspect the binary with ldd, it shows a lot of dynamically-linked runtime libraries.

$ ldd bin/UnidocServer 
	linux-vdso.so.1 (0x00007ffd40d99000)
	libswiftCore.so => /usr/lib/swift/linux/libswiftCore.so (0x00007fbe02400000)
	libswift_Concurrency.so => /usr/lib/swift/linux/libswift_Concurrency.so (0x00007fbe04040000)
	libswift_StringProcessing.so => /usr/lib/swift/linux/libswift_StringProcessing.so (0x00007fbe03f7d000)
	libswift_RegexParser.so => /usr/lib/swift/linux/libswift_RegexParser.so (0x00007fbe02ae6000)
	libswiftGlibc.so => /usr/lib/swift/linux/libswiftGlibc.so (0x00007fbe03f6a000)
	libBlocksRuntime.so => /usr/lib/swift/linux/libBlocksRuntime.so (0x00007fbe02000000)
	libdispatch.so => /usr/lib/swift/linux/libdispatch.so (0x00007fbe01c00000)
	libswiftDispatch.so => /usr/lib/swift/linux/libswiftDispatch.so (0x00007fbe03f37000)
	libm.so.6 => /lib64/libm.so.6 (0x00007fbe02325000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fbe03f2e000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00007fbe03f29000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fbe03f22000)
	libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fbe01800000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fbe02acc000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fbe01400000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fbe040ae000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fbe03f1d000)

this suggests to me that the docs are wrong and SPM still uses dynamic linking by default, even on linux.

i have also tried installing the rest[1] of the toolchain files, including the stuff in /usr/bin, /usr/include, /usr/local, and /usr/share. that didn’t change anything either.

$ /home/ec2-user/bin/UnidocServer --mongo localhost --certificates /home/ec2-user/certbot/live/swiftinit.org/ --port 8443
debug: bound to :::8443
UnidocServer/Server.Endpoint.Admin.swift:63: Fatal error: Restarting server...
Illegal instruction


[1] by the way, how do you uninstall them?

[1] by the way, how do you uninstall them?

You can just delete the extra files (annoying to do, for sure). As you found, those aren't the problem here.

It looks from your ldd output like it's dynamically linked, so it should be finding the backtracer. It may be that you've found a bug in the backtracer here… if that crashes, you currently won't see any output from it at all (I'm working on a PR that resolves that so that at least you get the address at which you crashed). If you can give me some instructions to get this set up the same way you have, I can take a look at what's going on.

finally, i have figured out what is causing the server to fail to print backtraces in the production environment.

during the deployment process, there is a post-build action that runs

$ sudo setcap CAP_NET_BIND_SERVICE=+eip /path/to/unidoc-server

this is needed in order for the server binary to successfully bind to port 443, which is a privileged port.

for reasons i do not understand, setcap interferes with backtrace collection; if i don’t perform the setcap step and launch a test server on an unprivileged port, i get backtraces as expected. if i perform the setcap step and do the same thing, i do not get backtraces.
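for anyone trying to reproduce this, the file capability can be inspected and removed with the standard libcap tools (the path here is a placeholder):

```
$ getcap /path/to/unidoc-server          # prints cap_net_bind_service if the capability is set
$ sudo setcap -r /path/to/unidoc-server  # removes all file capabilities
```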

i don’t know if setcap itself is to blame, or if the problem is occurring through some indirect mechanism triggered by setcap. but hopefully this helps.


edit: here is a stackoverflow question that suggests setcap interferes with LD_LIBRARY_PATH.

but it’s not clear to me how to transfer the solutions offered to an SPM project.

4 Likes

This is incorrect. That proposal was accepted, but is not marked as implemented. Ergo, as of Swift 5.9 SwiftPM produces binaries that link core libraries dynamically by default.

Interesting. I’ll take a look and see if I can work out why that would upset things. Thanks for persisting with this; hopefully it’ll be obvious why this breaks things and what to do about it.

Update: the problem seems to be as follows: the backtracer is deliberately designed not to operate from privileged executables, to avoid creating a security hole. On Linux, it checks whether it's running in a privileged process by means of getauxval(AT_SECURE). The most common way to trip this check is of course a setuid binary, but it turns out that executables that have capabilities set on them will also trigger it.
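For the curious, the flag is visible from user code too. A minimal Linux sketch that reads the auxiliary vector directly, rather than assuming getauxval is importable through the Glibc module (AT_SECURE's value, 23, comes from <elf.h>):

```swift
#if os(Linux)
import Glibc

// The kernel sets AT_SECURE for setuid/setgid executables and for
// executables carrying file capabilities; the runtime declines to spawn
// swift-backtrace when the flag is set.
let AT_SECURE: UInt = 23  // from <elf.h>
var secure: UInt = 0
if let auxv = fopen("/proc/self/auxv", "rb") {
    defer { fclose(auxv) }
    // The auxiliary vector is a sequence of (key, value) machine words.
    var entry: (key: UInt, value: UInt) = (0, 0)
    while fread(&entry, MemoryLayout.size(ofValue: entry), 1, auxv) == 1 {
        if entry.key == AT_SECURE { secure = entry.value }
    }
}
print(secure != 0 ? "secure execution: backtracer disabled" : "normal process")
#else
print("normal process")  // the auxiliary vector is Linux-specific
#endif
```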

I have to say that I wouldn't be inclined to use CAP_NET_BIND_SERVICE on the executable like this. I'd favour the more traditional route of using inetd or systemd to start my service, and either have that bind the socket for me, or for systemd have it add CAP_NET_BIND_SERVICE to the ambient capability set. Other alternatives include turning off the low-numbered port restriction using sysctl (typically this is fine on a server — it's only on multi-user boxes that take user logins where this restriction makes some kind of sense), running a reverse proxy in front of your server or using iptables to do port forwarding.
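As a sketch of the systemd route (AmbientCapabilities= is a real systemd directive; the ExecStart path is the one from this thread, used as a placeholder):

```
[Service]
ExecStart=/home/ec2-user/bin/UnidocServer --port 443
AmbientCapabilities=CAP_NET_BIND_SERVICE
```

With this, systemd grants the capability at spawn time, the binary itself carries no file capabilities, and AT_SECURE stays clear. The sysctl alternative, on reasonably recent kernels, is net.ipv4.ip_unprivileged_port_start, e.g. `sudo sysctl net.ipv4.ip_unprivileged_port_start=443`.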

2 Likes

🤯

Could we have the backtracer print a message if it won't perform its work for security reasons?

4 Likes

What is the reasoning here?

It actually already does, but only if you try to explicitly force it on with SWIFT_BACKTRACE=enable=yes.

The backtracer is an external program, which gets exec()d by the crashing process, and it has access to the memory of the crashing process. A privileged process may have sensitive information in its memory footprint, and it may also inadvertently pass on some privileges to child processes.

This is particularly a problem for the interactive mode, but even doing a normal backtrace might conceivably leak security-relevant data (key bits and so on) and make it easy for an attacker to extract data from a privileged process by crashing it at an appropriate point.

1 Like

I see. Does the (presumably external) attacker have access to the crash logs in the discussed case?

Yes, the assumption is that an attacker might somehow gain access to the crash log output and at the same time have a way to trigger crashes. Data can leak not only in any memory dump that happens (the new backtracer can display the contents of memory pointed to by CPU registers, so that's a concern) but even in the backtrace itself if you could convince the program to write interesting data over a return address.

That applies to any process.

In this case (privileged ports), why should some random HTTPS web server get this protection when, e.g., Remote Desktop (port 3283) or VNC (port 5900 et al.) doesn't? You might accidentally dump the connection's symmetric key in a backtrace. Gaining access to screen control is far more concerning than whatever you might find in a web server.

2 Likes

To be clear, the privileged process check was put in place to cope with setuid programs. The fact that it also triggers for programs that have been setcapped is not so obvious, but it's a consequence of the fact that Linux doesn't have issetugid(), and the call it does have, getauxval(AT_SECURE), also triggers if the executable has capabilities set on it.

This actually has very little to do with privileged ports. It happened to cause a problem in this instance because one way to allow a user program to bind privileged ports is to setcap its executable with CAP_NET_BIND_SERVICE, but doing that trips the privileged process behaviour in the Swift runtime. (It also has the effect of disabling LD_LIBRARY_PATH, LD_PRELOAD et al, and changing various other bits of behaviour in the C runtime as well.)

2 Likes