Why did my custom-built toolchain crash in SIL verification?

i suppose i brought this on myself by trying to use a “custom built toolchain” in the first place. but sadly these days we have no other choice on the server, as swift has no support for Amazon Linux 2023 and nobody at Apple seems to be working on it.

when i use the official 5.9.2 toolchain for a different platform (Amazon Linux 2), my project compiles without problems. when i use my custom Amazon Linux 2023 5.9.2 toolchain, i get this strange SIL verification error:

error: compile command failed due to signal 6 (use -v to see invocation)
SIL verification failed: result of struct_extract does not match type of field
  $@moveOnly Phylum.Language
  $Phylum.Language
Verifying instruction:

< lots and lots of SIL dumped >

1.      Swift version 5.9.2 (swift-5.9.2-RELEASE)
2.      Compiling with the current language version
3.      While evaluating request ExecuteSILPipelineRequest(Run pipelines { Mandatory Diagnostic Passes + Enabling Optimization Passes } on SIL for SymbolGraphCompiler)
4.      While running pass #447 SILModuleTransform "MandatoryInlining".
...

note that SymbolGraphCompiler is a module in my project, it is not related to the swift compiler.

dockerfile:

build the toolchain:

$ docker build . -t tayloraswift/swift-toolchain-build:5.9.2 --build-arg SWIFT_BRANCH_CHECKOUT=swift-5.9.2-RELEASE --build-arg SWIFT_VERSION=5.8.1
FROM amazonlinux:2023

RUN yum -y update
# install sysadmin basics
RUN yum -y install sudo passwd

# install swift dependencies
RUN yum install shadow-utils -y
RUN yum -y group install "development tools"
RUN yum -y install \
    cmake            \
    curl-devel       \
    git              \
    glibc-static     \
    libbsd-devel     \
    libedit-devel    \
    libicu-devel     \
    libuuid-devel    \
    libxml2-devel    \
    ncurses-devel    \
    pkgconfig        \
    procps-ng        \
    python           \
    python-devel     \
    python-pkgconfig \
    python-six       \
    python3-devel    \
    python3-psutil   \
    rsync            \
    sqlite-devel     \
    swig             \
    tzdata           \
    unzip            \
    uuid-devel       \
    wget             \
    which            \
    zip

RUN mkdir -p /usr/local/lib/python3.7/site-packages/

ARG SWIFT_BRANCH_CHECKOUT=main
ARG SWIFT_PLATFORM=amazonlinux2
ARG SWIFT_VERSION=5.9.2
ARG SWIFT_BRANCH=swift-${SWIFT_VERSION}-release
ARG SWIFT_TAG=swift-${SWIFT_VERSION}-RELEASE
ARG SWIFT_WEBROOT=https://download.swift.org
ARG SWIFT_PREFIX=/opt/swift/${SWIFT_VERSION}

ENV SWIFT_BRANCH_CHECKOUT=$SWIFT_BRANCH_CHECKOUT \
    SWIFT_PLATFORM=$SWIFT_PLATFORM \
    SWIFT_VERSION=$SWIFT_VERSION \
    SWIFT_BRANCH=$SWIFT_BRANCH \
    SWIFT_TAG=$SWIFT_TAG \
    SWIFT_WEBROOT=$SWIFT_WEBROOT \
    SWIFT_PREFIX=$SWIFT_PREFIX

RUN dnf -y install dirmngr --allowerasing
RUN dnf -y swap gnupg2-minimal gnupg2-full

RUN set -e; \
    ARCH_NAME="$(rpm --eval '%{_arch}')"; \
    url=; \
    case "${ARCH_NAME##*-}" in \
        'x86_64') \
            OS_ARCH_SUFFIX=''; \
            ;; \
        'aarch64') \
            OS_ARCH_SUFFIX='-aarch64'; \
            ;; \
        *) echo >&2 "error: unsupported architecture: '$ARCH_NAME'"; exit 1 ;; \
    esac; \
    SWIFT_WEBDIR="$SWIFT_WEBROOT/$SWIFT_BRANCH/$(echo $SWIFT_PLATFORM | tr -d .)$OS_ARCH_SUFFIX" \
    && SWIFT_BIN_URL="$SWIFT_WEBDIR/$SWIFT_TAG/$SWIFT_TAG-$SWIFT_PLATFORM$OS_ARCH_SUFFIX.tar.gz" \
    && SWIFT_SIG_URL="$SWIFT_BIN_URL.sig" \
    && echo $SWIFT_BIN_URL \
    # - Download the GPG keys, Swift toolchain, and toolchain signature, and verify.
    && export GNUPGHOME="$(mktemp -d)" \
    && curl -fsSL "$SWIFT_BIN_URL" -o swift.tar.gz "$SWIFT_SIG_URL" -o swift.tar.gz.sig \
    && curl -fSsL https://swift.org/keys/all-keys.asc | gpg --import -  \
    && gpg --batch --verify swift.tar.gz.sig swift.tar.gz \
    # - Unpack the toolchain, set libs permissions, and clean up.
    && mkdir -p $SWIFT_PREFIX \
    && tar -xzf swift.tar.gz --directory $SWIFT_PREFIX --strip-components=1 \
    && chmod -R o+r $SWIFT_PREFIX/usr/lib/swift \
    && rm -rf "$GNUPGHOME" swift.tar.gz.sig swift.tar.gz

ENV PATH="${SWIFT_PREFIX}/usr/bin:${PATH}"

RUN yum install -y gcc-c++

RUN sudo yum install -y libmpc-devel
RUN sudo yum install -y texinfo

RUN git clone --depth 1 git://sourceware.org/git/binutils-gdb.git binutils
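# each RUN starts a fresh shell, so a standalone `RUN cd` does not persist;
# chain the steps so gold is actually built inside binutils.build
RUN mkdir binutils.build \
    && cd binutils.build \
    && ../binutils/configure --enable-gold --enable-plugins --disable-werror \
    && make all-gold \
    && mv gold/ld-new /usr/bin/ld.gold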

RUN mkdir /swift-project
WORKDIR /swift-project

RUN git clone --depth 1 --branch $SWIFT_BRANCH_CHECKOUT https://github.com/apple/swift

WORKDIR /swift-project/swift

RUN utils/update-checkout --clone --tag $SWIFT_BRANCH_CHECKOUT

COPY preset.ini build-preset-ext.ini
RUN cat build-preset-ext.ini >> utils/build-presets.ini
RUN utils/build-script \
    --preset buildbot_linux_amazon_linux_2023 \
    install_destdir=/swift-install \
    installable_package=/swift-install/swift-TAYLORS-VERSION.tar.gz

package the toolchain:

FROM amazonlinux:2023

RUN yum -y update
# install sysadmin basics
RUN yum -y install sudo passwd
# install swift dependencies
RUN yum -y install \
    binutils \
    gcc \
    git \
    unzip \
    glibc-static \
    gzip \
    libbsd \
    libcurl-devel \
    libedit \
    libicu \
    libstdc++-static \
    libuuid \
    libxml2-devel \
    tar \
    tzdata \
    zlib-devel

COPY --from=tayloraswift/swift-toolchain-build:5.9.2 /usr/bin/ld.gold /usr/bin/ld.gold
# install swift
COPY --from=tayloraswift/swift-toolchain-build:5.9.2 /swift-install /swift-install

RUN cp -r /swift-install/usr/bin/* /usr/bin
RUN cp -r /swift-install/usr/include/* /usr/include/
RUN cp -r /swift-install/usr/libexec/* /usr/libexec/
RUN cp -r /swift-install/usr/lib/* /usr/lib/
RUN cp -r /swift-install/usr/local/* /usr/local
RUN cp -r /swift-install/usr/share/* /usr/share

# create the `ec2-user`, and switch to her
RUN useradd -ms /bin/bash ec2-user
RUN passwd -d ec2-user
RUN usermod -aG wheel ec2-user
USER ec2-user

WORKDIR /home/ec2-user/

# optional, but python, and iptables are very useful in a container
RUN sudo yum -y install python3 python3-devel iptables nc

# jemalloc
RUN sudo yum -y install bzip2 make
RUN curl https://github.com/jemalloc/jemalloc/releases/download/5.3.0/jemalloc-5.3.0.tar.bz2 \
    -L -o jemalloc-5.3.0.tar.bz2
RUN tar -xf jemalloc-5.3.0.tar.bz2
RUN cd jemalloc-5.3.0 && ./configure && make && sudo make install

# generate script that will run on terminal creation,
# enables showing the PWD prompt
RUN echo "PS1='\w\$ '" >> .bashrc
RUN echo "force_color_prompt=yes" >> .bashrc
ENV TERM=xterm-256color

CMD sleep infinity

preset extension:

[preset: buildbot_linux_amazon_linux_2023]
mixin-preset=
    buildbot_linux
    mixin_buildbot_linux,no_test
skip-early-swift-driver

perhaps running the test suites would illuminate the issue, but as discussed on another thread, those needed to be disabled in order to successfully build the toolchain in the first place.

One possibility is that you built your toolchain with assertions enabled, but Amazon's toolchain was built with them disabled, and the assertion violation happens to be benign enough to not cause any miscompiles. The @moveOnly markers would arise from using borrowing and consuming modifiers on parameters, but they don't affect the layout of the type, so there's a decent chance that with assertions disabled the code skates by. If you want to work around the assertion failure, you might try removing those modifiers from the parameters on the affected function for the time being, or switching them for the __shared/__owned modifiers, which don't impose local no-implicit-copy constraints.
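A minimal sketch of that substitution, for illustration only. The `Language` struct and the `describe`/`rename` functions below are hypothetical stand-ins, not the types from your project, and whether this avoids the verifier failure on your particular toolchain is something you would have to check:

```swift
// Hypothetical copyable type, standing in for the one in the error message.
struct Language
{
    let isCompiled:Bool
    let name:String
}

// before: func describe(_ language:borrowing Language) -> String
// `__shared` still passes the argument borrowed, but the parameter is not
// treated as no-implicit-copy inside the function body.
func describe(_ language:__shared Language) -> String
{
    if  language.isCompiled
    {
        return "\(language.name) (compiled)"
    }
    return language.name
}

// before: func rename(_ language:consuming Language, to name:String) -> Language
// `__owned` still passes the argument owned, again without the
// no-implicit-copy constraint.
func rename(_ language:__owned Language, to name:String) -> Language
{
    Language.init(isCompiled: language.isCompiled, name: name)
}
```

Call sites should not need to change, since only the parameter modifier differs.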


interesting, i assumed the official toolchains were built with assertions enabled, but i will try a new build this afternoon with assertions disabled. (i’m assuming by Amazon’s toolchain you’re referring to the Apple-distributed docker image for Amazon Linux 2, as Amazon does not distribute swift compilers.)


update: i reduced one instance of a compiler crasher to this snippet, which lived inside a module named SymbolGraphLinker:

struct B
{
    let x:Bool
    let y:String
}
struct C
{
    init(b:borrowing B)
    {
        if  b.x
        {
            let _:String = b.y
        }
    }
}

strangely, i could not reproduce the crash anywhere else, including:

  • on Swift Fiddle with its 5.9.2 compiler
  • on Swift Fiddle with any of its nightly toolchains
  • locally, with this compiler, compiling the sample code in isolation
  • locally, with this compiler, compiling the sample as an SPM Snippet
  • locally, with this compiler, compiling the sample as a named SPM module
  • locally, with this compiler, compiling SymbolGraphLinker with the sample in its own file

but i was able to reproduce the crash:

  • locally, with this compiler, compiling SymbolGraphLinker with the sample in its own file in release mode (i.e. with whole-module optimizations). this also took an abnormally long time and an abnormally large amount of memory (>5 GB) for a module that is normally quite speedy to compile.

not your everyday compiler crasher!

incredibly, the crash only occurs when the snippet appears in the same file as an extension to a completely unrelated type. this has nothing to do with the layout of my project or with SymbolGraphLinker: it is possible to reproduce the crash by adding an unrelated class, plus an extension with a convenience initializer, to the same file:

struct B
{
    let x:Bool
    let y:String
}
struct C
{
    init(b:borrowing B)
    {
        if  b.x
        {
            let _:String = b.y
        }
    }
}

class S
{
    init(x:Bool)
    {
    }
}
extension S
{
    convenience
    init()
    {
        self.init(x: true)
    }
}

your theory about assertions being disabled in the official docker image toolchains was correct - i cannot reproduce the crash on Swift Fiddle using the 5.9.2 release toolchain, but i was able to reproduce it using the 5.9 branch toolchain:

Swift version 5.9.2-dev (LLVM 2b42c5ce063a374, Swift 9067148bc9c9a72)
Target: x86_64-unknown-linux-gnu
SIL verification failed: result of struct_extract does not match type of field
  $@moveOnly Bool
  $Bool
Verifying instruction:
     %5 = begin_borrow %0 : $B                    // users: %8, %6
->   %6 = struct_extract %5 : $@moveOnly B, #B.x  // user: %7
     %7 = struct_extract %6 : $Bool, #Bool._value // user: %9
In function:
// C.init(b:)
sil hidden [ossa] @$s4main1CV1bAcA1BVh_tcfC : $@convention(method) (@guaranteed B, @thin C.Type) -> C {
// %0 "b"                                         // users: %4, %5, %3
// %1 "$metatype"
bb0(%0 : @noImplicitCopy @guaranteed $B, %1 : $@thin C.Type):
  %2 = alloc_stack $C, var, name "self", implicit // users: %12, %13
  %3 = copyable_to_moveonlywrapper [guaranteed] %0 : $B
  debug_value [moveable_value_debuginfo] %0 : $B, let, name "b", argno 1 // id: %4
  %5 = begin_borrow %0 : $B                       // users: %8, %6
  %6 = struct_extract %5 : $@moveOnly B, #B.x     // user: %7
  %7 = struct_extract %6 : $Bool, #Bool._value    // user: %9
  end_borrow %5 : $@moveOnly B                    // id: %8
  cond_br %7, bb1, bb2                            // id: %9
bb1:                                              // Preds: bb0
  br bb3                                          // id: %10
bb2:                                              // Preds: bb0
  br bb3                                          // id: %11
bb3:                                              // Preds: bb1 bb2
  %12 = load [trivial] %2 : $*C                   // user: %14
  dealloc_stack %2 : $*C                          // id: %13
  return %12 : $C                                 // id: %14
} // end sil function '$s4main1CV1bAcA1BVh_tcfC'
Stack dump:
0.      Program arguments: /usr/bin/swift-frontend -frontend -interpret - -disable-objc-interop -I swiftfiddle.com/_Packages/.build/release -new-driver-path /usr/bin/swift-driver -empty-abi-descriptor -resource-dir /usr/lib/swift -module-name main -plugin-path /usr/lib/swift/host/plugins -plugin-path /usr/local/lib/swift/host/plugins -l_Packages
1.      Swift version 5.9.2-dev (LLVM 2b42c5ce063a374, Swift 9067148bc9c9a72)
2.      Compiling with the current language version
3.      While evaluating request ExecuteSILPipelineRequest(Run pipelines { Mandatory Diagnostic Passes + Enabling Optimization Passes } on SIL for main)
4.      While running pass #96 SILModuleTransform "MandatoryInlining".
5.      While verifying SIL function "@$s4main1CV1bAcA1BVh_tcfC".
 for 'init(b:)' (at <stdin>:8:5)
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
/usr/bin/swift-frontend(+0x70ab263)[0x55fc7bbdc263]
/usr/bin/swift-frontend(+0x70a8fae)[0x55fc7bbd9fae]
/usr/bin/swift-frontend(+0x70ab5da)[0x55fc7bbdc5da]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f62f608a520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f62f60de9fc]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f62f608a476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f62f60707f3]
/usr/bin/swift-frontend(+0x1f5c9d6)[0x55fc76a8d9d6]
/usr/bin/swift-frontend(+0x1f7b813)[0x55fc76aac813]
/usr/bin/swift-frontend(+0x1f611d0)[0x55fc76a921d0]
/usr/bin/swift-frontend(+0x1f5f8b9)[0x55fc76a908b9]
/usr/bin/swift-frontend(+0x1f58363)[0x55fc76a89363]
/usr/bin/swift-frontend(+0x1f5ba93)[0x55fc76a8ca93]
/usr/bin/swift-frontend(+0x1d1ec90)[0x55fc7684fc90]
/usr/bin/swift-frontend(+0x1e5e842)[0x55fc7698f842]
/usr/bin/swift-frontend(+0x1e624eb)[0x55fc769934eb]
/usr/bin/swift-frontend(+0x1ab5d2b)[0x55fc765e6d2b]
/usr/bin/swift-frontend(+0x1ab5042)[0x55fc765e6042]
/usr/bin/swift-frontend(+0x15fb2a2)[0x55fc7612c2a2]
/usr/bin/swift-frontend(+0x15fd88a)[0x55fc7612e88a]
/usr/bin/swift-frontend(+0x15f7d08)[0x55fc76128d08]
/usr/bin/swift-frontend(+0x15f7cbd)[0x55fc76128cbd]
/usr/bin/swift-frontend(+0x161926a)[0x55fc7614a26a]
/usr/bin/swift-frontend(+0x1605843)[0x55fc76136843]
/usr/bin/swift-frontend(+0x15f7f05)[0x55fc76128f05]
/usr/bin/swift-frontend(+0x16074f1)[0x55fc761384f1]
/usr/bin/swift-frontend(+0x10687b7)[0x55fc75b997b7]
/usr/bin/swift-frontend(+0xde0114)[0x55fc75911114]
/usr/bin/swift-frontend(+0xddf3ba)[0x55fc759103ba]
/usr/bin/swift-frontend(+0xdf4fc5)[0x55fc75925fc5]
/usr/bin/swift-frontend(+0xde2479)[0x55fc75913479]
/usr/bin/swift-frontend(+0xde11a5)[0x55fc759121a5]
/usr/bin/swift-frontend(+0xc2a30b)[0x55fc7575b30b]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f62f6071d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f62f6071e40]
/usr/bin/swift-frontend(+0xc29a65)[0x55fc7575aa65]
*** Signal 11: Backtracing from 0x7f62f6070898...
 done ***
*** Program crashed: Bad pointer dereference at 0x0000000000000000 ***
Thread 0 "swift-frontend" crashed:
0  0x00007f62f6070898 <unknown> in libc.so.6
Registers:
rax 0x0000000000000000  0
rdx 0x00007f62f5d79840  40 98 d7 f5 62 7f 00 00 60 a2 d7 f5 62 7f 00 00  @·×õb···`¢×õb···
rcx 0x00007f62f60de9fc  41 89 c5 41 f7 dd 3d 00 f0 ff ff b8 00 00 00 00  A·ÅA÷Ý=·ðÿÿ¸····
rbx 0x0000000000000006  6
rsi 0x0000000000000001  1
rdi 0x0000000000000001  1
rbp 0x00007f62f6263e90  01 00 00 00 01 00 00 00 40 98 d7 f5 62 7f 00 00  ········@·×õb···
rsp 0x00007fffa0b01c40  20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ···············
 r8 0x0000000000000000  0
 r9 0x0000000000000000  0
r10 0x0000000000000008  8
r11 0x0000000000000246  582
r12 0x0000000000000018  24
r13 0x00007fffa0b01f50  ef 63 5b 7d fc 55 00 00 e0 20 b0 a0 ff 7f 00 00  ïc[}üU··à ° ÿ···
r14 0x00007fffa0b026d8  00 8a 31 81 fc 55 00 00 80 b2 44 81 fc 55 00 00  ··1·üU···²D·üU··
r15 0x000055fc8146e5f0  24 73 34 6d 61 69 6e 31 43 56 31 62 41 63 41 31  $s4main1CV1bAcA1
rip 0x00007f62f6070898  f4 83 3d 00 36 1f 00 05 75 14 c7 05 f4 35 1f 00  ô·=·6···u·Ç·ô5··
rflags 0x0000000000010246  ZF PF
cs 0x0033  fs 0x0000  gs 0x0000
Images (25 omitted):
0x00007f62f6048000–0x00007f62f6204341 c289da5071a3399de893d2af81d6a30c62646e1e libc.so.6 /usr/lib/x86_64-linux-gnu/libc.so.6
Backtrace took 0.15s

the crash does not occur on the 5.10 nightlies, or on main. i would otherwise be considering switching to a 5.10 snapshot, but that toolchain has a broken symbol graph generator, so i am stuck with 5.9.


aside: as i was trying to get this project to compile again, i found there were dozens of locations that were tripping this assertion in debug mode, and countless more when building the package with whole-module optimizations enabled. using a formatter to replace all occurrences of borrowing/consuming with __shared/__owned/__consuming feels drastic and near-termish, and there are also a lot of ~Copyable types in the project that really do require borrowing and not just __shared. to be honest, i am really regretting that i ever started using the new ownership specifiers on copyable types in the first place.
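to illustrate the ~Copyable side of this, here is a minimal sketch. `Lease`, `peek` and `finish` are hypothetical names invented for this example and are not from the project. a parameter of noncopyable type has to spell out its ownership, so these are the modifiers i would want to keep:

```swift
// Hypothetical noncopyable type, for illustration only.
struct Lease:~Copyable
{
    let id:Int
}

// A noncopyable parameter must declare its ownership explicitly.
// `borrowing` lets the callee read the value without taking it over.
func peek(_ lease:borrowing Lease) -> Int
{
    lease.id
}

// `consuming` transfers ownership to the callee; the caller can no
// longer use the value after this call.
func finish(_ lease:consuming Lease) -> Int
{
    lease.id
}
```

a caller can `peek` a lease any number of times, but can only `finish` it once.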
