Compiler performance overhead of assertions and SIL verifier

typesanitizer · December 17, 2020, 6:09pm

After @dan-zheng's post on informative assertions, I made some measurements on the overhead of assertions and shared them in a comment. Turns out, it's not really a good idea to make measurements without understanding what you are measuring.

So I've fixed a bug or two and started making more systematic measurements.

Configuration

2017 10-core iMac Pro. Swift and LLVM built with Clang in Xcode 12.
Almost top-of-tree swiftc & LLVM (swift/main)
- swiftc @ 5de95a596f6c81e7084a5e854985fdaca9e2829b (current HEAD of apple/swift Pull Request #35115 by varungandhi-apple, "[build-script][CMake] Define and use SWIFT_ENABLE_ASSERTIONS pervasively.")
- llvm with D93433 ("[IR] Use LLVM_ENABLE_ABI_BREAKING_CHECKS to guard ABI changes.") cherry-picked on top of 051158df6ec66bd0796b65c1dc3ec364c7334737
Package versions are from mid-November ~ early December.
- Alamofire @ 9e0328127dfb801cefe8ac53a13c0c90a7770448
- SwiftNIO @ 076fda15213a9cc1da26b1e3467f1daba2407391
- swift-syntax @ e28671a7650bd54cc381dc21d75e214685f2ac48

There are three main compiler build configurations:

SwiftNoAssert_LLVMNoAssert - Compiled with build-script --no-assertions.
SwiftAssert_LLVMNoAssert - Compiled with build-script --no-assertions --swift-assertions.
SwiftAssert_LLVMAssert - Compiled with build-script --no-assertions --swift-assertions --llvm-assertions.

(--swift-assertions only turns on assertions for the compiler, not the stdlib.)

Turning off Swift's assertions also turns off the SIL verifier. Because of this, the first configuration only exists as a _NoSILVerify variant, whereas the second and third configurations each have two variants: one with the SIL verifier on, and an additional _NoSILVerify variant, which is exercised by passing -Xfrontend -sil-verify-none. So there are 5 variants in total. For each variant, we compile SwiftPM packages in debug mode and release mode (swift build -c debug or swift build -c release).
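To make the variant naming concrete, here is a small illustrative sketch in Python. It is not part of the measurement setup, and the helper name `split_variant` is made up for this example. It splits a variant name into the compiler build it uses and the extra `swift build` flags, following the rule described above.

```python
# Illustrative sketch: how the five variant names map to a compiler build
# plus extra `swift build` flags. `split_variant` is a hypothetical helper.
SUFFIX = "_NoSILVerify"

VARIANTS = [
    "SwiftNoAssert_LLVMNoAssert_NoSILVerify",
    "SwiftAssert_LLVMNoAssert_NoSILVerify",
    "SwiftAssert_LLVMNoAssert",
    "SwiftAssert_LLVMAssert_NoSILVerify",
    "SwiftAssert_LLVMAssert",
]


def split_variant(variant):
    """Return (compiler_build, extra_flags) for a variant name."""
    if variant.endswith(SUFFIX):
        build = variant[: -len(SUFFIX)]
        # -Xswiftc forwards each following argument from SwiftPM to swiftc.
        flags = ["-Xswiftc", "-Xfrontend", "-Xswiftc", "-sil-verify-none"]
        return build, flags
    return variant, []


for v in VARIANTS:
    build, flags = split_variant(v)
    print(f"{v}: build={build} flags={' '.join(flags) or '(none)'}")
```

The five variants share three compiler builds; the suffix only changes the flags passed at build time.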

Everything else (apart from swiftc/LLVM) has assertions turned off due to --no-assertions. I suspect this might be the cause of the differences between the numbers here and in my comment on Dan's post.

Stats

We consider the SwiftNoAssert_LLVMNoAssert_NoSILVerify variant as the baseline, since it's the fastest. For the rest, I've reported times as +X% relative to the baseline, meaning that it takes time = (1 + X/100) * baseline_time.
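As a quick worked example of the +X% convention, the sketch below converts between an overhead percentage and an absolute time. The function names and the 100-second baseline are hypothetical, chosen only to make the arithmetic easy to follow; they are not measured values.

```python
def time_from_overhead(baseline_time, overhead_percent):
    """time = (1 + X/100) * baseline_time, as defined above."""
    return (1 + overhead_percent / 100) * baseline_time


def overhead_from_times(baseline_time, variant_time):
    """Inverse direction: recover X (in percent) from two timings."""
    return (variant_time / baseline_time - 1) * 100


baseline = 100.0  # hypothetical baseline build time, in seconds
print(time_from_overhead(baseline, 25))      # time for a +25% variant
print(overhead_from_times(baseline, 125.0))  # overhead of a 125 s build
```

With these made-up numbers, a +25% entry corresponds to a 125-second build against a 100-second baseline.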

Package/Config	SwiftAssert_LLVMNoAssert_NoSILVerify	SwiftAssert_LLVMAssert_NoSILVerify	SwiftAssert_LLVMNoAssert	SwiftAssert_LLVMAssert
Alamofire/debug	+10%	+15%	+10%	+15%
SwiftNIO/debug	+12%	+16%	+18%	+23%
SwiftSyntax/debug	+14%	+18%	+21%	+25%
Alamofire/release	+13%	+22%	+25%	+32%
SwiftNIO/release	+16%	+23%	+31%	+38%
SwiftSyntax/release	+13%	+16%	+26%	+29%

To summarize the numbers:

Based on the values in column 1: the overhead of enabling assertions for Swift-only is about +10%~15% to compile times.
Based on the differences column 3 - column 1 and column 4 - column 2: the overhead of enabling the SIL verifier varies considerably: it is negligible for Alamofire/debug, around +6%~7% for the other debug builds, and as high as +15% for SwiftNIO/release. I don't know enough about the verifier to judge whether this much variation is expected.
Based on the differences column 2 - column 1 and column 4 - column 3: the overhead of enabling assertions in LLVM-only is roughly +4%~7% to compile times in most cases (the differences in the table range from +3 to +9 percentage points).
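The column differences used in this summary can be recomputed from the table with a short sketch like the one below. The numbers are copied from the table; the function name is made up. Note that the differences are percentage points of the baseline time, and since the table values are rounded, each difference may be off by about a point.

```python
# Overheads (in % over baseline) copied from the table above.
# Column order: SwiftAssert_LLVMNoAssert_NoSILVerify,
#               SwiftAssert_LLVMAssert_NoSILVerify,
#               SwiftAssert_LLVMNoAssert,
#               SwiftAssert_LLVMAssert
TABLE = {
    "Alamofire/debug": (10, 15, 10, 15),
    "SwiftNIO/debug": (12, 16, 18, 23),
    "SwiftSyntax/debug": (14, 18, 21, 25),
    "Alamofire/release": (13, 22, 25, 32),
    "SwiftNIO/release": (16, 23, 31, 38),
    "SwiftSyntax/release": (13, 16, 26, 29),
}


def column_differences(table):
    """Per row, compute the two difference pairs used in the summary."""
    rows = {}
    for name, (c1, c2, c3, c4) in table.items():
        rows[name] = {
            # SIL verifier on vs. off, without and with LLVM assertions.
            "sil_verifier": (c3 - c1, c4 - c2),
            # LLVM assertions on vs. off, without and with the SIL verifier.
            "llvm_assertions": (c2 - c1, c4 - c3),
        }
    return rows


for name, diffs in column_differences(TABLE).items():
    print(name, diffs)
```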

Details

More detailed stats are available here: Swiftc assertions overhead · GitHub

I made the measurements using a couple of small shell scripts:

#!/usr/bin/env bash
# run.sh
for package in "Alamofire" "swift-nio" "swift-syntax"; do
  for buildType in "debug" "release"; do
    hyperfine \
       --prepare "rm -rf $package/.build" \
       --warmup 1 \
       --runs 15 \
       --parameter-list compilerconfig 'SwiftNoAssert_LLVMNoAssert_NoSILVerify,SwiftAssert_LLVMNoAssert_NoSILVerify,SwiftAssert_LLVMNoAssert,SwiftAssert_LLVMAssert_NoSILVerify,SwiftAssert_LLVMAssert' \
       --style full \
       "./compile_package.sh $package $buildType {compilerconfig}"
  done
done

#!/usr/bin/env bash
# compile_package.sh <package> <debug|release> <compilerconfig>
# A _NoSILVerify suffix on <compilerconfig> selects the same toolchain with the SIL verifier disabled.
args=(-c "$2")
[[ $3 == *_NoSILVerify ]] && args+=(-Xswiftc -Xfrontend -Xswiftc -sil-verify-none)
toolchain="/Users/varun/foss-swift-alt/build/${3%_NoSILVerify}/toolchain-macosx-x86_64/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain"
xcrun "$toolchain/usr/bin/swift" build --package-path "$1" "${args[@]}"

tbkka · December 17, 2020, 6:19pm

Nice work!

I wonder if this is "good enough" for us to consider enabling these assertions by default? That would help accelerate our work on compiler stability and correctness by getting more accurate feedback about internal compiler problems. We could then iterate to drive down the overhead.

Michael_Gottesman · December 17, 2020, 6:31pm

What is the overhead on the whole source compat suite? Why did you choose those specific projects?

I think it would be good to get significantly more data than just 3-4 projects.

typesanitizer · December 17, 2020, 6:56pm

These are intended as preliminary measurements; I'm certainly planning on making follow-up measurements with the source compat suite.

The main constraints are machine time and ease of setting up. For machine time, I can make more measurements over the weekend (or I need to get hold of a quiet but powerful machine), but I figured it would be better to share some numbers sooner rather than later.

For ease of setting up: these are all SwiftPM packages, which makes building with a custom compiler easy. Xcode projects are slightly more finicky, and Xcode projects which use SwiftPM packages even more so, because they don't support easily swapping out the compiler via SWIFT_EXEC.

dan-zheng · December 18, 2020, 11:22am

Thanks a bunch @typesanitizer for looking into this! The quantitative experiment results are exciting.

Just want to share a personally exciting use case for enabling assertions: autodiff crash understandability. A few months ago on the tensorflow branch, macOS toolchains stopped building with assertions enabled for some reason (even though we didn't touch the build preset, which had --assertions).

The lack of assertions made it very difficult for users of differentiable programming in Swift to understand related compiler crashes. The debug workflow was:

Locally build a swift-frontend binary with assertions enabled
Copy it to the release toolchain
Re-run reproducer to see the assertion failure and assess the issue

Unfortunately, only compiler developers can really do this. Users cannot, and it's hard to even write a debugging guide because -Xllvm is disabled when assertions are disabled (swiftc -Xllvm -debug-only=differentiation cannot be used).

So, having assertions enabled would be great!

cc @porterchild @Edward_Connell

porterchild · December 18, 2020, 10:39pm

I agree with Dan: getting more feedback from the compiler would certainly speed up the process of working around and/or reporting autodiff immaturities.
A very concrete example of the inefficiency this lack of feedback creates: many times during my own autodiff usage, I've hit a compiler crash and, with little to go on to fix it, I randomly perturb my program until it works. I would love to reduce the issue to a small reproducer to report, but that might take hours when the surrounding code is large. More compiler guidance would reduce the effort required to make a small reproducer, which means I'm more inclined to try.

Michael_Gottesman · December 18, 2020, 10:48pm

@porterchild, the way we have handled this in the past is to provide snapshot/asserts builds from swift.org. I would suggest using one of those.

porterchild · December 18, 2020, 11:03pm

Sorry, what do you mean by snapshots/asserts builds?

Michael_Gottesman · December 18, 2020, 11:26pm

Solving this issue historically: swift.org asserts snapshots

The way that we have traditionally dealt with this problem is that we provide nightly/release toolchains with assertions from swift.org. These are easy to install (we provide an installer on macOS and tarballs on other platforms) and use (in Xcode there is even UI for selecting it!). If someone reports a bug to us, in the past people have been more than happy to rerun their project with a snapshot, which then diagnoses the bug.

The asserts toolchain approach has the benefit that we save a significant amount of compile time for our users but do not lose the ability to have users easily reproduce for us (or for us to run on their project). The only things that the asserts toolchain approach does not cover are situations where:

1. The compiler did not miscompile but we hit a bogus assert in the compiler.
2. The compiler did not miscompile but we hit an assert that shows a broken invariant, but maybe an invariant that does not matter in this case (since if it mattered we would not have miscompiled).
3. The compiler did miscompile, but the user did not notice it/report it, and it would have triggered an assertion in the compiler.

The question we are balancing here: is increasing compile times by 10-25% (a big compile time ask from our users) worth eliminating these 3 cases (if I missed any, feel free to correct me!)?

In terms of the 3 that I listed, I find 1 and 2 to be somewhat questionable, in the sense that I don't know if we should impose the compile time cost on our users for a situation that doesn't miscompile. I find 3 to be more compelling, but I also wonder whether a miscompile would really go unnoticed in the user's tests, etc., so I don't know whether that can happen.

Thoughts

1. For me to feel really comfortable with asserts running all the time in a naive way, I think we need a way to measure/contain the compile-time cost of these assertions in swift-ci. Gathering compile time data on all of the source compat test suite with/without assertions and showing that all projects are within a time budget would make me comfortable (I am not specifying what that budget would be). Otherwise, I have a feeling that we are going to slowly add more and more of these and no one will pay attention. This could just be a swift-ci job that people can run on their PR to get this data. I imagine we will want to look at this data at release qualification time (or even have a bot that checks once a week) to make sure we are within budget.
2. I wonder if we could be more aggressive and get away with doing more extensive verification by being smarter about our assertions. Consider a situation where we added a form of assert that is always compiled in, but is behind a frontend flag and a driver flag called -verify-all that enable the asserts in a single frontend job or in all frontend jobs, respectively. Then:

a. Instead of running it all the time, the user can always just pass -verify-all to the driver and get all of the asserts. This eliminates the need for snapshots, but also saves our users compile time!
b. We could also have the driver, even when the -verify-all flag is not set, rerun a crashing frontend job with -verify-all to create a reproducer.
c. Also in that model, we could set up a mode where we probabilistically run with more asserts enabled. That would have negligible impact on compile time and I imagine would find a bunch of bugs!
d. Additionally, as mentioned, this would let us be far more aggressive about the verifications that we are enabling. I would feel comfortable in this scenario running /all/ verifiers. It would be pretty sweet to get that in a reproducer.
3. I imagine that enabling assertions in SourceKit is a separate discussion from the compiler. @akyrtzi may have more info.
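As a rough illustration of the time-budget check in the first thought above, something along these lines could run over per-project overhead numbers. This is only a sketch: the function name is made up, the 30% budget is a placeholder (no specific budget is being proposed here), and the input is the SwiftAssert_LLVMAssert column from the table in the first post rather than source compat suite data.

```python
def over_budget(overheads, budget_percent):
    """Return the projects whose overhead exceeds the budget, sorted by name."""
    return sorted(name for name, pct in overheads.items() if pct > budget_percent)


# Overhead in % over baseline, from the SwiftAssert_LLVMAssert column
# of the table in the first post of this thread.
measured = {
    "Alamofire/debug": 15,
    "SwiftNIO/debug": 23,
    "SwiftSyntax/debug": 25,
    "Alamofire/release": 32,
    "SwiftNIO/release": 38,
    "SwiftSyntax/release": 29,
}

BUDGET_PERCENT = 30  # placeholder value, not a proposed budget

offenders = over_budget(measured, BUDGET_PERCENT)
print(offenders or "all projects within budget")
```

With the placeholder budget, this flags Alamofire/release and SwiftNIO/release; a CI job or weekly bot could fail or report whenever the list is non-empty.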

Michael_Gottesman · December 18, 2020, 11:30pm

I am assuming you are talking about swift.org builds (where I believe we have autodiff now)? Or builds built another way? The traditional way that we dealt with this problem on swift.org is by using asserts toolchains that are built nightly (and I believe we also provide them for releases as well). So you would be able to just download the relevant toolchain and work with it.

If this is from internal TF infrastructure, I can't speak to that.

porterchild · December 18, 2020, 11:45pm

I've typically used a toolchain from the S4TF GitHub, because that's been on the leading edge of autodiff functionality. Though it seems like the efforts to upstream autodiff features mean that swift.org should be nearly equal in terms of autodiff now.
Also, the proprietary nature of code I'm working on means it's hard to freely share projects that contain reproducers.

dan-zheng · December 18, 2020, 11:48pm

Yes - all code for differentiable programming in Swift has been upstreamed to main (done for a few months now), where development continues.

About Swift for TensorFlow: S4TF is switching to build using Swift.org toolchains (from the "Swift.org - Download Swift" page) instead of custom ones from the tensorflow branch. This effort is in progress and is almost done! (See recent tensorflow/swift-apis PRs about "migrating to standard/stock toolchains".)

This means "de-forking Swift for TensorFlow" for good, which should allow Swift for TensorFlow developers and users to leverage all the tooling and infra for the apple/swift:main branch, like the Swift.org toolchains and ci.swift.org artifacts (continuous assert snapshots) for all supported platforms.

Michael_Gottesman · December 18, 2020, 11:57pm

Nice! I think it will simplify things which is good (world is complicated enough anyways = p).

dan-zheng · December 19, 2020, 12:19am

Yep!

The thread is off the rails, and sorry to derail it further (a new thread should be created), but in the spirit of de-forking, I wonder if SwiftWASM changes are still being upstreamed to main? @Max_Desiatov shared this summary by @kateinoigakukun.

Being able to use all these technologies in standard toolchains would be huge.