incremental compilation


(Erik Eckstein) #1

I'd like to give some additional information about my recent commit: https://github.com/apple/swift/commit/aaaf36e83521f153ba4b0720795efe4980d9b124
The idea is to run swiftc's LLVM pipeline only if the LLVM IR (the output of IRGen) has changed. The LLVM pipeline takes about 45% to 60% of the total compilation time.
This is done by storing an MD5 hash of the IR in the generated object files. Specifically, on Mach-O the hash is stored in a special section in the __LLVM segment, which is ignored by the linker.
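
To make the check concrete, here is a minimal sketch in Python. It is not the code from the commit: the object file is modelled as a plain dictionary, and the names compile_unit, run_llvm_pipeline and the "ir_hash" key are hypothetical stand-ins for the real object-file section and the real pipeline.

```python
import hashlib


def ir_hash(ir_text):
    """MD5 digest of the textual IR, as a hex string."""
    return hashlib.md5(ir_text.encode("utf-8")).hexdigest()


def run_llvm_pipeline(ir_text):
    """Hypothetical stand-in for the expensive optimization and codegen step."""
    return ("machine code for: " + ir_text).encode("utf-8")


def compile_unit(ir_text, previous_object=None):
    """Return (object_file, pipeline_was_run).

    The "object file" is a dict; its "ir_hash" entry plays the role of the
    hash section that the real compiler stores in the object file.
    """
    new_hash = ir_hash(ir_text)
    if previous_object is not None and previous_object.get("ir_hash") == new_hash:
        # IR is unchanged: keep the existing object file, skip the pipeline.
        return previous_object, False
    obj = {"code": run_llvm_pipeline(ir_text), "ir_hash": new_hash}
    return obj, True


first, ran_first = compile_unit("define void @f()")
second, ran_second = compile_unit("define void @f()", first)
third, ran_third = compile_unit("define void @g()", first)
print(ran_first, ran_second, ran_third)
```

The first build and the build with changed IR run the pipeline; the rebuild with identical IR reuses the previous object file.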

The intention is to speed up whole-module compilation. We are already running the LLVM part multi-threaded, which gives us about a 30% speed-up with 4 cores. Incremental compilation gives us another 15% to 20%, assuming only a "small" thing changed in one of many files.
Of course, the incremental speed-up depends entirely on what changes are made to the sources. In the worst case a small change might trigger a complete recompilation, e.g. if it's in a function that is inlined often.

Here are some numbers for three different modules: a 64-file module with quite large files, a 500-file module with smaller files, and the Adventure project. The percentages are similar for all three modules.

A nice side effect of incremental compilation is that it works for every type of compilation which produces an object file (not only for multi-threaded WMO). For example, if you make changes in the compiler which do not affect code generation, then a library build (ninja swift-stdlib) is faster by 60%.

One additional note about the hash: it also includes the full compiler version, which contains e.g. the git SHAs (in a development build). So every time the branch head changes, everything is recompiled.
The hash also includes some option settings, like -disable-llvm-optzns.
There is only one situation where you need to think about incremental compilation: if you make local changes in swiftc's LLVM pipeline, you should explicitly clean all compiled libraries, etc., because the stored hash may still match even though the pipeline would now produce different code.
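
The hash inputs described above can be sketched as follows. This is an illustration in Python under stated assumptions, not the compiler's implementation: the function name pipeline_hash, the order of the fields and the length-prefix encoding are all hypothetical, and the version strings are made-up examples.

```python
import hashlib


def pipeline_hash(ir_text, compiler_version, options):
    """MD5 over the compiler version, the relevant options and the IR.

    Each field is length-prefixed so that neighbouring fields cannot run
    together and produce the same byte stream for different inputs.
    """
    md5 = hashlib.md5()
    for field in (compiler_version, *options, ir_text):
        data = field.encode("utf-8")
        md5.update(len(data).to_bytes(8, "little"))
        md5.update(data)
    return md5.hexdigest()


ir = "define void @f()"
base = pipeline_hash(ir, "swift-dev abc123", ("-O",))

# A new branch head changes the version string, so the hash changes too.
new_head = pipeline_hash(ir, "swift-dev def456", ("-O",))

# An option that affects the pipeline is part of the hash as well.
no_opt = pipeline_hash(ir, "swift-dev abc123", ("-O", "-disable-llvm-optzns"))

# A local pipeline change that leaves the version string and the IR
# untouched yields the same hash, which is why a manual clean is needed.
local_change = pipeline_hash(ir, "swift-dev abc123", ("-O",))

print(base != new_head, base != no_opt, base == local_change)
```

In this sketch the last case is the problematic one: nothing that feeds the hash has changed, so the stale object file would be reused.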

Erik