Faster compilation through parallel module build

I saw in the “What’s New in Xcode” WWDC session that y’all sped up compilation by building a library’s module in parallel with its object files, instead of doing a separate module-merging step. That’s basically what happens in whole-module builds today, and SourceKit has to do something similar, so I always knew it was possible, but I didn’t expect it to be faster, especially in an incremental build. I’m a little ashamed of not having tested it while at Apple. (Of course, maybe it was slower and not worth it in the past.) Kudos to y’all inside or outside Apple who made this happen!

(Just for my own edification, I assume this is not done in [edit: non-WMO] optimized builds? Because how would the inlinable SIL get in there?)
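
To make the question concrete, here’s a hypothetical library (nothing from the session itself). As I understand it, the serialized SIL for the `@inlinable` body comes out of the same SIL pipeline that then goes on to produce the object files, which is what makes me wonder how an eagerly emitted module could contain it.

```swift
// Hypothetical library code, just to illustrate the question.
// The body of an @inlinable function is serialized as SIL into the
// .swiftmodule so that clients can inline it across the module
// boundary. An eagerly emitted module would have to contain that SIL
// somehow, before codegen has run.
@inlinable
public func distanceSquared(_ x: Double, _ y: Double) -> Double {
    return x * x + y * y
}

// A non-inlinable body never has to cross the module boundary, so
// leaving it out of an eagerly emitted module is always safe.
public func distance(_ x: Double, _ y: Double) -> Double {
    return distanceSquared(x, y).squareRoot()
}
```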

12 Likes

The second part of “Demystify parallelization in Xcode builds” has more details on eager emission of modules and eager linking.

I don't know if there's an answer to your question in the video.

Thanks. There is not, but that is the better video to cite! It sounds like the existence of non-WMO-but-optimized builds is downplayed, which matches how Xcode’s default project template has been configured for years.

I guess the other option, besides giving up on eager module emission, is to have the eager emission also generate the inlinable code. (And the third option is that inlinable code is broken in this scenario, but that seems unlikely; there are tests for it in the compiler repo!)

The build performance speedup comes from having multiple modules in a dependency chain. Separating module generation from object generation allows downstream modules to start building without waiting for object generation to finish.
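
A minimal sketch of that shape, with made-up module names ModuleA and ModuleB:

```swift
// ModuleA/A.swift -- upstream module in the dependency chain.
public struct Point {
    public var x: Double
    public var y: Double
    public init(x: Double, y: Double) {
        self.x = x
        self.y = y
    }
}

// ModuleB/B.swift -- downstream module. Compiling this only requires
// ModuleA.swiftmodule (the interface), not ModuleA's object files, so
// it can start as soon as the module is emitted, while ModuleA's
// object generation keeps running in parallel.
import ModuleA

public func origin() -> Point {
    return Point(x: 0, y: 0)
}
```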

You are right that when certain optimization modes are enabled, we cannot emit modules separately, because we need absolute consistency between modules and objects. cc: @Erik_Eckstein

1 Like

To add more context, there are now two new variants of the skip-noninlinable-function-bodies build optimization. One is for indexing: it skips all function bodies and is generally combined with allowing modules to be built with errors. The other is for emit-module-separately: it takes advantage of lazy function body parsing and preserves nested type information for LLDB.
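
To make the second variant concrete, here’s a made-up example of the case it has to preserve: the function body itself is non-inlinable and can be skipped, but the type nested inside it still matters to the debugger.

```swift
// Made-up example: a type declared inside a function body. The body is
// non-inlinable and can be skipped when emitting the module separately,
// but the nested type still needs to be recorded so LLDB can describe
// values of that type.
public func makeCounter() -> () -> Int {
    struct State { var count = 0 }  // nested type inside a skipped body
    var state = State()
    return {
        state.count += 1
        return state.count
    }
}
```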

With emit-module-separately we basically get the same result as an installAPI build on a local machine, with a few more function bodies in the swiftmodule. It ends up using more CPU time than merge-module, but it’s more parallelizable. Emit-module-separately is also more reliable than merge-module, as it uses most of the same compilation path as a WMO build and just skips a lot of work.

As Xi mentioned, this optimization is currently disabled in WMO builds, where emit-module-separately doubles some work but unblocks dependents earlier. At the moment, you can give it a try by passing down the "-disable-cmo" flag to the driver.

4 Likes