Globally Optimized Build Parallelism?

Dave_Lee · December 18, 2019, 5:32am

This is a post about swiftc and build system performance characteristics.

Are there any resources on how to manage and optimize the use of available hardware when there's competing parallelism, both in the build system (parallel build targets) and in swiftc (parallel frontend jobs)? In particular, I'm currently interested in developer builds, where -enable-batch-mode is being used.

Two high-level questions I have are:

How can available cores be allocated across both parallel build actions and Swift frontend jobs in order to achieve optimal total build times?
Possibly as a subquestion, how can the build time of an individual module be optimized by identifying the optimal amount of batching for that module and its source files?

I'm just starting to investigate, but here's a list of things that have surprised me:

Xcode appears to always pass -j8 (on my machine), even when building multiple targets in parallel, which suggests it isn't doing any global allocation of CPU cores.
Possibly due to a lack of global coordination of parallelism, the bug report SR-11632, "Higher maximum jobs (-j) lead to longer build time" (apple/swift issue #54043 on GitHub), shows that builds can be faster with less swiftc parallelism (a smaller -j).
Conversely, some targets in my project at work build faster when -j8 is applied where it was not previously used.
With Xcode always passing -j8 on my machine, modules of 8 (or fewer) files are not actually being batched: there are up to 8 frontends, the same jobs that are used in single-file mode.
There's a long, interesting comment in Compilation.cpp that reasons about the batch size calculation, but it's unclear how valid it remains today, or how to recalculate the numbers for your own build.
I've always thought of -j as a way to add or increase parallelism, but with swiftc it's also an upper limit on parallelism, so for modules with a lot of files, using -j8 can actually decrease parallelism (by increasing batch size).
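
To make a couple of the points above concrete, here's a rough Python sketch of how I'm picturing it. It is a toy model, not the driver's actual algorithm: it assumes the driver creates at most one batch per -j slot and spreads files evenly, and it ignores whatever size limits the real calculation in Compilation.cpp applies. The function names `batch_sizes` and `oversubscription` are made up for illustration.

```python
def batch_sizes(num_files: int, jobs: int) -> list[int]:
    """Toy model: split num_files into at most `jobs` near-equal batches.

    Assumption (not taken from swiftc): one batch per -j slot, no batch size cap.
    """
    if num_files <= 0 or jobs <= 0:
        return []
    # With fewer files than job slots, every file ends up in its own batch,
    # which is effectively single-file mode.
    num_batches = min(jobs, num_files)
    base, extra = divmod(num_files, num_batches)
    # The first `extra` batches take one additional file each.
    return [base + 1 if i < extra else base for i in range(num_batches)]


def oversubscription(parallel_targets: int, jobs_per_target: int, cores: int) -> float:
    """Ratio of frontend processes requested to cores available.

    Assumes every target runs `jobs_per_target` frontends at the same time.
    """
    return parallel_targets * jobs_per_target / cores


if __name__ == "__main__":
    print(batch_sizes(8, 8))          # 8 files, -j8: one file per frontend
    print(batch_sizes(100, 8))        # 100 files, -j8: batches of 12 or 13 files
    print(oversubscription(4, 8, 8))  # 4 targets at -j8 on 8 cores
```

Under this model, a module with 8 or fewer files gets no batching at all at -j8, a 100-file module gets batches of 12 to 13 files, and four targets building in parallel at -j8 on an 8-core machine would ask for four times as many frontend processes as there are cores.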

Is this a solved problem? Is this a hard problem? Does anyone have relevant docs/blogs/resources/papers about this?

Thanks!