Enabling Whole Module Optimizations by default for Release builds

Hi,

Many of us have considered it a long-term goal to make whole-module optimizations the default compilation mode for Release builds. I believe that we are ready and that we should enable WMO by default. WMO solves real problems but comes at a cost. Let’s discuss this and make a decision.

Today, each Swift file is compiled and optimized independently. Compiling each file independently keeps our compile times relatively fast, because multiple files can be optimized in parallel on different CPU cores, and only files that have been modified need to be recompiled. Both Debug and Release builds use this technique.

However, the current compilation mode is not great for the performance of the generated code. File boundaries are optimization barriers. Swift users have found that when they develop generic data structures (like Queue<T>), the performance of these data structures is not ideal, because the compiler cannot specialize the generics to concrete types (turn T into Int for all uses of Queue<T>). Our recommendation has been to either copy the generic data structures into the files that use them or enable WMO[1].
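To make the barrier concrete, here is a minimal sketch (the Queue type and its methods are illustrative, not the actual code users reported). When this type lives in a different file from its callers, a per-file optimizer cannot see the method bodies and must go through the unspecialized generic code; under WMO it can emit a concrete Queue<Int> specialization and inline the calls.

```swift
// A minimal generic queue (illustrative sketch).
struct Queue<T> {
    private var storage: [T] = []

    mutating func enqueue(_ element: T) {
        storage.append(element)
    }

    mutating func dequeue() -> T? {
        return storage.isEmpty ? nil : storage.removeFirst()
    }
}

// If Queue<T> is defined in another file, a non-WMO build calls the
// generic versions of enqueue/dequeue through indirection; WMO can
// specialize them for Int.
var q = Queue<Int>()
q.enqueue(1)
q.enqueue(2)
print(q.dequeue() ?? -1) // prints "1"
```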

With WMO, all of the files in the module are optimized together, and the optimizer is free to specialize generics and inline functions across file boundaries. This compilation mode is excellent for performance, but it comes at a cost. Naturally, WMO builds take longer because the compiler can't parallelize the build across multiple cores. Also, every change to a single file makes the compiler re-optimize the whole program, not only the file that was modified.
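For context, a sketch of how WMO can be enabled manually from the command line today (the file paths and output name are illustrative; Xcode exposes the same thing as a build setting):

```shell
# Compile all files in the module together so the optimizer can see
# across file boundaries.
swiftc -O -whole-module-optimization Sources/*.swift -o MyModule
```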

We’ve been working to improve our WMO builds in a number of ways. Erik (@eeckstein) found that we spent most of our compile time in code generation optimizations (such as instruction selection and register allocation) inside LLVM, and was able to split the Swift module into multiple units that LLVM can compile in parallel. He also implemented caching at the LLVM level that improved the performance of incremental builds. Unrelated to the work on WMO, we also improved the speed of the optimizer by tuning our optimization pipeline and the individual optimizations - and in the last three months we improved the overall compile time of optimized builds by ~10%.

Another concern was the increase in code size. In WMO mode we are free to specialize generics and inline code between files. These optimizations can increase the size of the generated program significantly. We have also made significant progress on this front: we were able to reduce the size of the Swift dylib that builds with WMO from 4.6MB in January to 3.7MB today.

We should not enable WMO for Debug builds. We don’t do any major optimizations in Debug builds, so compiling in WMO mode would only slow down compile times without providing any speedups. We strive to make Swift Debug builds as fast as scripts (and intend to allow people to use #!/bin/swift), so an unjustified increase in compile time is unacceptable.

To summarize, WMO is critical to the performance of optimized Swift programs. WMO increases code size and compile times, but we’ve made excellent progress and we’ll continue to invest in these areas. I believe that, considering the inherent tradeoffs, enabling WMO is a good idea.

Please let me know what you think.

Thanks,
Nadav

[1] - Increasing Performance by Reducing Dynamic Dispatch - Swift Blog - Apple Developer

+1
The first thing I do in a new project is always to enable WMO.

My impression/experience is that slow compile times often have to do with
the type checker; I use
-Xfrontend -debug-time-function-bodies
to see which funcs the type checker spends "too much" time on, edit them,
and repeat until compile time goes down. The first time I did this, I had a
project in which the compile time was about 10 seconds; it went down to
less than a second after I had identified and reformulated ten or so
time-consuming parts of my code.
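For reference, a sketch of how the flag above can be passed on the command line (assuming a plain swiftc invocation with an illustrative file name; in Xcode the same flags go into "Other Swift Flags"):

```shell
# Pass the frontend flag through the swiftc driver; the type-checking
# time of each function body is then printed during compilation.
swiftc -O -Xfrontend -debug-time-function-bodies main.swift
```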

However, especially for bigger projects, viewing the time/func output in
Xcode's Report Navigator is quite far from a pleasant experience.
The view is painfully slow to load, and it is not possible to
sort/order/filter the time/func entries, etc.

An improved report of where the compile time is spent (not only time/func
but also time/code-location) would be a very valuable tool until compile
times become less erratic and expressions like this one:

    let dictionaryOfIntOps: [String: (Int, Int) -> Int] = [
        "+": (+),
        "-": (-),
        "*": (*),
        "/": (/),
    ]

are no longer considered "too complex to be solved in reasonable time"
(commenting out one of the four elements makes it compile).
There are also other seemingly simple expressions that are slow but not
quite too slow / too complex.
And since the sum of all of a project's "reasonable times" quite often
ends up being rather unreasonable, we need a reasonable tool to help us
help the compiler/type checker (by quickly pointing us to the most
time-consuming parts of the code so we can reformulate them).
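For what it's worth, a sketch of one workaround I've seen suggested: bind each operator to a constant with an explicit type annotation first, so the type checker resolves each overload independently rather than all at once (the names here are illustrative, and whether this helps will vary by compiler version):

```swift
// Annotate the closure types explicitly so the type checker does not
// have to pick among the operator overloads for the whole literal.
typealias IntOp = (Int, Int) -> Int
let add: IntOp = (+)
let sub: IntOp = (-)
let dictionaryOfIntOps: [String: IntOp] = [
    "+": add,
    "-": sub,
]
print(dictionaryOfIntOps["+"]!(2, 3)) // prints "5"
```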

/Jens

···

On Fri, Mar 11, 2016 at 7:47 PM, Nadav Rotem via swift-dev <swift-dev@swift.org> wrote:


--
bitCycle AB | Smedjegatan 12 | 742 32 Östhammar | Sweden

Phone: +46-73-753 24 62
E-mail: jens@bitcycle.com

Hey Jens,

I feel your pain here - it’s an area that I’ve gradually been trying to improve for quite some time. We consider these kinds of simple inference problems to be serious type checker bugs, and we’re trying to flush them out wherever possible.

We’ve already made some big improvements for Swift 3. With the edge contraction work I pushed back in January, I think I have some of the right abstractions in place to address problems like the one you call out below (also captured as rdar://problem/22810685). For example, the expression you provided now compiles without “going exponential” in Swift 3 (though it still requires a type annotation).

I hope to keep improving the status quo, so that -debug-time-function-bodies will eventually be unnecessary, but in the meantime please keep filing bugs. I’m working through these issues one-by-one.

Thanks!
- Joe

···

On Mar 11, 2016, at 1:38 PM, Jens Persson via swift-dev <swift-dev@swift.org> wrote:


_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev