I basically like the approach laid out here for bootstrapping. I'm not sure how well it will integrate with the Swift driver in the future, or whether splitting compilation units into .sib files is well supported today. For example, sil-opt often fails today when supplied with .sil produced by swift-frontend.
I think calling this "LTO" is a misnomer, as this seems like a feature that's internal to the Swift driver, not driven by the linker. That said, I do think LLVM ThinLTO is a good architecture to emulate.
Reading between the lines, here's how I understand this proposal...
Given modules A and B...
Compile A
$ swift-frontend A1.swift A2.swift -whole-module-optimization -emit-module -emit-module-summary -emit-sib -o A.swiftmodule
Output: A.swiftmodule, A.swiftmodule.summary, A.sib
The three output file types are likely all generated during SIL module serialization, but we have the option of deferring the .summary and .sib output in the future if it's useful to run more passes on those.
A.sib must contain the additional SIL function bodies that are not exported for cross-module optimization. It may also contain a copy of exported function bodies, for example, if they have been further optimized after the module was serialized. A.sib is the same file format as .swiftmodule but does not include any AST-level type-information (A.sib is useless on its own). A.sib can be arbitrarily broken down into Ax.sib, Ay.sib, Az.sib either for parallelism or incremental builds.
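To make the "arbitrarily broken down" point concrete, here's a rough sketch of one way function bodies could be assigned to Ax.sib, Ay.sib, and so on. This is only an illustration: the function names and the `split_into_units` helper are hypothetical, and nothing here reflects how the compiler actually partitions SIL. The one property it tries to show is that a stable hash keeps each function in the same unit from build to build, which is what an incremental build would want.

```python
import zlib

def split_into_units(function_names, unit_count):
    """Assign each function to exactly one unit using a stable hash.

    zlib.crc32 is used instead of the built-in hash() because hash() of a
    string is randomized per process, which would reshuffle units every build.
    """
    units = [[] for _ in range(unit_count)]
    for name in sorted(function_names):
        index = zlib.crc32(name.encode("utf-8")) % unit_count
        units[index].append(name)
    return units

# Hypothetical function names standing in for the bodies stored in A.sib.
functions_in_a = ["A.api", "A.helper", "A.unused", "A.internalSort"]
units = split_into_units(functions_in_a, 3)

# Every function lands in exactly one unit, so the units can be merged back.
assert sorted(fn for unit in units for fn in unit) == sorted(functions_in_a)
```

Adding or removing one function only touches the unit it hashes to, so the other units would not need to be regenerated.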
Compile B
$ swift-frontend B1.swift B2.swift -whole-module-optimization -emit-module -emit-module-summary -emit-sib -o B.swiftmodule
Output: B.swiftmodule, B.swiftmodule.summary, B.sib
$ swift-frontend -merge-module-summary \
A.swiftmodule.summary B.swiftmodule.summary \
-o merged-module.summary
The summary merge step seems unnecessary, but it may save compilation time because each module does not need to "re-merge" the summaries as it imports them.
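As a rough model of what the merge step buys you, here's a sketch that treats each module summary as a call graph plus a set of roots. The dictionary layout is purely an assumption for illustration (the real .swiftmodule.summary format is not described here), and `merge_summaries` is a hypothetical helper, not a compiler API.

```python
def merge_summaries(*summaries):
    """Combine per-module summaries into one table keyed by function name.

    Each summary is assumed to look like:
      {"edges": {function: set of callees}, "roots": set of entry points}
    """
    merged = {"edges": {}, "roots": set()}
    for summary in summaries:
        for function, callees in summary["edges"].items():
            merged["edges"].setdefault(function, set()).update(callees)
        merged["roots"] |= summary["roots"]
    return merged

# Hypothetical contents of A.swiftmodule.summary and B.swiftmodule.summary.
summary_a = {
    "edges": {"A.api": {"A.helper"}, "A.helper": set(), "A.unused": {"A.helper"}},
    "roots": set(),
}
summary_b = {"edges": {"B.main": {"A.api"}}, "roots": {"B.main"}}

merged = merge_summaries(summary_a, summary_b)
print(sorted(merged["edges"]))  # ['A.api', 'A.helper', 'A.unused', 'B.main']
```

With the merge done once up front, every later step reads a single table instead of rebuilding it from each imported module's summary.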
I'm not sure why the proposal calls this "-cross-module-opt".
Test and debug the SIL optimizer
$ sil-opt A.sib -emit-sil \
-module-summary-path merged-module.summary \
--sil-cross-deadfuncelim
- Finds A.swiftmodule and B.swiftmodule in the include path
- I think we currently need to specify the .sib file's parent .swiftmodule on the command line, but that seems silly. A.sib should know that it comes from A.swiftmodule
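For anyone following along, here's my mental model of what a cross-module dead function elimination pass does with the merged summary, written as a small sketch. It is a plain reachability walk over an assumed call graph; the data and the `live_functions` helper are hypothetical, and the real pass may well track more than direct calls (witness tables, vtables, and so on).

```python
from collections import deque

def live_functions(edges, roots):
    """Return every function reachable from the roots via call edges."""
    live = set(roots)
    worklist = deque(roots)
    while worklist:
        function = worklist.popleft()
        for callee in edges.get(function, ()):
            if callee not in live:
                live.add(callee)
                worklist.append(callee)
    return live

# Hypothetical merged call graph covering modules A and B.
edges = {
    "A.api": {"A.helper"},
    "A.helper": set(),
    "A.unused": {"A.helper"},
    "B.main": {"A.api"},
}
roots = {"B.main"}

live = live_functions(edges, roots)
dead_in_a = sorted(f for f in edges if f.startswith("A.") and f not in live)
print(dead_in_a)  # ['A.unused']
```

The point is that A.unused can only be proven dead once B's uses of A are visible, which is why the pass needs the merged summary rather than A's summary alone.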
CodeGen A
$ swift-frontend A.sib -c -o A.o -module-summary-path merged-module.summary
- Finds A.swiftmodule and B.swiftmodule in the include path
There seemed to be some confusion regarding the artifacts produced by the compiler. Here's my take on that...
It's useful to split information into separate files when it has different dependencies, has a different lifetime, or needs to be named individually on a command line.
.swiftmodule: "what a module exports"
- somewhat analogous to a combined header
- produced by a single well-defined SIL serialization point in the optimizer pipeline (prior to dropping any semantics).
- self-contained, AST, SIL-level declarations, and exported function body definitions.
- may depend on information from function bodies that aren't included (at least as-is today). Ideally we would have a way of recording those dependencies on .swift files, and/or avoid introducing them, to support incremental cross-module optimization.
.swiftmodule.summary: "inclusive module summary"
- augments .swiftmodule with summary information that's inclusive over the module's implementation
- could (should?) be embedded within the .swiftmodule, but separating them allows for a single merged summary file
- additional source of dependencies on .swift files. It's possible that updating a .swift file changes the summary but not the .swiftmodule
- could potentially be emitted later in the pipeline to provide more refined summary
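Here's a sketch of why keeping the summary in its own file matters for incremental builds: if the two artifacts are fingerprinted separately, a build system can tell "only the summary changed" apart from "the module interface changed". The step names and the `stale_steps` policy below are my own assumptions for illustration, not anything the driver does today.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Content hash used to detect whether an artifact changed."""
    return hashlib.sha256(data).hexdigest()

def stale_steps(previous, current):
    """Decide which downstream steps to redo, given artifact fingerprints.

    Hypothetical policy: importers depend on the .swiftmodule, while the
    summary merge and summary-driven codegen depend on the .summary file.
    """
    steps = set()
    if previous.get("A.swiftmodule") != current.get("A.swiftmodule"):
        steps.add("recompile importers of A")
    if previous.get("A.swiftmodule.summary") != current.get("A.swiftmodule.summary"):
        steps.add("re-merge summaries")
        steps.add("re-run codegen that reads the merged summary")
    return steps

# Placeholder bytes standing in for real file contents.
before = {
    "A.swiftmodule": fingerprint(b"interface v1"),
    "A.swiftmodule.summary": fingerprint(b"summary v1"),
}
after = {
    "A.swiftmodule": fingerprint(b"interface v1"),    # unchanged
    "A.swiftmodule.summary": fingerprint(b"summary v2"),  # changed
}

steps = stale_steps(before, after)
assert "recompile importers of A" not in steps
assert "re-merge summaries" in steps
```

If the summary were embedded in the .swiftmodule, the same edit would change the module's fingerprint and force importers to rebuild as well.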
.sib/.sil: "SIL-level compilation unit"
- somewhat analogous to .cpp/.bc/.ll
- arbitrary subset of SIL function bodies for codegen within a module that can be merged or split. Never seen by other modules.
- these may be emitted at any time during SIL optimization for testing and debugging.
- .sib should (ideally) be isomorphic and interchangeable with .sil files