[RFC] Conditional Runtime Records Revamped

Hi all,

I'd like to socialize a patch I have that re-implements -conditional-runtime-records in a way I believe brings the feature closer to being safe for general adoption.

My goals with this implementation are for the feature to be effective and safe regardless of the user's toolchain configuration (i.e. no-LTO, full-LTO, or thin-LTO). I also hope that the required LLVM changes help solidify the idea, so that we can upstream them to LLVM, where other languages that emit runtime metadata could perhaps make use of them.

Background

For those unfamiliar, the feature proposes a new type of dead stripping.

Consider the example of a protocol conformance record, an entity only used by the runtime that relates a class Foo and a protocol Bar. We won't find direct usages of the protocol conformance in our program, so it must "artificially" be kept alive in some manner. However, if either class Foo or protocol Bar is stripped from our program, then the conformance record is also useless and should be stripped.

Conditional runtime records proposes a way to express a runtime record's eligibility for dead stripping based on the liveness of a set of other entities. Prior to my changes, this was expressed using LLVM metadata:

!metadata = !{ptr @foo_bar_conformance, i32 1, !{ptr @foo, ptr @bar}}

This provides additional flexibility in that it allows for a more "semantic" liveness graph. That is, foo, bar, and their conformance record need not point at one another just to appease the liveness graph.
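As a rough illustration of the rule described above, here is a toy model in Python (not LLVM code). It assumes a record becomes live only once all of its listed dependencies are live, as in the Foo/Bar example. The names `live_symbols`, `refs`, and `conditions` are made up for this sketch.

```python
def live_symbols(refs, conditions, roots):
    """Toy liveness analysis with conditional records.

    refs:       symbol -> set of symbols it references (ordinary edges)
    conditions: record -> set of dependencies that must ALL be live
                before the record itself is kept (assumption of this sketch)
    roots:      symbols that are unconditionally live (e.g. main)
    """
    live = set()
    worklist = list(roots)
    changed = True
    while changed:
        # Ordinary mark phase: follow reference edges from the worklist.
        while worklist:
            sym = worklist.pop()
            if sym in live:
                continue
            live.add(sym)
            worklist.extend(refs.get(sym, ()))
        # Conditional phase: a record becomes live once every dependency is.
        changed = False
        for record, deps in conditions.items():
            if record not in live and deps <= live:
                worklist.append(record)
                changed = True
    return live


conditions = {"foo_bar_conformance": {"foo", "bar"}}

# Both foo and bar are reachable, so the conformance record is kept.
print(live_symbols({"main": {"foo", "bar"}}, conditions, ["main"]))

# bar is never referenced, so the conformance record can be stripped.
print(live_symbols({"main": {"foo"}}, conditions, ["main"]))
```

Note that neither foo nor bar references the conformance record, and the record's own edges are not needed to keep foo or bar alive; the condition table carries the relationship instead.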

Problems

These are the issues I've come across with the current design, that I'd like to address:

1. The liveness of a particular "version" of a dependency foo may not be representative of the liveness of foo in the entire program

The metadata points to dependencies in LLVM IR. The dependencies themselves are LLVM IR global variables, but come in many different forms. They can be defined, or declared, and have different linkages.

Depending on the nature of the dependency, we need to interpret its liveness differently, often more conservatively.

Take a declaration as an example. An unused declaration may be stripped from a module, but that does not mean the definition is unused in the program. Consider for example a protocol defined in the standard library.

Even if we have the definition in our module, that may not be the "prevailing" definition. A definition with linkonce* linkage may be dead in one module but live in another.

Similarly, available_externally definitions are copied into multiple modules in thin-LTO, but are always dead stripped. That doesn't mean some copy of that entity isn't live in the program, as the linkage name suggests.

This problem gets worse the less "monolithic" your build is. That is, if your entire program is compiled into one binary using mono-LTO with no external dependencies, this isn't an issue, but that isn't a realistic scenario.

2. LLVM metadata is fragile

As described, we need to know some information about a dependency (such as its linkage) in order to safely interpret its liveness. LLVM metadata does not keep the dependencies alive. They can be stripped at any point during the opt pipeline, leaving null values behind in the metadata. A null dependency has to be treated as live, because it could, for example, represent a stripped declaration.

I tried at some point to audit the opt pipeline and make sure that if null values are added to the metadata, it is because the prevailing definition of some dependency was dead stripped. This quickly proved not to scale: there are too many passes, and a new pass can be added at any time.
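The conservative reading described in problems 1 and 2 can be sketched as a small decision function. This is a toy model under stated assumptions, not the actual implementation: the `Dependency` type and its fields (`is_prevailing`, `used_locally`, and so on) are hypothetical, and `None` stands in for a null metadata operand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dependency:
    """Hypothetical view of one dependency as seen from a single module."""
    name: str
    linkage: str                 # e.g. "internal", "linkonce_odr", "available_externally"
    is_declaration: bool = False
    is_prevailing: bool = True   # is this copy the one the final link will pick?
    used_locally: bool = False


def provably_dead(dep: Optional[Dependency]) -> bool:
    """Return True only when local deadness can be trusted program-wide."""
    if dep is None:
        # Null operand: the dependency was stripped and we no longer know
        # what it was, so it has to be treated as live.
        return False
    if dep.used_locally:
        return False
    if dep.is_declaration:
        # The definition lives elsewhere (e.g. the standard library).
        return False
    if dep.linkage == "available_externally":
        # This copy is always discarded; that says nothing about the real one.
        return False
    if not dep.is_prevailing:
        # e.g. a linkonce* copy that may be live in another module.
        return False
    # Simplification: a real analysis must also consider whether code
    # outside this unit can still reference the definition.
    return True


def record_is_strippable(deps) -> bool:
    """A record that needs all of its dependencies can go once any one is provably dead."""
    return any(provably_dead(d) for d in deps)
```

For example, `record_is_strippable([None, Dependency("bar", "external", is_declaration=True)])` is `False`: neither operand gives enough information, so the record is kept.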

3. Conservative correctness comes at an efficiency cost

The less monolithic our build, the less likely the prevailing definition of a dependency resides within the unit of analysis, and the more likely we have to be overly conservative. By unit of analysis, I mean the piece of code we can see and analyze.

This means, for example, that thin-LTO is far less effective than mono-LTO in the current design.

New implementation

In re-implementing the system, I have made the following changes:

  • Emit the stripping conditions as real LLVM globals in a special section, similar to llvm.used, instead of as metadata. These globals are also stored in llvm.compiler.used.
  • Serialize the stripping conditions into module summaries, and consider them in thin-link's liveness analysis to support thin-LTO
  • Teach GlobalDCE to interpret the stripping conditions if run after full-LTO merge
  • Teach LLD to interpret the data and dead strip accordingly
  • Split the Swift frontend behavior into two flags
    • The existing flag, and another safe version, which avoids conditionally dead stripping some globals I believe may be required for Objective-C interop

By emitting real LLVM globals, we don't have the metadata invalidation problems. Furthermore, the data can be optionally interpreted by a linker to dead strip with visibility of the entire binary image.

Except for blessed IR transformations that know how to interpret them, these globals will be preserved until the linker. At that point, they will be automatically dead stripped by any linker. This means linker support is not required, but can help increase efficiency.

As a demonstration, I have implemented the stripping in LLD for MachO, simply because it is in tree and ld64 doesn't seem to accept patches, but I'm happy to extend it to ELF/COFF, or another linker, if it helps with upstreaming to LLVM.

By adding thin-link and linker support, the optimization can be safe and effective in no-LTO or thin-LTO settings, which was not the case before.
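To make the benefit of whole-image visibility concrete, here is a toy comparison in Python. It is only a sketch and not how thin-link or LLD work internally; `strippable_records` and its `unknown` parameter are hypothetical names, and it assumes a record can be stripped once any of its dependencies is provably dead.

```python
def strippable_records(conditions, live, unknown=frozenset()):
    """Return the records that may be stripped.

    conditions: record -> set of dependency names
    live:       names known to be live in the analyzed unit
    unknown:    names whose program-wide liveness is not visible from this
                unit (e.g. declarations defined elsewhere); treated as live
    """
    strippable = set()
    for record, deps in conditions.items():
        # Strip only if at least one dependency is provably dead.
        if any(d not in live and d not in unknown for d in deps):
            strippable.add(record)
    return strippable


conditions = {"foo_bar_conformance": {"foo", "bar"}}

# Per-module view: bar is only a declaration here, so its liveness is unknown
# and the record has to be kept.
print(strippable_records(conditions, live={"foo"}, unknown={"bar"}))

# Whole-image view: nothing in the binary uses bar, so the record can go.
print(strippable_records(conditions, live={"foo"}))
```

The per-module call returns an empty set, while the whole-image call returns the conformance record, which mirrors the gap between a narrow unit of analysis and a linker that sees the entire binary.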

Conclusion

I think the finer details of the implementation might be better discussed in the following PRs, but I wanted to discuss the higher level approach here, in case anyone has any feedback or thoughts.

cc: @kubamracek @Joe_Groff @elsh
