Precompiling Swift-compatible .pcm files for C/Objective-C dependencies

I'm trying to improve distributed build performance of Swift with Bazel, where we often have large dependency graphs with many Swift and Objective-C modules. Since the implicit module cache can't be shared across remote machines, one improvement I'd like to make is to emit precompiled modules for the (Obj-)C dependencies with -Xclang -fmodules-embed-all-files and propagate those up the graph instead of using the text module maps, so that ClangImporter doesn't have to keep reparsing the transitive closure of the headers imported by a Swift module. (And also so our build farm doesn't have to copy that large number of headers from source control, which is currently also a bottleneck.)

Since .pcm files are not stable between compiler versions, and since swiftc and clang in the same toolchain aren't even guaranteed to be built from the same version, I can't use our existing Obj-C compilation to generate these modules.

Instead, I've hacked together a totally-not-ready-for-review-yet change that adds a driver invocation (swift -generate-pcm) that uses ClangImporter to emit .pcms that should be compatible, since the same compiler and invocation that emits them will consume them. Then, when I compile the Swift code that depends on an Objective-C module, I precompile it and I feed it in with -Xcc -fmodule-file=path/to/module.pcm.
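For readers following along, the two-step flow described above can be sketched as a dry run. This only prints the commands and does not run a compiler. -generate-pcm is the experimental flag from this post, not a released option; the -o spelling, the file names, and the pcm_steps helper are assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the two steps described in the post.
# Assumptions: "-generate-pcm ... -o" is a hypothetical spelling of the
# experimental flag; pcm_steps and the file names are made up for this example.
pcm_steps() {
  module_map="$1"   # text module map of the (Obj-)C dependency
  pcm="$2"          # precompiled module to emit
  swift_src="$3"    # Swift source that imports the module

  # Step 1: precompile the (Obj-)C module through the Swift toolchain's Clang.
  printf 'swift -generate-pcm %s -o %s\n' "$module_map" "$pcm"

  # Step 2: compile the Swift code, handing the PCM to ClangImporter.
  printf 'swiftc -Xcc -fmodule-file=%s -c %s\n' "$pcm" "$swift_src"
}

pcm_steps module.modulemap foo.pcm main.swift
```

The point of routing step 1 through the Swift toolchain is that the same Clang that writes the .pcm is the one that later reads it.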

For very trivial cases, that seems to be working. But when I try to precompile something that does #import <UIKit/UIKit.h>, I'm seeing errors like this:

While building module 'REDACTED':
While building module 'UIKit' imported from REDACTED:15:
While building module 'CoreImage' imported from /Applications/Xcode-beta.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator13.0.sdk/System/Library/Frameworks/UIKit.framework/Headers/UIColor.h:13:
While building module 'CoreVideo' imported from /Applications/Xcode-beta.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator13.0.sdk/System/Library/Frameworks/CoreImage.framework/Headers/CIImage.h:10:
In file included from <module-includes>:1:
In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator13.0.sdk/System/Library/Frameworks/CoreVideo.framework/Headers/CoreVideo.h:20:
In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator13.0.sdk/System/Library/Frameworks/CoreVideo.framework/Headers/CVReturn.h:21:
/Applications/Xcode-beta.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator13.0.sdk/System/Library/Frameworks/CoreVideo.framework/Headers/CVBase.h:175:9:
    error: declaration of 'uint64_t' must be imported from module 'Darwin.C.stdint' before it is required
typedef uint64_t CVOptionFlags;

Sure enough, if I look in CoreVideo/CVBase.h, there's no direct import of <stdint.h> to be found; presumably it's coming in from a transitive inclusion. But what I can't figure out is, why wouldn't this error also occur when ClangImporter is building implicit modules for the module cache under a more traditional compilation that uses text module maps? What's different about the precompiled case?

I don't have a lot of expertise in the lower-level details of Clang modules, so if anyone has any insight, I'd love to hear it!

CVBase.h imports <CoreFoundation/CFBase.h>, which imports <stdint.h>, and CoreFoundation's module map does have export *. So I think you've found a modules bug on the Clang side, and I'd expect you to be able to reproduce it with just Clang. I don't know how heavily tested -fmodule-file= is.

By the way, Swift sticks extra information in its PCMs, so while Clang could in theory use the same PCMs as Swift, the reverse is not true.

With the following module map and header file:

module.modulemap:

module foo {
  header "foo.h"
}

foo.h:

#import <UIKit/UIKit.h>

The following command line successfully emits a .pcm file:

clang -cc1 -triple x86_64-apple-ios10.0-simulator -fblocks -x objective-c \
    -isysroot $(xcrun -sdk iphonesimulator -show-sdk-path) \
    -emit-module -fmodules -fmodule-name=foo -fmodules-embed-all-files \
    module.modulemap -o foo.pcm

Something else is odd here though. The emitted file is 16MB and (based on looking at the hex dump and the output from -module-file-info) appears to contain the ASTs for UIKit and everything that it imported (transitively), in addition to my own foo.h. Is this expected even though I'm not re-exporting anything myself? (Removing -fmodules-embed-all-files has no effect here; it's still listing all of the transitive framework headers as input files to the module.)
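One rough way to quantify the observation above is to count the header entries that -module-file-info reports. This is only a sketch: count_headers is a hypothetical helper, it matches any line mentioning ".h", and the exact layout of the -module-file-info output is not guaranteed to be stable across Clang versions.

```shell
#!/bin/sh
# Count lines on stdin that mention a ".h" file.
# Assumption: each input header appears on its own line in the
# -module-file-info output; this is a heuristic, not a parser.
count_headers() {
  grep -c '\.h'
}

# Intended usage (not executed here, needs a real .pcm):
#   clang -module-file-info foo.pcm | count_headers
```

Comparing the count with and without a given flag shows whether that flag changes the set of recorded inputs.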

clang -cc1 is like swift -frontend: it's an unstable interface that only expects to be called from the Clang driver. In this case there's a bunch of implicit args the driver would normally add, like -fimplicit-module-maps, that would change things a lot. I don't think any solution that relies on manually constructing a -cc1 invocation from scratch is going to be a lasting one.

:-/ I think you're going to want to get more Clang people to look at this, in particular Richard Smith (also at Google) or whoever he recommends. As far as I know we mostly haven't looked into this kind of use case for Xcode, so the usual Apple modules people aren't as likely to be helpful.

Thanks for the insight; I'll reach out to Richard Smith (he and I have chatted in the past).

My attempt with -cc1 was just to try to repro the issue with standalone Clang, since it doesn't look like there's a way to have a driver invocation that does -emit-module. You make a good point about possibly missing some args that the driver would pass, though. I wouldn't be building a long-term solution around this flag anyway: if I can get this to work (and that's a big "if"), I'd like to add a flag like -generate-pcm to the Swift driver and use that to have ClangImporter construct the CompilerInstance and do the work. Would that be something the Swift team would be willing to accept?


To give some context of the scale we're dealing with, thanks to Bazel's practice of having fine-grained libraries, one of the Swift modules I was benchmarking has a transitive closure with ~10,000 .h files (spanning a few hundred modules). This gives us two main bottlenecks:

  1. -driver-time-compilation shows that ≥ 90% of the time (45-60s) is spent in Name Binding, so I'm hoping to see a boost by using .pcms to avoid reparsing so many headers.
  2. Our build farm isn't optimized well for actions that have to copy a huge number of small inputs: we spend almost the same amount of time setting up the remote working area as we do compiling. So if I can ship just .pcms instead of .h files whenever possible, that time will definitely go down.

-generate-pcm sounds non-wonderful but practical. It's a little funny if Swift gets such a thing before Clang does!

Does this happen to be Objective-C++ code? I hit a very similar issue with clang-scan-deps constructing an explicit module build from an implicit build of llvm+clang. The issue is that both Darwin and libc++ have a stdint.h, and libc++'s forwards to Darwin's. There's a hack to make this work without causing a circular dependency, but it breaks in an explicit build for unknown reasons. It would be great if someone could track this down.

It's been a while since I gave this thread an update, since I've only recently been able to dive back into this problem.

No, but what you're describing sounds extremely similar to the problem that we discovered, after some colleagues and I sat down, added some logging, and stared at it for a while. In my case, there was a subtle difference in the order of include search paths in the compiler invocation I was constructing to emit the explicit PCM vs. the one Swift used when building the implicit PCMs, and that was causing a __has_include_next check in one of the standard headers to trigger differently. In one case it would pick up stdint.h from Clang, and in the other case it would pick it up from the platform SDK. This was enough to cause things to go south and give the error posted above.
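A discrepancy like this can be spotted by comparing the include search lists that Clang prints in verbose mode. The sketch below assumes the usual `clang -v` layout, where the list sits between "#include <...> search starts here:" and "End of search list."; search_paths is a hypothetical helper, and the flags in the usage comment are placeholders for whatever each real invocation passes.

```shell
#!/bin/sh
# Print the <...> include search list found in `clang -v` output on stdin,
# one directory per line, in search order.
search_paths() {
  awk '/^#include <...> search starts here:/ { on = 1; next }
       /^End of search list\./               { on = 0 }
       on { sub(/^ +/, ""); print }'
}

# Intended usage (not executed here; FLAGS_A and FLAGS_B are placeholders):
#   clang -x objective-c -E -v $FLAGS_A /dev/null 2>&1 | search_paths > a.txt
#   clang -x objective-c -E -v $FLAGS_B /dev/null 2>&1 | search_paths > b.txt
#   diff a.txt b.txt   # any difference in order can change #include_next
```

Because __has_include_next and #include_next resolve relative to a header's position in this list, two invocations with the same directories in a different order can still pick up different copies of stdint.h.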

Once I fixed that discrepancy, I was able to get a PCM to successfully be generated, and feeding that as an input into swiftc appears to have worked as well, at least for a couple simple test cases. I still need to throw some of our real code at it and see how things fare.

But with that main issue resolved, I'm going to get back to working on getting this in good enough shape to get a PR out for review in the near future.

PR for this feature is now up: https://github.com/apple/swift/pull/28107

For those not clicking "refresh" constantly in anticipation, @allevato's pull request has been merged. I'm hoping we can build out this functionality to handle the similar .swiftinterface -> .swiftmodule generation and also up-front discovery of module dependencies so the Swift driver can schedule the builds itself.

Doug