Improving Path Remappings for ExtraClangArgs/SwiftASTContext

DavidGoldman · July 11, 2019, 8:31pm

Background:

SwiftASTContext when parsing swiftmodules currently applies the working-directory flag if found in the ExtraClangOptions (see here).

This works well in most use cases, but for distributed builds (on remote machines), the paths may be invalid (as they're paths on the remote machine, not the local one).

Problem:

For Bazel, which supports distributed builds, we pass relative paths for all compilation inputs as well as a -fdebug-prefix-map=<execution root>=. to make all embedded paths relative. This works well, except for the following:

SwiftASTContext doesn't respect any previous -fdebug-prefix-map= in the extra args, only remappings specified from the dSYMs/target source map.
The SearchPaths embedded inside the swift modules are also absolute paths (therefore also remote paths); I'm not sure if these are even consumed (or remapped) by SwiftASTContext or at all.

I'm not sure about the second point, but due to the first point, distributed Swift builds are failing to load modules since the paths are absolute paths for the remote machine instead of the local one.

I think the best way to fix this would be to add -fdebug-prefix-map= support to SwiftASTContext's ClangArgs handling, but I'm not sure. If the second point is an issue as well, we'll also need to fix up those.
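As a rough illustration of the first step such support would need, here is a minimal Python sketch that pulls the prefix mappings out of an argument list. It is not LLDB code: the function name `extract_prefix_maps` is hypothetical, and splitting the value at the first `=` is an assumption about how the option is parsed.

```python
PREFIX_FLAG = "-fdebug-prefix-map="

def extract_prefix_maps(args):
    """Split args into (mappings, remaining_args).

    mappings is a list of (old_prefix, new_prefix) pairs taken from
    every -fdebug-prefix-map=OLD=NEW argument, in order of appearance.
    """
    mappings = []
    remaining = []
    for arg in args:
        if arg.startswith(PREFIX_FLAG):
            # Assumption: the value is split at the first '='.
            old, sep, new = arg[len(PREFIX_FLAG):].partition("=")
            if sep:
                mappings.append((old, new))
                continue
        # Everything else (including malformed entries) is passed through.
        remaining.append(arg)
    return mappings, remaining
```

The returned pairs could then be applied to the path-valued arguments that remain.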

Any ideas @Adrian_Prantl?

allevato · July 11, 2019, 9:26pm

This was an issue I called out in the description of my original PR to implement -debug-prefix-map. We (Bazel) currently pass -Xfrontend -serialize-debugging-options to swiftc, which is what causes the search paths and ClangImporter flags to be written into the .swiftmodule file. (Oddly, these "debug" options also affect search path behavior during compilation: SR-7845.)

Unfortunately, the discussion at the time led us to defer remapping those flags because we would have had to just run through the raw command line text and make sure to handle a number of combinations like -I foo vs. -Ifoo, -fmodule-map-file foo vs -fmodule-map-file=foo, and so on.
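To make the "number of combinations" concrete, here is a minimal Python sketch of that kind of raw command-line rewriting. It only covers the three flags named above, uses plain string-prefix replacement, and the names `remap_path` and `remap_args` are hypothetical; it is not the logic in the compiler or in lldb.

```python
SHORT_FLAGS = ("-I", "-F")            # accept "-Ifoo" and "-I foo"
LONG_FLAGS = ("-fmodule-map-file",)   # accept "-flag=foo" and "-flag foo"

def remap_path(path, mappings):
    """Replace the first matching prefix (plain string match) in path."""
    for old, new in mappings:
        if path.startswith(old):
            return new + path[len(old):]
    return path

def remap_args(args, mappings):
    """Return a copy of args with path-valued flags remapped."""
    args = list(args)
    out = []
    i = 0
    while i < len(args):
        arg = args[i]
        i += 1
        if arg in SHORT_FLAGS + LONG_FLAGS:
            # Separate form: the path is the next argument.
            out.append(arg)
            if i < len(args):
                out.append(remap_path(args[i], mappings))
                i += 1
            continue
        long_flag = next((f for f in LONG_FLAGS if arg.startswith(f + "=")), None)
        short_flag = next((f for f in SHORT_FLAGS if arg.startswith(f)), None)
        if long_flag:
            # "-fmodule-map-file=foo"
            value = arg[len(long_flag) + 1:]
            out.append(long_flag + "=" + remap_path(value, mappings))
        elif short_flag:
            # "-Ifoo"
            out.append(short_flag + remap_path(arg[len(short_flag):], mappings))
        else:
            # Not a known path flag: leave untouched (e.g. -DFOO=...).
            out.append(arg)
    return out
```

Even this small version needs a hand-maintained table of flags, which is the maintenance cost the discussion was worried about.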

Later on, however, essentially the same command line manipulating logic had to be added to lldb anyway: https://github.com/apple/swift-lldb/blob/stable/source/Symbol/SwiftASTContext.cpp#L1470-L1493

As I see it, there are a couple options that get us further along the right path:

Figure out how to get debugging working without passing -Xfrontend -serialize-debugging-options during compilation.
Add the same flag remapping logic from lldb to the compiler itself so that it applies to .swiftmodule files, not just DI.

But I'm not sure what unintended consequences might come out of doing one or both of those.

Adrian_Prantl · August 10, 2019, 12:40am

Oddly, these "debug" options also affect search path behavior during compilation.

Yes, that is intentional to enforce a "if you can build it, you can debug it" invariant.

I can see how this rule can make a distributed build system difficult, since you want to rewrite the serialized search paths into a normalized form that is then customized during debug time to point to the actual location on disk.

Without changing any code in the compiler, I think you could make this work today by

not passing -serialize-debugging-options to the Swift compiler
customizing LLDB's .lldbinit to use settings set -- target.swift-extra-clang-flags -I/my/local/includes

This is not awesome, since you can't neatly embed settings into a .dSYM (I think) like you can a remapping dictionary, but it should be a suitable workaround. We should also discuss what a better solution should look like.
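For a build system that already knows its include directories, the .lldbinit line could be generated rather than written by hand. A minimal Python sketch, assuming paths without whitespace (quoting rules are not handled here) and using a hypothetical helper name:

```python
def swift_extra_clang_flags_command(include_dirs):
    """Build the LLDB command that passes -I flags for the given directories."""
    for d in include_dirs:
        # Assumption: no quoting support, so reject paths containing whitespace.
        if any(ch.isspace() for ch in d):
            raise ValueError("path contains whitespace: %r" % d)
    flags = " ".join("-I" + d for d in include_dirs)
    return "settings set -- target.swift-extra-clang-flags " + flags

# Prints a line suitable for appending to a .lldbinit file.
print(swift_extra_clang_flags_command(["/my/local/includes"]))
```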

Keith · August 12, 2019, 6:22am

Are there other side effects on debugging if we disable this?

Adrian_Prantl · August 12, 2019, 4:07pm

CC @jrose. The primary side effect of not passing -serialize-debugging-options should be that you need to manually communicate all search paths and other Clang flags to LLDB. Similarly, because these flags are also used during compilation, when you import a Swift module that uses a Clang module needing those flags, you'll need to pass them to the compiler manually.

allevato · August 13, 2019, 12:53am

Thanks for the info, @Adrian_Prantl! I've been hoping to be able to stop passing -serialize-debugging-options for a while because I think/hope it's the last source of nondeterministic info (absolute paths) from our remote builds.

The Clang flags shouldn't be an issue for Bazel; we already propagate any required header search paths, module maps, and preprocessor defines up the build graph so they get explicitly passed to upstream compilation actions anyway. (In fact, we only pass -serialize-debugging-options for debug builds, not release builds, so if we were relying on those flags, our release builds simply wouldn't work.)

@DavidGoldman is better qualified to comment on what would be involved in getting the necessary flags and paths fed into LLDB.

DavidGoldman · August 13, 2019, 2:48pm

If we stop passing the flag, it will then be required for both remote builds and local builds, correct? In an ideal world we could do the following:

Remote builds (linking remotely): enable dSYMs and embed this information into the dSYM bundle so lldb knows how to load it
Local builds (linking locally): perhaps leave the flag enabled so the information is still embedded in the binary. Alternatively if we had a similar solution to dSYM plists here, we could do that as well. Does having an information only dSYM bundle seem reasonable (e.g. no debug symbols, just remapping and include paths)?
- The problem with .lldbinit is that there's no way to customize this inside of Xcode per-target or per-project (that I know of) and include paths could definitely change per target

Adrian_Prantl · August 13, 2019, 3:57pm

Alternatively if we had a similar solution to dSYM plists here, we could do that as well.

Why not also build .dSYM bundles locally and have a post-processing script adjust the plist accordingly?

Adrian_Prantl · August 13, 2019, 4:00pm

If the reason for not running dsymutil locally is latency of incremental builds, I think it would be a reasonable feature request for LLDB to support something that has the effect of a .dSYM plist with a non-dSYM build. There are many ways to implement this, a non-debug-info .dSYM like you proposed being one of them.

DavidGoldman · August 13, 2019, 6:17pm

As you mentioned below, our main concern is incremental build speed; otherwise we'd always be using dSYMs.

That seems reasonable. What sort of settings would we need to provide in the bundle/plist? Just the clang arguments?

DavidGoldman · August 13, 2019, 8:20pm

As another option, how reasonable would it be to provide a way to remap the arguments serialized by -serialize-debugging-options? We'd want to normalize them when they're embedded (e.g. by mapping $PWD=. for all paths) and then tell lldb where to find them (remap ./=BUILDROOT).

ob1 · August 13, 2019, 11:24pm

The problem with .lldbinit is that there's no way to customize this inside of Xcode per-target or per-project (that I know of) and include paths could definitely change per target

@DavidGoldman in Xcode 11 you can customize it per-scheme.

defaults write com.apple.dtXcode IDEDebuggerFeatureSetting 12

brings up a dialog in Xcode where you can enter custom lldb commands.

Keith · August 15, 2019, 4:52am

I think you mean dt.Xcode

Keith · August 15, 2019, 5:23am

It seems like this feature causes Xcode to crash 100% of the time for me (FB7032504), but I'm super excited about it!

jrose · August 15, 2019, 5:13pm

-serialize-debugging-options is a hack that @Adrian_Prantl and I have discussed in the past and haven't come up with a better answer for.

The problem:

Running expressions in Swift uses full AST information
Loading Swift ASTs requires having all their dependencies
Where do those dependencies come from?

-serialize-debugging-options says "here, let me splat in all the search paths you were using". That's vaguely sensible for local builds, somewhat questionable for remote builds…but having path remapping would work. (You'd have to deal with the compiler trying to use those search paths too, because we don't want something that can be compiled but not debugged, but it's doable.)

The trouble is that -serialize-debugging-options isn't just search paths that Swift understands; for (arguably questionable) reasons, it also includes the Clang configuration options passed down by Xcode or other build systems, like -Xcc -DVERY_IMPORTANT_CLANG_MACRO=2. And for (furtherly questionable) reasons some of those options are also search paths. I'm very hesitant to try to detect which options represent search paths, although we've already done a little of that as a hack to detect settings that don't make sense for clients. (Are you as unhappy with this as I am yet?)

@Adrian_Prantl and I, along with other Apple LLDB folks, have discussed what to do about this, and there's been a few options:

Teach Swift about more kinds of search path, so that Xcode and other build systems don't have to use -Xcc and pass them opaquely.
Have Swift do some kind of search path mapping on -Xcc options, as mentioned.
Stop relying on search paths altogether; if LLDB can reconstruct information from DWARF, it doesn't need to find the original headers. @Adrian_Prantl has been working on this again recently but it's a large effort with a fair number of unknowns—the Swift deserialization logic still doesn't always know how to recover when a type can't be loaded or is missing something it had before (like a protocol conformance).

I would love to ditch -serialize-debugging-options entirely. The fact that the compiler respects it too for search paths causes all sorts of issues because Clang isn't designed to have search paths added on the fly. It's also weird that all search paths are taken into account but not all Clang options (because search paths can be appended but Clang options need to be set up once). We just haven't come up with something to use instead. Maybe it'll be the DWARF thing, though.

jrose · August 15, 2019, 5:19pm

To be clear, I'm not against path remapping for serialized search paths if that's the way to go. I'll let you all decide that. But I wanted to provide some of the context for this option and why I dislike depending on it.

Adrian_Prantl · August 16, 2019, 3:58pm

if LLDB can reconstruct information from DWARF, it doesn't need to find the original headers.

To manage the expectations a bit: The mechanism to import Clang modules into Swift from DWARF is not meant to be a replacement for importing Clang modules from source, but as a secondary fallback that is more reliable, but less feature-rich.

This is analogous to debugging C/C++/Objective-C programs with LLDB, where DWARF debug info alone gives you enough information to inspect the state of the program, but if Clang modules can be loaded from source, you additionally get access to macros, templates and types that weren't used by the program in the expression evaluator.

DavidGoldman · August 21, 2019, 11:17pm

With this in mind I believe we'll need a solution for importing Clang modules from source in order to have a full debugging experience.

Quick recap from above, there are the following options:

No longer pass -serialize-debugging-options. This means that we'll need to inform lldb of the proper include flags via settings set -- target.swift-extra-clang-flags -I/my/local/includes. This will only work cleanly if we are able to set per-scheme lldb flags inside of Xcode, which appears to be functionality that may be added in Xcode 11. Any idea if this is prioritized? While this has the side-effect of a limited debugging experience when building locally without any lldb settings, it might be feasible in the near future if Xcode supports the per-scheme settings.
Keep passing -serialize-debugging-options, but modify the existing lldb and compiler to respect -fdebug-prefix-map for all options serialized and loaded. For us this would involve making the paths relative to our build directory and then remapping them to the proper build path. This is a bit better than the above solution since we'd have the same build path for all targets in a project.
Stop passing -serialize-debugging-options and allow dSYMs to embed this information (compiler flags). For sake of incremental build speed, add the ability to have some sort of dSYMs-without-symbols, which only contain plists for remappings/compiler flags.

Which approach do you think seems the most reasonable? I'm leaning towards the last option as a long-term fix but the first option could work in the short-term.

Adrian_Prantl · August 24, 2019, 12:22am

I see Option 1 as a reasonable short-term solution to unblock you, but we should strive for something better.

Would Option 2 even work for distributed builds?
When two users compile the same file, but in different local paths, we still want the distributed build system to produce the exact same binary, while each user keeps their customized remapping dictionary locally.

User A:
swiftc -c /Users/a/proj/mod.swift -debug-prefix-map=/Users/a/proj=$SRC_ROOT -debug-prefix-map=/Users/a/build=$BUILD_ROOT -debug-prefix-map=/Xcode/.../=$SDK_ROOT

User B:
swiftc -c /Users/b/proj/mod.swift -debug-prefix-map=/Users/b/proj=$SRC_ROOT -debug-prefix-map=/tmp=$BUILD_ROOT

We can't serialize -debug-prefix-map in the module, as it is user-specific. So the serialized paths would contain references to $SRC_ROOT and $BUILD_ROOT. That means that when a user downloads the remotely compiled module and wants to import it, they need to now let the Swift compiler know about the inverse debug prefix map, hoping that the mapping is bijective. Otherwise the Swift compiler won't be able to find the module's dependencies on the user's machine. If we require the debug prefix map to have absolute paths on the LHS and unique non-filesystem markers on the RHS, I suppose we could use the same map for this.
This approach would clearly benefit from some build system support.
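The inverse-map idea can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not compiler or LLDB behavior: the helper names are hypothetical, the markers are treated as literal strings, and matching is plain longest-prefix string replacement.

```python
def apply_prefix_map(path, prefix_map):
    """Rewrite path using the longest matching prefix in prefix_map."""
    for old in sorted(prefix_map, key=len, reverse=True):
        if path.startswith(old):
            return prefix_map[old] + path[len(old):]
    return path

def invert_prefix_map(prefix_map):
    """Swap keys and values, refusing maps that are not bijective."""
    inverse = {}
    for path, marker in prefix_map.items():
        if marker in inverse:
            raise ValueError("marker %r is used for more than one path" % marker)
        inverse[marker] = path
    return inverse

# User A compiles; the serialized path carries only the marker.
user_a = {"/Users/a/proj": "$SRC_ROOT", "/Users/a/build": "$BUILD_ROOT"}
serialized = apply_prefix_map("/Users/a/proj/include", user_a)

# User B imports the same module and resolves the marker locally.
user_b = {"/Users/b/proj": "$SRC_ROOT", "/tmp": "$BUILD_ROOT"}
local = apply_prefix_map(serialized, invert_prefix_map(user_b))
```

Requiring unique markers on the right-hand side is what makes the inversion well defined; a map that sends two directories to the same marker is rejected.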

Option 3 has a very straightforward mental model, but either also needs build system support, or is less friendly to the user, since the user now needs to pass the correct include paths for the module and all of its dependencies. It wouldn't be a usability regression from Clang though, where it is also expected to pass include paths for a header and all of the header's dependencies to the compiler.
Having a light-weight dSYM could be useful for various applications, but this needs to be designed properly. I'll think some more about it.

Dave_Lee · August 27, 2019, 3:48am

I just found out that this is a Swift 5.1-only setting.

I am at the point where I think I have things working with Swift modules, using a combination of -no-serialize-debugging-options and target.swift-framework-search-paths. But now I get an error that lldb can't load one of our third-party static .frameworks. It has a modulemap, and I have tried to use target.clang-module-search-paths, but that hasn't worked for me.

Is there a setting I can use to control how lldb finds module maps like this (for swift)?