LLDB is slow to resolve local vars

Xcode 11.6 and 12.0 beta 5 still have the slow LLDB problem for us.

I took another sample while Xcode was spinning: lldb_20200824.txt (Yandex Disk)

I can see that each clang::FileManager::getStatValue calls clang::FileSystemStatCache::get, which calls llvm::vfs::RedirectingFileSystem::status, which recurses through RedirectingFileSystem(RedirectingFileSystem(RedirectingFileSystem(...))) and so on, eventually reaching RealFileSystem. This can be seen at line 86 of the sample above.

Could it be that stat calls are being slowed down by this nesting of RedirectingFileSystems?
Why would it create such a deep RedirectingFileSystem hierarchy?
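To put a number on the nesting, one can grep the frame names out of the saved sample file; a minimal self-contained sketch (using a canned three-line excerpt in place of the real sample):

```shell
# Count how often RedirectingFileSystem::status appears in a sample.
# The excerpt below is fabricated so the sketch runs stand-alone;
# point the grep at the real sample file instead.
printf '%s\n' \
  'llvm::vfs::RedirectingFileSystem::status' \
  'llvm::vfs::RedirectingFileSystem::status' \
  'llvm::vfs::RealFileSystem::status' > /tmp/sample_excerpt.txt
grep -c 'RedirectingFileSystem::status' /tmp/sample_excerpt.txt   # → 2
```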

It appears that HEADER_SEARCH_PATHS affects LLDB performance. CocoaPods puts every path it knows about in there, and that list gets quite big. I removed almost everything (159 paths -> 2 paths).

Results are amazing:

  • cold start (after killing rpc server): minutes -> 20 seconds.
  • hot start: 34 seconds -> 12 seconds.
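For anyone wanting to audit their own project, here's a small sketch for counting the entries. The parsing function is the testable part; the xcodebuild invocation in the comment is an assumption about your setup (target name is a placeholder):

```shell
# Count the entries in a HEADER_SEARCH_PATHS build setting, as printed
# by `xcodebuild -showBuildSettings` (one space-separated line).
count_search_paths() {
  grep 'HEADER_SEARCH_PATHS' | sed 's/^[^=]*= *//' | tr ' ' '\n' | grep -c .
}
# Real usage (hypothetical target name):
#   xcodebuild -showBuildSettings -target MyApp | count_search_paths
echo '    HEADER_SEARCH_PATHS = /pods/A /pods/B /pods/C' | count_search_paths   # → 3
```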

This is just speculation, but if you see stat() showing up in the sample and reducing the number of search paths improves performance, then I would guess it really is the file system that is slow. The fewer search paths we have, the fewer file system lookups we need when searching for modules. Might you be able to create a synthetic benchmark with a lot of include paths that reproduces the performance problem?
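Something along these lines might do as a starting point (all paths and the 159-directory count are made up to mirror the numbers above; the compile command is only echoed, not run):

```shell
# Sketch of a synthetic benchmark: create many header search paths,
# most of them empty, so every missed header lookup has to stat each
# directory in turn. All paths here are hypothetical.
mkdir -p /tmp/hsp-bench && cd /tmp/hsp-bench
for i in $(seq 1 159); do mkdir -p "include/pod$i"; done
# Put the one real header in the last directory only.
echo '#define ANSWER 42' > include/pod159/Answer.h

# Build the -I flag list a project generator might emit.
FLAGS=$(for i in $(seq 1 159); do printf ' -Iinclude/pod%d' "$i"; done)
printf '#include "Answer.h"\nint x = ANSWER;\n' > main.c

# Timing this compile (or an lldb expression in a matching project)
# should scale with the number of search paths.
echo "clang -fsyntax-only $FLAGS main.c"
```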

Also CC'ing @JDevlieghere who knows Clang's file system class better than I do.

Does CocoaPods support generating a header map? From a quick search it doesn't look like it. Header maps are an optimization over header search paths. Long search path lists are a build system "smell" and should be avoided for compile performance, and it seems LLDB performance too. If header maps can't easily be done with CocoaPods, VFS overlays are another option.
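For readers who haven't used one: a VFS overlay is a YAML/JSON file that remaps virtual paths onto real files, so clang can resolve a header with one lookup instead of walking every search path. A hypothetical single-file overlay (all paths are placeholders):

```shell
# Write a minimal clang VFS overlay mapping a virtual include dir onto
# a real header location. Paths are invented for illustration.
cat > /tmp/overlay.yaml <<'EOF'
{
  "version": 0,
  "roots": [
    {
      "name": "/virtual/include",
      "type": "directory",
      "contents": [
        { "name": "Header.h",
          "type": "file",
          "external-contents": "/real/pods/SomePod/Header.h" }
      ]
    }
  ]
}
EOF
# Passed to clang (and, via extra clang flags, to lldb) as:
echo 'clang -ivfsoverlay /tmp/overlay.yaml -I/virtual/include ...'
```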

Header maps wouldn't necessarily reduce the number of stats. Looking through a couple of Xcode projects I have lying around, many contained only relative substitutions of the form Header.h -> Dir/Header.h, which wouldn't help reduce the number of places to check for a header. I suppose an absolute substitution might work.

That said, I would really like to get my hands on a reproducer for this, because the cause could be anything from a performance/caching issue in Clang, to a performance issue in the kernel or file system itself, to a bug in LLDB's usage of Clang APIs.

It's possible that this is related to Lldb performance with clang modules and Swift, which was recently fixed. Perhaps the duplicated arguments were causing extra stat calls (I'm not sure whether Clang deduplicates the flags). It's likely worth testing with a Swift development snapshot from here.

I made two attempts to create a project that reproduces this perf issue. First with Swift PM, where I added header search paths via flags in Package.swift. I took all the header search paths from the main app and put them into the SPM package, but lldb works fine.

Then I tried the same without SPM, manually adding this gigantic list to some project; lldb was still fast.

So I guess this is just our story. We use public CocoaPods libraries and quite a few local development pods. Perhaps the way CocoaPods configures the project file somehow leads to slowness in lldb.

No artificial repro for you. :frowning:

Hitting the exact same issue here as well (I have to wait ~15 minutes for lldb to resolve variables in Swift code initially; lldb is fast after that initial wait).

A little bit about the project I am attaching LLDB to:

  1. A mega app with interdependent Objective-C and Swift modules
  2. lldb on Swift code is initially very slow because it spends time populating the Clang module cache dir (2 GB in size)
  3. We pass a lot of .modulemap and .hmap files as extra Clang arguments to lldb
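For context, a hypothetical sketch of how such flags get wired in via ~/.lldbinit (the paths are placeholders, and the setting name assumes the Swift fork of lldb):

```
# ~/.lldbinit — illustrative only; substitute your real .hmap/.modulemap paths
settings set -- target.swift-extra-clang-flags "-I/path/to/foo.hmap -fmodule-map-file=/path/to/module.modulemap"
```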

I am also attaching the time profiler trace: trace file

We also have CrowdStrike Falcon corporate security tooling; I added the local Clang module cache paths to its exclusion patterns, but without any significant improvement.

Have you sent feedback? I'd be happy to help investigate here, or via feedback.

Hi @Dave_Lee I would prefer continuing the efforts here for more exposure to the other devs experiencing the same issue :)

Let me know what I could share for you to diagnose the issue.

Also created Feedback FB13568762

If you are experiencing a slow first expression / p / po in LLDB that is caused by implicit Clang module imports (basically the symptoms discussed in this thread) I would strongly encourage you to experiment with converting the project to explicitly built modules.

LLDB needs to compile implicit Clang modules from source before it caches them in its Clang module cache. This can be quite slow.

In contrast, LLDB can import Clang modules explicitly built by the build system directly, without needing to recompile them from source code. By skipping the module compilation in the debugger, this can be much faster.
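A hand-rolled illustration of the two-step flow (file and module names are made up; the compile commands are only echoed, since this is a sketch of the shape of an explicit build, not a definitive recipe):

```shell
# Explicit modules in miniature: the build system precompiles a module
# once; consumers, including the debugger, load the resulting .pcm
# instead of rebuilding it from source. Names are hypothetical.
mkdir -p /tmp/ebm/MyLib && cd /tmp/ebm
printf 'int answer(void);\n' > MyLib/MyLib.h
printf 'module MyLib { header "MyLib.h" export * }\n' > MyLib/module.modulemap
# Step 1 (build system): emit the module artifact.
echo 'clang -cc1 -fmodules -emit-module -fmodule-name=MyLib MyLib/module.modulemap -o MyLib.pcm'
# Step 2 (consumer/debugger): import the prebuilt artifact directly.
echo 'clang -fmodules -fmodule-file=MyLib.pcm -fsyntax-only client.c'
```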

But that's not all: People with complex project setups might also have run into Clang module compilation errors in LLDB that were caused by conflicting search paths. Because module compilation is skipped, enabling explicit modules also bypasses this entire class of problems.

For more details about explicit modules see Demystify explicitly built modules - WWDC24 - Videos - Apple Developer

You'll need a recent nightly 6.0 toolchain from swift.org (or Xcode 16 beta 1) to try this out.

In contrast, LLDB can import Clang modules explicitly built by the build system directly, without needing to recompile them from source code. By skipping the module compilation in the debugger, this can be much faster.

Migrating to an explicitly built module system is part of our plan, and we are very much looking forward to it, but it seems challenging given the build time concerns (Build times regression with Explicitly Built Modules) and the current WIP state of its support in the Bazel ecosystem. With these challenges in mind, I'm curious whether some of the stages or optimizations performed in the explicit module system, particularly those that enhance debugger performance, could be adapted or leveraged within the implicit module build system. That would address developers' immediate concerns and give us some buffer to migrate to explicitly built modules for the other benefits.

For instance, since .o files in the implicit system already reference .pcm files,

  • is there a way for LLDB to utilize these .pcm files directly instead of recompiling modules from source?
  • Alternatively, could the build system emit .pcm files into a shared cache that LLDB can access? (Assuming that isn't happening today, or the cache is being invalidated for some reason; if so, any tooling that could help identify why the cache is invalidated would be valuable.)
  • Could the work that LLDB performs to build the expression context be cached across debug sessions to improve performance? (At least the relevant components, since some would be invalidated as developers iterate.)

The short answer is no.

The only way to import implicitly-built pcm files is by asking Clang to import a module from source ...

..., but Clang caches the module compilation artifacts in the Clang module cache, and you can point LLDB at the module cache used by your build. However, the chances that you will end up with the exact same module hash (a hash of the Clang compiler flags) in LLDB and in your build are very slim.

That's already happening. LLDB uses a persistent Clang module cache (settings show symbols.clang-modules-cache-path) and will reuse what was built in a previous debug session. But again, it's quite easy to end up with slightly different compiler flags, which affect the module hash.
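For reference, inspecting or overriding that cache in an interactive session looks like this (the path below is just an example):

```
(lldb) settings show symbols.clang-modules-cache-path
(lldb) settings set symbols.clang-modules-cache-path /path/to/shared/module-cache
```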

Thank you for your responses, @Adrian_Prantl !

I also would like to get your thoughts on a scenario I’ve observed:

I’ve noticed that when building an app target followed by a test target, they generate different hashes(directory within the clang's module cache), though the majority of the .pcm files within the directories are identical. For instance, both targets include the same variant of SwiftUI-<Hash>.pcm along with many other system modules.

I wanted to ask: in cases where project generator boundaries are rigid, is there any merit to exploring -fdisable-module-hash to reuse existing .pcm files? Many details like compiler arguments and target architecture (and the other options that contribute to the hash) are already known during the pre-build phases, which would let us decide whether to keep using the existing cache or wipe it for the target build. If we can align this approach with Clang's expectations, it might work effectively. However, I'm unsure whether clang/lldb performs additional validation that could interfere with this method.

I’m just trying to understand if this has been attempted successfully and whether it’s worth considering.

I would say the short answer is that the efforts to make the different module configurations that have to be built in a project more transparent, and to give users control over them, have culminated in the explicitly built modules design we have today.
