Lldb is slow to resolve local vars

We have a large Swift app with 150+ modules. Whenever I set breakpoint in Swift code in Xcode, LLDB takes at least 60 seconds to resolve types of local variables on hot pause and 5+ minutes if I kill lldb-rpc-server.

I've tried to investigate but can't find anything. What have I done:

  • used dwarf/dsym DEBUG_INFORMATION_FORMAT - this does not affect the performance
  • replaced all #import <framework> with @import framework; because I saw that lldb-rpc-server opens a bunch of header files in Activity Monitor - this does not affect either
  • set STRIP_SWIFT_SYMBOLS to YES and NO – not sure what this setting does and why is it NO by default even for debug builds
  • switched between wmo/incremental compilation mode for all targets - no effect either
  • disabled all LLDB extensions in ~/.lldb_init - it is empty now

I've instrumented lldb-rpc-server when Xcode has been spinning attempting to resolve the breakpoint, and apparently there is Swift module loading routines involved, and they take all the time:

Is there something I can do to speed up LLDB? The only apparent way I think of is to reduce module count but that sounds quite unacceptable.

7 Likes

Interestingly, LLDB performs much better in Obj-C code: for example breakpoints work nicely in main.m.

We have the same issue with the app at my workplace. We don't have nearly as many modules, probably only a dozen or so. However, it does seem to take an awfully long time to load the locals when hitting a breakpoint. In our case it seems to be 15-20 seconds.

  • used dwarf / dsym DEBUG_INFORMATION_FORMAT - this does not affect the performance

This will only affect the performance of name lookups in DWARF; so this isn't expected to make a big difference here.

  • replaced all #import <framework> with @import framework; because I saw that lldb-rpc-server opens a bunch of header files in Activity Monitor - this does not affect either

These are equivalent (iff module maps are defined).

  • set STRIP_SWIFT_SYMBOLS to YES and NO – not sure what this setting does and why is it NO by default even for debug builds

This would not make any difference speed-wise.

  • switched between wmo / incremental compilation mode for all targets - no effect either

It shouldn't make a difference here.

  • disabled all LLDB extensions in ~/.lldb_init - it is empty now

Can you attach a sample of lldb-rpc-server instead of the screenshot?

Do you have a bridging header? In contrast to Clang modules the compilation result of a bridging header cannot be cached. This could explain why restarting lldb-rpc-server incurs a large delay.

Here it is:

(I've assumed you meant Activity Monitor sample, did you not?)

We do have bridging header.

1 Like

I was expecting most of the time being spent compiling the bridging header, but the Clang compiler doesn't show up in the heaviest trace at all. Instead we are dealing with file system contention while loading Clang modules from the Clang module cache:

Total number in stack (recursive counted multiple, when >=5):
        383       stat$INODE64  (in libsystem_kernel.dylib) + 0  [0x7fff6852078c]
        349       (anonymous namespace)::RealFileSystem::status(llvm::Twine const&)  (in LLDB) + 265  [0x101429ea9]
        336       llvm::sys::fs::status(llvm::Twine const&, llvm::sys::fs::file_status&, bool)  (in LLDB) + 99  [0x101451d73]
        203       llvm::SimpleBitstreamCursor::Read(unsigned int)  (in LLDB) + 0  [0x101daeed0]
        199       clang::ASTReader::ReadASTCore(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, clang::serialization::ModuleFile*, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, long long, long, clang::ASTFileSignature, unsigned int)  (in LLDB) + 2128  [0x103968cd0]
        185       llvm::vfs::FileSystem::exists(llvm::Twine const&)  (in LLDB) + 25  [0x101424e89]
        179       (anonymous namespace)::RealFileSystem::openFileForRead(llvm::Twine const&)  (in LLDB) + 194  [0x10142a062]
        179       llvm::sys::fs::openFile(llvm::Twine const&, int&, llvm::sys::fs::CreationDisposition, llvm::sys::fs::FileAccess, llvm::sys::fs::OpenFlags, unsigned int)  (in LLDB) + 219  [0x101454c4b]
        177       __open  (in libsystem_kernel.dylib) + 0  [0x7fff6851faf8]
        177       llvm::sys::fs::openFileForRead(llvm::Twine const&, int&, llvm::sys::fs::OpenFlags, llvm::SmallVectorImpl<char>*)  (in LLDB) + 48  [0x101452a40]
        173       malloc  (in libsystem_malloc.dylib) + 21  [0x7fff68593fe5]
        171       clang::Preprocessor::Lex(clang::Token&)  (in LLDB) + 68  [0x104345db4]
        166       llvm::vfs::FileSystem::getBufferForFile(llvm::Twine const&, long long, bool, bool)  (in LLDB) + 42  [0x101424d4a]
        164       malloc_zone_malloc  (in libsystem_malloc.dylib) + 104  [0x7fff6859406a]
        162       clang::ASTReader::ReadDeclRecord(unsigned int)  (in LLDB) + 1875  [0x1039d1623]
        156       clang::ASTDeclReader::Visit(clang::Decl*)  (in LLDB) + 21  [0x1039b2b95]
        134       clang::ASTReader::ReadControlBlock(clang::serialization::ModuleFile&, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, clang::serialization::ModuleFile const*, unsigned int)  (in LLDB) + 5849  [0x103967839]
        134       clang::ASTReader::readUnhashedControlBlock(clang::serialization::ModuleFile&, bool, unsigned int)  (in LLDB) + 106  [0x1039757ba]
        128       clang::ASTReader::ReadControlBlock(clang::serialization::ModuleFile&, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, clang::serialization::ModuleFile const*, unsigned int)  (in LLDB) + 3240  [0x103966e08]
        125       clang::serialization::ModuleManager::visit(llvm::function_ref<bool (clang::serialization::ModuleFile&)>, llvm::SmallPtrSetImpl<clang::serialization::ModuleFile*>*)  (in LLDB) + 1215  [0x103a6107f]
        122       szone_malloc_should_clear  (in libsystem_malloc.dylib) + 66  [0x7fff685952ae]
        119       clang::ASTReader::readUnhashedControlBlockImpl(clang::serialization::ModuleFile*, llvm::StringRef, unsigned int, bool, clang::ASTReaderListener*, bool)  (in LLDB) + 610  [0x103975bf2]
        118       _platform_memmove$VARIANT$Haswell  (in libsystem_platform.dylib) + 0  [0x7fff685d4740]
        118       operator new(unsigned long)  (in libc++abi.dylib) + 40  [0x7fff65548718]
        117       clang::ASTReader::GetDecl(unsigned int)  (in LLDB) + 87  [0x103973037]
        116       clang::FileManager::getStatValue(llvm::StringRef, llvm::vfs::Status&, bool, std::__1::unique_ptr<llvm::vfs::File, std::__1::default_delete<llvm::vfs::File> >*)  (in LLDB) + 182  [0x104393576]
        114       llvm::BitstreamCursor::readRecord(unsigned int, llvm::SmallVectorImpl<unsigned long long>&, llvm::StringRef*)  (in LLDB) + 0  [0x104488910]
        109       free_tiny  (in libsystem_malloc.dylib) + 468  [0x7fff68599167]
        102       llvm::StringMapImpl::LookupBucketFor(llvm::StringRef)  (in LLDB) + 0  [0x101413940]
        102       llvm::vfs::RedirectingFileSystem::openFileForRead(llvm::Twine const&)  (in LLDB) + 133  [0x101427df5]
        101       clang::FileManager::getBufferForFile(clang::FileEntry const*, bool, bool)  (in LLDB) + 330  [0x10439450a]
        100       clang::HeaderSearch::loadModuleMapFile(clang::DirectoryEntry const*, bool, bool)  (in LLDB) + 234  [0x1042ded1a]
1 Like

Roughly, how many dylibs that contain Swift code does your application have?

Debug build contains contains 177 frameworks in /Frameworks folder. This includes 20 libswift dylibs. Most of them are Swift frameworks. We have very minor number of pure ObjC frameworks, almost zero.

I've tested a release build, where we merge some frameworks to reduce dylib count. It has 66 dylibs in total (20 of them are libswifts), and breakpoints work much faster: 11 seconds instead of ~60.

Thanks you for providing the data. I'm afraid that I don't have an immediate tip for how to improve the performance, however, there are some changes that I am working on at the moment (TypeRef-based LLDB compiler types) that should address this contention by reducing the amount of clang compiler instances LLDB is creating from one per dylib to one per process.

Thanks!
When you are saying file system contention, do you mean disk I/O?

According to the sample you posted, most of the time is spent in the stat() system call, which I am guessing is spending most of its time waiting on some lock in the kernel.

most of the time is spent in the stat()

@beefon do you have any "Endpoint Security" installed, such as CarbonBlack or CrowdStrike? Or any other kernel extensions? As I understand, endpoint security usually hook into file system calls, such as stat. Endpoint security deployments are known to cause build performance issues (because of their overhead on all file IO), and issues with the simulator (which we hit a nasty bug related to).

To check, you can run:

kextstat -l | grep -v com.apple

@Dave_Lee, I only have one from Intel Power Gadget:

163 0 0xffffff7f846f3000 0x4000 0x4000 com.intel.driver.EnergyDriver (3.6.1) CF3549FA-B79E-34E2-BF25-6C5FEF5EF262 <8 6 5 3>

@Adrian_Prantl this is what I saw and that's why I decided to give dSYM a shot: in this case lldb does not stat all .o files at least. Apparently it still stats some other stuff.

You might want to disable that and see if it makes any difference.

Thanks @Dave_Lee , but it does not make any difference if I disable Intel kext.

@Adrian_Prantl , is there a task/issue on Swift tracker which we can follow? We can perhaps test if your changes make any significant improvement in developer workflow when you close the task and when new Xcode/LLDB comes out.

I don't have a JIRA ticket that tracks this work, but you can follow my pull requests for swift-lldb that refactor SwiftASTContext.

1 Like

Are there any updates on performance improvements?

I have recently been seeing similar issues, except significantly worse wait times for LLDB. Using recent snapshots of the Swift 5.2 toolchain and a larger app (a few hundred modules), LLDB takes 30+ minutes to resolve local variables when it hits the first breakpoint in Swift. The wait time used to be < 10 minutes, usually closer to 5 minutes. I'm wondering if there have been any performance regressions in this code path recently.

I captured a sample from LLDB while resolving local variables:

My efforts are part of a large-scale restructuring of LLDB ([WIP/RFC] Remove swift::ASTContext from lldb::SwiftASTContext by adrian-prantl · Pull Request #711 · apple/llvm-project · GitHub) and won't land in the short term.

If the main problem for you is APFS lock contention, you can try settings set target.swift-create-module-contexts-in-parallel false and see if that improves the situation in the mean time.

I'm running into the same thing, except in my case lldb eventually exhausts the heap and crashes. I'm running lldb-1100.0.30.11 / Apple Swift version 5.1.3 (swiftlang-1100.0.282.1 clang-1100.0.33.15) from Xcode 11.3, but this reproduced with earlier versions as well.

Here's a stack trace from the lldb crash when I run bt after catching an exception in a Swift app.

This is on a fairly large hybrid Swift/Obj-C app. Any attempt to backtrace or print locals takes 30-600 seconds and then crashes.

I also took a sample of lldb when this happened. It seems pretty similar to @beefon's.

I tried settings set target.swift-create-module-contexts-in-parallel false but lldb didn't like that:

(lldb) settings set target.swift-create-module-contexts-in-parallel false
error: invalid value path 'target.swift-create-module-contexts-in-parallel'