Trying to make sense of -stats-output-dir timings

In my efforts to analyze our build performance, I've been using the very helpful JSON files that -stats-output-dir spits out. But I'm struggling to reconcile some of the numbers with my understanding of the compiler (which could be wrong) and with what I'm seeing in the logs of the network file system daemon that serves our source files.

This is a WMO build, so I have stats from the driver invocation and the single frontend invocation. To be extra-safe, I also set the SWIFTC_MAXIMUM_DETERMINISM environment variable to disable threading. (I also ran sudo purge to blow away macOS's in-memory file cache, because otherwise some of the file system calls never reach FUSE and don't show up in the log. I'm not sure that's relevant beyond making the build take longer.)
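
For reference, this is roughly how I'm invoking the build (the module name, paths, and sources below are placeholders, not our real project):

    $ export SWIFTC_MAXIMUM_DETERMINISM=1   # force single-threaded, deterministic behavior
    $ sudo purge                            # drop the in-memory file cache so reads hit FUSE
    $ swiftc -whole-module-optimization \
        -stats-output-dir /tmp/build-stats \
        -module-name MyModule \
        Sources/*.swift                     # placeholder module name and sources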

According to the stats,

  • the driver started at Unix time 1562946109.388913 (8:41:49 AM) and wall clock time was 266.666 sec.
  • the frontend started at Unix time 1562946200.311746 (8:43:20 AM) and wall clock time was 175.606 sec.

So it looks like the driver is doing about 91 seconds of work (1562946200.311746 − 1562946109.388913 ≈ 90.92 sec) before it launches the frontend. I looked at my fs log to see what files were being read during that window, and it's mostly module maps and header files. This module doesn't have a bridging header, so I think that rules out PCH generation.

I wondered whether the driver might be warming the module cache before invoking the frontend, but I didn't see any code that would do that, and the timestamps of every file in the module cache fall after that 91-second window (i.e., during the frontend's execution).
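
Here's roughly how I checked the cache timestamps (the cache path is a placeholder; yours may live under the system temporary directory or wherever -module-cache-path points):

    # Print each cached module's modification time (BSD stat on macOS).
    # Every .pcm here postdates the frontend's start time.
    $ find /path/to/ModuleCache -name '*.pcm' -exec stat -f '%Sm %N' {} \;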

What is the driver doing with these headers that takes this long?

If you're sure the stats are coming from the driver, it's probably trying to decide whether it needs to rebuild everything (because an external dependency changed) or just the files that changed. This could probably be drastically improved, but it mostly hasn't gotten attention because, well, compiling over a network filesystem isn't the common Xcode use case.

I'm just speculating that that's the problem based on what's here, though. Taking a profile of the driver process would be a lot more accurate.
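
For example, something like this while the driver is running (assuming the driver process is named swiftc; adjust to taste):

    # Sample the driver's call stacks for 30 seconds and write a report.
    $ sample swiftc 30 -file /tmp/swiftc-driver-sample.txt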

cc @harlanhaskins for additional thoughts

Unfortunately I've had mixed success trying to reproduce this with Instruments.

First, I think there's a bug (?) in the File Activity instrument: it starts counting time the moment I click the Record button, even if it still needs to pop up a dialog asking for my password, and the time I spend entering it shows up as a gap before the first mark in that graph. So I can't get the CPU profile and the file activity graphs to line up correctly :grimacing:

It looks like the driver isn't the issue, though; I see the same odd behavior in the frontend. I ran the driver with -### to get a frontend command I could run directly. If I restart our FUSE process, do sudo purge, and then run that frontend command wrapped in /usr/bin/time, I see wildly different wall times. In one case, /usr/bin/time reported 141.94 sec while the frontend's JSON file reported 53.87 sec, an 88-second gap.
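
Roughly what I did (the real frontend command is much longer; everything here is a placeholder):

    # Print the jobs the driver would run, without running them.
    $ swiftc -### -whole-module-optimization -stats-output-dir /tmp/build-stats Sources/*.swift
    # Copy the printed `swift -frontend ...` line, then time it against a cold cache:
    $ sudo purge
    $ /usr/bin/time /path/to/swift -frontend -c -whole-module-optimization ...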

Looking at the implementation of swift::performFrontend, the UnifiedStatsReporter gets created fairly late in that function. With a purged fs cache and a freshly restarted network fs daemon, could the missing 88 seconds be spent in some part of that function before stats collection begins? None of the code before that point looked like it would be particularly long-running, though.
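
One way I can think of to catch filesystem work that happens before stats collection starts is to watch the process directly with fs_usage (the process name here is a guess for this toolchain):

    # Trace file-system calls made by the frontend while it runs.
    $ sudo fs_usage -w -f filesys swift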

I'm pretty much willing to write this off as some artifact of our FUSE daemon and/or of purging macOS's in-memory fs cache. If I run the same command a second time, without purging anything, the gap shrinks dramatically: /usr/bin/time reports 75.34 sec, and the frontend stats report 72.97 sec. But it's still puzzling that such a significant wall-time gap exists at all, unless something else is going on that makes the reported wall time inaccurate in general.

It's possible, but I agree with you that nothing should be hitting the filesystem here except file lists, an output file map, and response files. Although maybe the stdlib gets imported before performCompile is actually called? I wouldn't think so, though. I definitely think you've demonstrated that your filesystem is doing something.

I think you can avoid the password prompt on every run by running DevToolsSecurity on the command line, if you haven't already.
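
Something like:

    # Enable developer mode so Instruments stops asking for authorization.
    $ sudo DevToolsSecurity -enable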