SourceKitService only using 50% of CPU

Hi :wave:

Context

My machine is a 2019 MacBook Pro with an 8-core CPU, which gives 16 hardware threads. Thus, when I run yes > /dev/null & 8 times, I get 50% CPU utilisation, and when I run it 16 times I get 100% utilisation.
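
For anyone who wants to reproduce that baseline, here is a minimal sketch of the load test. The function names start_load and stop_load and the LOAD_PIDS variable are made up for this post; the only real command involved is yes.

```shell
#!/bin/sh
# PIDs of the busy-loop processes started by start_load (hypothetical helper).
LOAD_PIDS=""

# start_load N: start N copies of `yes > /dev/null` in the background.
start_load() {
    n="$1"
    i=0
    while [ "$i" -lt "$n" ]; do
        yes > /dev/null &
        LOAD_PIDS="$LOAD_PIDS $!"
        i=$((i + 1))
    done
}

# stop_load: terminate every process started by start_load and reap it.
stop_load() {
    for pid in $LOAD_PIDS; do
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true
    done
    LOAD_PIDS=""
}
```

Calling start_load 8, watching the CPU graph for a while, and then calling stop_load should correspond to the 50% case on a machine like mine; start_load 16 to the 100% case.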

The problem

I am running the analyze command from swiftlint. When doing that, I observe the following:

  • The CPU utilization usually stays around 50%, even though swiftlint is using concurrentPerform when analyzing and is correctly starting 16 threads:

    Screenshot 2021-11-29 at 02.01.33

  • There is a single SourceKitService process that takes care of executing all requests

  • I manually changed the source code of swiftlint to start 32 threads, but the CPU utilisation stayed the same: around 50%

Given the above, I am wondering whether the single instance of SourceKitService is wrongly limiting the number of threads it uses to process the requests, and so not taking full advantage of the available computing power.

  • Could someone explain these observations? Is this expected, or is it a bug?
  • Is there any way to tell SourceKitService how many threads to use when handling incoming requests?

Thanks! :raised_hands:


I don't have much to add, but I look forward to someone more familiar with the inner workings of SourceKit sharing their thoughts here.

PS: I can't say for certain, but you may get more engaged reactions from folks who work on SourceKit if you find a way to demonstrate this behavior without involving complex integrations like SwiftLint & SourceKitten.

This is expected. There is one service process per client process.

This may be expected behaviour for sourcekitd depending on which requests are being sent. AST building for most requests is funneled through a serial queue. The only exceptions are the code-completion request, which has its own independent serial queue, and the "index" request, which should run fully in parallel. The serial behaviour is based on the kind of usage sourcekitd gets from clients like editors, where usually there is no reason to want to have two ASTs built in parallel (usually it's better to cancel any existing AST build when a new request comes in for a new edit or for a new document).

If you're using one of the requests currently handled serially, and you think it ought to be done in parallel for your use-case, we could consider some kind of configuration option for that, but it would depend on exactly what you're trying to do.

CC @akyrtzi @rintaro @ahoppen

Thank you very much for the context :sparkles:

Some context about swiftlint: it creates a new thread per analyzed file. This means that when analyzing 20 files, the SourceKitService will be receiving requests from up to 16 different threads at the same time (on my i9 machine described above).

AST building for most requests is funneled through a serial queue

Reading this makes me think that in the case of swiftlint, creating one new instance of SourceKitService per file analyzed may be more performant.

usually there is no reason to want to have two ASTs built in parallel

What if the ASTs are for different files? (I am clueless here, no idea how ASTs are built when a request happens)

If you're using one of the requests currently handled serially, and you think it ought to be done in parallel for your use-case, we could consider some kind of configuration option for that, but it would depend on exactly what you're trying to do.

I think @jpsim can help here, he has more context than me about which requests are being used in SourceKitten :pray:

Reading this makes me think that in the case of swiftlint, creating one new instance of SourceKitService per file analyzed may be more performant.

Launching a new SourceKitService instance per file might duplicate some work, but it would definitely get you more parallelism. One thing you could try is dividing the input files into as many sets as you have CPUs and launching multiple swiftlint sub-processes, each of which has its own SourceKitService.
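
A rough sketch of that sharding idea, in case it helps. The shard_run helper is hypothetical (it is not part of swiftlint); it reads newline-separated paths on stdin, deals them round-robin into the requested number of sets, and runs the given command once per set in parallel. It assumes an xargs that supports -0, which the macOS and GNU versions do.

```shell
#!/bin/sh
# shard_run SHARDS CMD [ARGS...]
# Hypothetical helper: split the paths read from stdin into SHARDS sets and
# run "CMD ARGS <paths of one set>" for each set as a separate process.
shard_run() {
    shards="$1"; shift
    tmpdir=$(mktemp -d) || return 1

    # Deal the input lines round-robin into one list file per shard.
    awk -v n="$shards" -v dir="$tmpdir" \
        '{ f = dir "/shard." ((NR - 1) % n); print > f }'

    # Launch one background process per non-empty shard.
    pids=""
    for list in "$tmpdir"/shard.*; do
        [ -e "$list" ] || continue
        tr '\n' '\0' < "$list" | xargs -0 "$@" &
        pids="$pids $!"
    done

    # Wait for every shard; report failure if any of them failed.
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    rm -rf "$tmpdir"
    return "$status"
}
```

Usage might look something like find Sources -name '*.swift' | shard_run "$(sysctl -n hw.ncpu)" swiftlint analyze --compiler-log-path xcodebuild.log, where the Sources directory and the log file name are placeholders. Whether swiftlint behaves well when several copies run at once (shared caches, interleaved output) is something I haven't verified.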

What if the ASTs are for different files? (I am clueless here, no idea how ASTs are built when a request happens)

We keep old ASTs in a cache but since I can’t think of a scenario where you’re editing multiple files at once, there’s no need to concurrently update multiple ASTs.

I think @jpsim can help here, he has more context than me about which requests are being used in SourceKitten :pray:

I think understanding which requests are being sent would be important. Personally, I’m wondering how you manage to reach ~800% CPU usage. As @blangmuir mentioned, most requests are funneled through two serial queues. You could try running swiftlint with the environment variable SOURCEKIT_SERVICE_LOG=1 set. That should output a log of the SourceKit requests that were sent.
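
Setting the variable could look like the sketch below. The wrapper name run_with_sourcekit_log and the LOG_FILE default are made up for illustration, and it assumes the log ends up on the command's standard output or standard error; if it is written somewhere else, the capture file will only contain the command's normal output.

```shell
#!/bin/sh
# File that receives the command's combined output (hypothetical default name).
LOG_FILE="${LOG_FILE:-sourcekit-requests.log}"

# run_with_sourcekit_log CMD [ARGS...]
# Run CMD with SOURCEKIT_SERVICE_LOG=1 in its environment and save
# everything it prints (stdout and stderr) to $LOG_FILE.
run_with_sourcekit_log() {
    env SOURCEKIT_SERVICE_LOG=1 "$@" > "$LOG_FILE" 2>&1
}
```

For example, run_with_sourcekit_log swiftlint analyze followed by whatever arguments you normally pass, and then look through the file for the request names.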
