Different performance of async code when launched from Terminal vs. the built-in terminal of VS Code

Today I wasted about 6 hours debugging why my benchmark was about 50% slower than expected, only to realise that it runs approximately 1.5 times slower when started from the VS Code terminal. All this time I had been comparing the benchmark executed from VS Code against a baseline recorded from the Terminal app.

I've checked the priority in both cases using ps -l: both run at priority 31 with nice=0.

I'm super puzzled by this, and struggling even to plan how to investigate it.

Might as well be some silly thing like the IO hitting the vscode console being slow or some similarly funny reason? Does the app print a lot?

Meh, you’re saying benchmark so I guess it’s not printing things. I was assuming it may be a system with logging etc

The benchmark involves two actors that create jobs for each other. They visit an incomplete binary tree in which each node on an even level is processed on one actor and each node on an odd level on the other. When a node is processed, its child nodes are scheduled for processing on the other actor. The benchmark measures the time between scheduling the root node and processing the last node. A DispatchGroup is used to wait until the last node is processed (group.enter() is called before scheduling the root node, and group.leave() is called after processing each node). No I/O happens within that time. Results are printed after the measurements.
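
For anyone who wants to try this, here is a minimal sketch of a benchmark with that shape. It is not the original code: the names Worker, visit and runBenchmark are made up, the tree is a complete binary tree rather than an incomplete one, and it assumes group.enter() is called before scheduling every node (not only the root) so that enter and leave calls balance.

```swift
import Dispatch

// Hypothetical stand-in for the two actors in the benchmark.
actor Worker {
    private var processed = 0  // placeholder for per-node work

    // Processes one node, then schedules both children on the other actor.
    func visit(depth: Int, maxDepth: Int, other: Worker, group: DispatchGroup) {
        processed += 1
        if depth < maxDepth {
            for _ in 0..<2 {
                group.enter()  // assumption: one enter per scheduled node
                Task.detached {
                    await other.visit(depth: depth + 1, maxDepth: maxDepth,
                                      other: self, group: group)
                }
            }
        }
        group.leave()  // this node is done
    }
}

// Synchronous on purpose: DispatchGroup.wait() blocks the calling thread,
// so it should not be called from an async context.
func runBenchmark(maxDepth: Int) -> (nodes: Int, nanosPerNode: Double) {
    let even = Worker()
    let odd = Worker()
    let group = DispatchGroup()
    let nodes = (1 << (maxDepth + 1)) - 1  // complete binary tree (simplification)

    let start = DispatchTime.now().uptimeNanoseconds
    group.enter()
    Task.detached {
        await even.visit(depth: 0, maxDepth: maxDepth, other: odd, group: group)
    }
    group.wait()  // returns once the last node has called leave()
    let elapsed = DispatchTime.now().uptimeNanoseconds - start

    return (nodes, Double(elapsed) / Double(nodes))
}
```

Calling runBenchmark(maxDepth: 16) from a plain synchronous main and printing the result only afterwards keeps I/O out of the measured interval, as in the description above.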

So the subsystems that I suspect so far:

  • Memory allocator
  • Scheduling of the jobs by the actor
  • Scheduling of the jobs by global executor (GCD)
  • Logging inside concurrency runtime

I see that VS Code sets MallocNanoZone=0 in the process environment, but adding it to the Terminal environment does not fully reproduce the issue. Other differences in the environment look benign, but I'll check them.

In Terminal with default environment I get performance of about 170ns per node.
In Terminal with MallocNanoZone=0 (or even entire environment from VS Code terminal), I get about 260ns per node.
But in the VS Code terminal, I still get about 320ns per node.
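
To put those three numbers side by side, here is a small sketch of the arithmetic. The variable names are only for illustration; the inputs are the per-node timings quoted above.

```swift
import Foundation

// Per-node timings quoted above, in nanoseconds.
let terminalDefault = 170.0
let terminalWithVSCodeEnv = 260.0
let vsCodeTerminal = 320.0

// Slowdown relative to the default Terminal run.
let envSlowdown = terminalWithVSCodeEnv / terminalDefault   // about 1.53
let totalSlowdown = vsCodeTerminal / terminalDefault        // about 1.88

// Fraction of the total gap that the environment alone accounts for.
let shareFromEnv = (terminalWithVSCodeEnv - terminalDefault)
                 / (vsCodeTerminal - terminalDefault)       // 90 / 150

print(String(format: "env only: %.2fx, VS Code: %.2fx, env share of gap: %.0f%%",
             envSlowdown, totalSlowdown, shareFromEnv * 100))
```

So, on these numbers, the environment explains roughly 60% of the gap, and the remaining 60ns per node comes from something else.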

Aside from process environment, what else can I inspect for differences?
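
To make the environment comparison less error-prone, one option is to have the benchmark dump a snapshot of itself that can be saved from each terminal and compared with diff. A rough sketch, where environmentSnapshot and currentNiceValue are hypothetical helper names and the nice value lookup is limited to Apple platforms:

```swift
import Foundation

/// Returns "KEY=value" lines sorted by key, so two runs can be compared with diff.
func environmentSnapshot(
    _ env: [String: String] = ProcessInfo.processInfo.environment
) -> [String] {
    env.sorted { $0.key < $1.key }.map { "\($0.key)=\($0.value)" }
}

#if canImport(Darwin)
/// Nice value of the current process as reported by getpriority(2).
func currentNiceValue() -> Int32 {
    getpriority(PRIO_PROCESS, 0)
}
#endif

/// Prints the snapshot; call this outside the measured interval.
func printSnapshot() {
    #if canImport(Darwin)
    print("nice=\(currentNiceValue())")
    #endif
    for line in environmentSnapshot() {
        print(line)
    }
}
```

Redirecting the output of printSnapshot() to a file in each terminal and diffing the two files shows exactly which variables differ, rather than relying on eyeballing the output of env.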
