Why are the same passes sometimes registered multiple times in the pass pipeline?

a question occurred to me during this discussion, but i didn't want to lump it in there so as not to detract from the focus of the topic: i was wondering if there is documentation somewhere, or if someone cares to enlighten me, as to why the optimization pass pipeline appears to sometimes register & run the same passes multiple times. the specific context prompting the question was regarding the 'copy propagation' pass, which is added to the pipeline a number of times, at least two of which are in the addFunctionPasses method (here & here). i did read through the 'optimizer design' doc, but it didn't appear to directly address this question.

This is a general property of optimizing compilers. Sometimes A B A C A produces better code than A B C, because C might reveal new optimization opportunities for A and vice versa.

4 Likes

thanks! that makes sense – out of curiosity, how is the ordering and number of repetitions of passes generally determined? is it more of a 'designed' (theoretical) or 'discovered' (empirical) process (i assume it's a mix). e.g. i imagine that something like dead code elimination should generally be run early-ish, since there's no point in optimizing dead code, but don't have much intuition in this domain.

As far as I am aware, optimization pass ordering is such a chaotic process that most scheduling for it is based purely on guesswork and trial and error. Only a very small number of passes have an obvious ordering. This goes for all optimizing compilers, not just LLVM.

2 Likes

Mostly the latter, but in some cases you can prove that it is sufficient to run a certain pass once before another one, etc. There is also an art to designing passes that perform multiple analyses simultaneously in a way that's stronger than running any two passes until fixed point.

For example, suppose you have a constant propagation pass, and a conditional branch simplification pass. If you apply them in any order or any number of times to the below code, you won't eliminate the branch inside the loop:

var x = 0
var y = 0

while someCondition() {
  if x == 0 {
    y = 1
  } else {
    x += 1
  }
}

However, a combined pass (conditional constant propagation) will see that x is always zero, and the branch inside the loop just sets y to 1. A later pass might then move the y = 1 outside of the loop, and even delete the loop (assuming someCondition() has no side effects).

Dead code elimination is a classic example of a pass that you probably want to run multiple times. You might want to get rid of dead code as early as possible as you said, but other optimization passes can leave behind dead code also, so you can't just run it once.

5 Likes