Workflow question: comparing SIL dumps

beccadax · October 23, 2018, 3:26am

This is more a question about how other folks work efficiently on the compiler, rather than a suggestion to change it.

I sometimes need to compare SIL dumps of the same function generated by two different versions of the compiler. Usually, the two have long stretches of code which are recognizably similar, but some instructions or basic blocks are present in one and not the other. Because SIL assigns increasing integers to each value/basic block, when one is missing it throws off everything for the rest of the function, which makes it hard to see the correspondence and understand what's different between them.

Does anyone have a good technique or tool for dealing with this? The solution I imagine would be to run a small tool which increments the numbers of all the instructions/basic blocks above the missing one (for instance, "bb14 on the left doesn't exist on the right, so change its bb14 to bb15, its bb15 to bb16, etc."), but maybe someone has a different suggestion.
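To make the idea concrete, here is a minimal sketch of that renumbering step in Python. The function name `shift_blocks` and its parameters are hypothetical, invented for illustration. It matches any `bbN` token naively, so it would also rewrite a matching token that appears inside a comment or string literal.

```python
import re

# Matches basic-block names such as bb14. The \b anchors keep it from
# matching inside longer identifiers.
BB_RE = re.compile(r"\bbb(\d+)\b")


def shift_blocks(sil_text, first, delta=1):
    """Renumber every bbN with N >= first to bb(N + delta)."""
    def renumber(match):
        n = int(match.group(1))
        return "bb%d" % (n + delta if n >= first else n)
    return BB_RE.sub(renumber, sil_text)


# Hypothetical fragment of the "left" dump: make room for a block that
# exists only on the right by bumping bb14 and everything after it.
left = "bb14:\n  br bb15\n\nbb15:\n  cond_br %3, bb16, bb2\n"
print(shift_blocks(left, 14))
```

Passing `delta=-1` closes a gap instead of opening one.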

Michael_Gottesman · October 23, 2018, 3:53pm

This is an annoying problem. What I do is use sed patterns to strip the numbering before I compare. For instance, if you want to compare raw instruction sequences, it is useful to eliminate all "register" numbers as well as the user comments.

I have never tried this with blocks though.
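As a rough illustration of that normalization, here is a sketch that uses Python's `re` and `difflib` in place of sed. The names `normalize` and `diff_normalized` are hypothetical, and the patterns are assumptions about what the dumps look like: values are written `%N` and comments start with `//`. A `//` inside a string literal would be cut off as well, so treat this as a starting point rather than a robust tool.

```python
import difflib
import re

VALUE_RE = re.compile(r"%\d+")        # numbered values such as %12
COMMENT_RE = re.compile(r"\s*//.*$")  # trailing // comments


def normalize(line):
    """Drop the trailing comment and blank out value numbers on one line."""
    line = COMMENT_RE.sub("", line)
    return VALUE_RE.sub("%_", line)


def diff_normalized(left_text, right_text):
    """Return unified-diff lines for two dumps after normalization."""
    left = [normalize(l) for l in left_text.splitlines()]
    right = [normalize(l) for l in right_text.splitlines()]
    return list(difflib.unified_diff(left, right, "left", "right", lineterm=""))
```

The trade-off is that the diff no longer shows which value feeds which instruction, so this only helps when the question is whether the instruction sequences match.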

Andrew_Trick · October 23, 2018, 4:32pm

I've been using this silly Python script, with an Emacs extension, to increment/decrement bb numbers at or above a given number in test cases:

#!/usr/bin/env python

import sys
import re

# Increment by default; decrement when invoked as decrement-bb.py.
incrementBy = 1
if sys.argv[0].endswith("decrement-bb.py"):
    incrementBy = -1

if len(sys.argv) < 2:
    # A parenthesized single-argument print works under Python 2 and 3.
    print("Usage: %s <lowest bb # to increment>" % sys.argv[0])
    sys.exit(1)

lowestBBNum = int(sys.argv[1])

# Block labels at the start of a line, and targets after "br " or ", ".
BBRE1 = re.compile("(?:^|br |, )bb([0-9]+)")
# Labels that follow a colon, for example in "// CHECK: bb1:" lines.
BBRE2 = re.compile(":[ ]+bb([0-9]+):")
for line in sys.stdin:
    matches = []

    for m in BBRE1.finditer(line):
        matches.append((m.start(1), m.end(1), int(m.group(1))))

    for m in BBRE2.finditer(line):
        matches.append((m.start(1), m.end(1), int(m.group(1))))

    # Rewrite right to left so a change in digit count (bb9 -> bb10)
    # cannot invalidate the offsets of matches still to be processed.
    for start, end, bbNum in sorted(matches, reverse=True):
        if bbNum < lowestBBNum:
            continue
        line = line[0:start] + str(bbNum + incrementBy) + line[end:]

    sys.stdout.write(line)

(defvar sil-mode-inc-bb-program-name "inc-bb")
(defvar sil-mode-inc-bb-buffer-name "*inc-bb*")
(defvar sil-mode-inc-bb-script-path "/Users/atrick/work/scripts/llvm/increment-bb.py")

(defvar sil-mode-dec-bb-program-name "dec-bb")
(defvar sil-mode-dec-bb-buffer-name "*dec-bb*")
(defvar sil-mode-dec-bb-script-path "/Users/atrick/work/scripts/llvm/decrement-bb.py")

(defun inc-bb-helper(region-start region-end lowest-bb)
  ;;(interactive "r\nnLowest BB # to increment:")
  ;; Pipe the region through the script and replace it with the output.
  (let ((process-connection-type nil))
    (let ((p (start-process sil-mode-inc-bb-program-name
                            sil-mode-inc-bb-buffer-name
                            python-shell-interpreter
                            sil-mode-inc-bb-script-path
                            (number-to-string lowest-bb))))
      (set-process-sentinel p #'ignore)
      (process-send-region p region-start region-end)
      (process-send-eof p)
      (delete-and-extract-region region-start region-end)
      (while (eq (process-status p) 'run)
        (accept-process-output p))
      (insert-buffer-substring sil-mode-inc-bb-buffer-name)
      (kill-buffer sil-mode-inc-bb-buffer-name))))

(defun inc-bb(lowest-bb)
  (interactive "nLowest BB # to increment:")
  ;; First we need to find the previous '{' and then the next '}'
  (save-mark-and-excursion
    (if (use-region-p)
        (inc-bb-helper (region-beginning) (region-end) lowest-bb)
      (let ((brace-start (search-backward "{"))
            (brace-end (search-forward "}")))
        (inc-bb-helper brace-start brace-end lowest-bb)))))

(defun dec-bb-helper(region-start region-end lowest-bb)
  ;;(interactive "r\nnLowest BB # to decrement:")
  ;; Pipe the region through the script and replace it with the output.
  (let ((process-connection-type nil))
    (let ((p (start-process sil-mode-dec-bb-program-name
                            sil-mode-dec-bb-buffer-name
                            python-shell-interpreter
                            sil-mode-dec-bb-script-path
                            (number-to-string lowest-bb))))
      (set-process-sentinel p #'ignore)
      (process-send-region p region-start region-end)
      (process-send-eof p)
      (delete-and-extract-region region-start region-end)
      (while (eq (process-status p) 'run)
        (accept-process-output p))
      (insert-buffer-substring sil-mode-dec-bb-buffer-name)
      (kill-buffer sil-mode-dec-bb-buffer-name))))

(defun dec-bb(lowest-bb)
  (interactive "nLowest BB # to decrement:")
  ;; First we need to find the previous '{' and then the next '}'
  (save-mark-and-excursion
    (if (use-region-p)
        (dec-bb-helper (region-beginning) (region-end) lowest-bb)
      (let ((brace-start (search-backward "{"))
            (brace-end (search-forward "}")))
        (dec-bb-helper brace-start brace-end lowest-bb)))))

Michael_Gottesman · October 23, 2018, 5:53pm

@beccadax you should also check ./docs/DebuggingTheCompiler. If you find useful information that is not there you should add it for the benefit of all! = ).