Skip to content

Debugging Optimizing Compiler

Dibyendu Majumdar edited this page May 24, 2025 · 15 revisions

Introduction

Some notes on how I debug the compiler.

Problem

The following example failed to run when run through the SSA/optimization pipeline.

func sieve(N: Int)->[Int]
{
    // The main Sieve array
    var ary = new [Int]{len=N,value=0}
    // The primes less than N
    var primes = new [Int]{len=N/2,value=0}
    // Number of primes so far, searching at index p
    var nprimes = 0
    var p=2
    // Find primes while p^2 < N
    while( p*p < N ) {
        // skip marked non-primes
        while( ary[p] ) {
            p = p + 1
        }
        // p is now a prime
        primes[nprimes] = p
        nprimes = nprimes+1
        // Mark out the rest non-primes
        var i = p + p
        while( i < N ) {
            ary[i] = 1
            i = i + p
        }
        p = p + 1
    }

    // Now just collect the remaining primes, no more marking
    while ( p < N ) {
        if( !ary[p] ) {
            primes[nprimes] = p
            nprimes = nprimes + 1
        }
        p = p + 1
    }

    // Copy/shrink the result array
    var rez = new [Int]{len=nprimes,value=0}
    var j = 0
    while( j < nprimes ) {
        rez[j] = primes[j]
        j = j + 1
    }
    return rez
}
func eq(a: [Int], b: [Int], n: Int)->Int
{
    var result = 1
    var i = 0
    while (i < n)
    {
        if (a[i] != b[i])
        {
            result = 0
            break
        }
        i = i + 1
    }
    return result
}

func main()->Int
{
    var rez = sieve(20)
    var expected = new [Int]{2,3,5,7,11,13,17,19}
    return eq(rez,expected,8)
}

Error during execution indicated attempt to access a stack location (in the function frame) where no value was present.

I used debugger to put a break point on the exception and looked at the instruction / call stack. The failure appeared to occur in sieve function.

Below is part of the final IR leading up to the problem line:

L0:
    arg N_0
    %t8_0 = New([Int], len=N_0, initValue=0)
    %t10_0 = N_0/2
    %t9_0 = New([Int], len=%t10_0, initValue=0)
    %t19_0 = 2
    %t15_0 = 0
    goto  L2
L2:
    %t11_0 = %t19_0*%t19_0
    %t12_0 = %t11_0<N_0
    if %t12_0 goto L3 else goto L4
L3:
    %t14_0 = %t24_0
    goto  L5
L5:
    %t13_0 = %t8_0[%t14_0]

The issue above is that %t14_0 is assigned %t24_0 but latter is not defined yet...

The Incremental SSA version failed too, but the error was different, so indicated a different issue.

Clearly the generated IR is incorrect.

My first step was to disable the SCCP passes to see if it made a difference. The same error occurred.

Next step was to create an SSA test case and see if the transform to SSA was correct. I found that there was an issue with the sieve function, when transforming via the Dominator method. Braun's method seemed okay, but given the test using this was also failing, albeit with a different error, this indicated that there was more than just 1 error happening.

My next step was to try to create a small reproducible test case, initially just for the SSA transform.

I noticed that the error was happening with regards to the variable p. This variable is used in two while loops, and some conditional branches inside the loop. I thought maybe that the issue was related to the fact that the variable was interacting badly with arrays, or there was an issue because of its use in two separate loops.

I eliminated the arrays access and saw that the problem still occurred.

I was able to cut down the problem case to the following snippet:

func bug(N: Int)
{
    var p=2
    while( p < N ) {
        if (p) {
            p = p + 1
        }
    }
    while ( p < N ) {
        p = p + 1
    }
}

This generated following under the Dominator based SSA construction:

L0:
    arg N_0
    p_0 = 2
    goto  L2
L2:
    p_1 = phi(p_0, p_5)
    %t2_0 = p_1<N_0
    if %t2_0 goto L3 else goto L4
L3:
    if p_2 goto L5 else goto L6
L5:
    %t3_0 = p_2+1
    p_4 = %t3_0
    goto  L6
L6:
    p_5 = phi(p_2, p_4)
    goto  L2
L4:
    goto  L7
L7:
    p_2 = phi(p_1, p_3)
    %t4_0 = p_2<N_0
    if %t4_0 goto L8 else goto L9
L8:
    %t5_0 = p_2+1
    p_3 = %t5_0
    goto  L7
L9:
    goto  L1
L1:

The issue is in L3 basic block, where p2 is accessed but this is not yet defined.

Braun's method appears to generate correct SSA:

L0:
    arg N
    p = 2
    goto  L2
L2:
    p_1 = phi(p, p_3)
    %t2 = p_1<N
    if %t2 goto L3 else goto L4
L3:
    if p_1 goto L5 else goto L6
L5:
    %t5 = p_1+1
    p_2 = %t5
    goto  L6
L6:
    p_3 = phi(p_1, p_2)
    goto  L2
L4:
    goto  L7
L7:
    p_4 = phi(p_1, p_5)
    %t9 = p_4<N_1
    if %t9 goto L8 else goto L9
L8:
    %t12 = p_4+1
    p_5 = %t12
    goto  L7
L9:
    goto  L1
L1:

Given that even the test cases failed when using Braun's method of SSA construction, I think that more than one error is present. Since we also use a Dominator based method to exit from SSA, potentially this could also be where there are issues, in which case it is worth checking that the Dominator tree is correct for above program.

Control Flow Graph

Here is the control flow graph

bug1-cflow

Dominator Tree

Next up lets look at the Dominator Tree:

pre-ssa-domtree

Well, the DOM tree looks correct to me.

So our next stop is to check the liveness calculation and the Dominance Frontiers.

Dominance Frontiers

The calculated values are:

L0:
L2: L2
L4:
L7: L7
L9:
L1:
L8: L7
L3: L2
L5: L6
L6: L2

To check this we have to look back at the CFG and consult the Dominator Tree for each block. We look for edges going out from a BB's domination region (all blocks dominated by it plus itself).

  • L0 seems correct
  • L2 dominates all blocks except L0; but there is an edge from the dominated region back to L2 (from L6). So this looks correct.
  • L4 dominates L4,L7,L8,L9 and L1, and we have no edges going out from this region, so L4's DF is empty.
  • L7 dominates L7,L8,L9,L1 but we have an edge from L8 back to L7, so L7's DF contains L7.
  • L9 and L1's DF are trivially correct.
  • L8 dominates only itself, but has edge going out to L7, hence L8's DF contains L7.
  • L3 dominates L3,L5,L6, but we have an edge from this region to L2, hence L3's DF contains L2.
  • L5 dominates itself, but has outgoing edge to L6, hence L5's DF contains L6.
  • L6 dominates itself, but has outgoing edge to L2, hence L6's DF contains L2.

In summary the DF calculation looks correct. To doubly verify this, I would like to implement an alternative DF calculator and confirm results are the same.

The original DF calculation was based on description in Engineering a Compiler by Cooper & Torczon. I implemented a second method described by Appel in Modern Compiler Implementation in C - discovering that there is an error in the printed edition, and the book's Errata gives the correction. The two methods agree on the DF sets.

Pre-SSA Liveness

The liveness calculation is as follows

func bug(N: Int)
L0:
    arg N
    p = 2
    goto  L2
    #UEVAR   = {}
    #VARKILL = {N, p}  
    #LIVEIN  = {}
    #LIVEOUT = {N, p}
L2:
    %t2 = p<N
    if %t2 goto L3 else goto L4
    #UEVAR   = {N, p}
    #VARKILL = {%t2}
    #LIVEIN  = {N, p}
    #LIVEOUT = {N, p}
L3:
    if p goto L5 else goto L6
    #UEVAR   = {p}
    #VARKILL = {}
    #LIVEIN  = {N, p}
    #LIVEOUT = {N, p}
L5:
    %t3 = p+1
    p = %t3
    goto  L6
    #UEVAR   = {p}
    #VARKILL = {p, %t3}
    #LIVEIN  = {N, p}
    #LIVEOUT = {N, p}
L6:
    goto  L2
    #UEVAR   = {}
    #VARKILL = {}
    #LIVEIN  = {N, p}
    #LIVEOUT = {N, p}
L4:
    goto  L7
    #UEVAR   = {}
    #VARKILL = {}
    #LIVEIN  = {N, p}
    #LIVEOUT = {N, p}
L7:
    %t4 = p<N
    if %t4 goto L8 else goto L9
    #UEVAR   = {N, p}
    #VARKILL = {%t4}
    #LIVEIN  = {N, p}
    #LIVEOUT = {N, p}
L8:
    %t5 = p+1
    p = %t5
    goto  L7
    #UEVAR   = {p}
    #VARKILL = {p, %t5}
    #LIVEIN  = {N, p}
    #LIVEOUT = {N, p}
L9:
    goto  L1
    #UEVAR   = {}
    #VARKILL = {}
    #LIVEIN  = {}
    #LIVEOUT = {}
L1:
    #UEVAR   = {}
    #VARKILL = {}
    #LIVEIN  = {}
    #LIVEOUT = {}

The liveness calculation looks correct to me.

I was tempted to start this investigation in the SSA transformation step that is responsible for renaming vars, because the bug seems to indicate erroneous renaming. However, it was important to verify that the inputs to the SSA transform step are correct, so above was worthwhile.

SSA Renaming

I finally started to investigate why the variable p was being renamed to p_2 instead of p_1 in Basic Block L3.

It was useful to put a log message in the search() method that walks down the DOM tree.

Searching in block L0
Searching in block L2
Searching in block L4
Searching in block L7
Searching in block L9
Searching in block L1
Searching in block L8
Searching in block L3
Searching in block L5
Searching in block L6

This shows that L3 is visited later - its sibling L4 and the tree below it is visited first. This gave me the hint that perhaps the rename stack for p is not being popped fully as the search() unwinds; so I used the Debuggger to verify what was on the name stack at the point of renaming.

It turned out that this was indeed the case - the stack size was bigger than expected.

I looked at the search method and saw that new name is pushed on the stack in two places:

  • When processing Phi instructions
  • When processing regular instructions that defined vars

However, when popping the stack, only the regular instruction was handled. So that seemed to be the bug.

I modified the stack popping code to this:

        for (Instruction i: block.instructions) {
            if (i.definesVar()) {
                var reg = i.def();
                stacks[reg.nonSSAId()].pop();
            }
            else if (i instanceof Instruction.Phi phi) {
                var reg = phi.value();
                stacks[reg.nonSSAId()].pop();
            }
        }

Reran the test and this appeared to have fixed the issue.

Conclusion

This bug was caused by a choice I made in the IR design; Phi instructions define and use vars, but I kept the interface of Phi distinct to ensure that Phis were not mistaken as regular instructions during Liveness calculation. I also wanted to be deliberate about when/how a Phi was processed versus other instructions. This decision meant that the code that checked that an instruction defines a var or not, did not handle the Phi instruction.

More testing and careful review is necessary to ensure that

  • The same error does not happen in other scenarios
  • Explore merging the distinct interfaces into one to avoid such mistakes
Clone this wiki locally