Skip to content

[BUG] IR is not in strict SSA form #14

@arnoox

Description

@arnoox

Issue: IR is not in strict SSA form — locals can be redefined within a block

Context

The IR builder (crates/herkos/src/ir/builder/) claims SSA form in its comments and doc-strings, but the write side of Wasm locals violates the single-definition rule. This surfaces as a correctness comment in local_cse.rs (lines 111–113) that works around the violation. It also limits all optimizer passes to block-local scope.


Root Cause

In ir/builder/core.rs (translate_function, lines 380–401), one VarId is pre-allocated per Wasm local and stored in local_vars[i]. That slot is reused for every write to the local.

In ir/builder/translate.rs, LocalSet and LocalTee always emit to the same destination VarId:

// LocalSet arm — same VarId every time local i is written
let dest = self.local_vars.get(*local_index as usize).copied();
self.emit(IrInstr::Assign { dest, src: value });

If a loop body writes local.set $i twice, the same VarId is the destination of two instructions — breaking SSA's single-definition invariant.

LocalGet already does the right thing (allocates a fresh VarId per read, line 66–68), so reads are already renamed. Only writes are broken.


Impact

Current workaround

local_cse.rs invalidates stale CSE entries whenever a variable is redefined within a block (lines 120–123). This is correct but adds O(entries) overhead per definition and silently caps CSE to straight-line sequences with no re-definition.

Optimization ceiling

Every pass that could benefit from a single-definition guarantee falls back to conservative block-local analysis:

Pass Current scope Possible with SSA
CSE intra-block, with invalidation global (across blocks)
Const propagation intra-block global through join points
Copy propagation intra-block global
Dead store elim not yet implemented trivially: def with no uses
Strength reduction not yet implemented trivially: pattern on defs

Proposed Fix

Step 1 — Add IrInstr::Phi to ir/types.rs

/// SSA phi node: at a join point, select the reaching definition.
Phi {
    dest: VarId,
    /// (predecessor block, value from that predecessor)
    srcs: Vec<(BlockId, VarId)>,
},

Step 2 — Rename on LocalSet/LocalTee in ir/builder/translate.rs

Instead of reusing local_vars[i], allocate a fresh VarId and update the slot:

Operator::LocalSet { local_index } => {
    let value = self.value_stack.pop()?;
    let dest = self.new_var();                         // fresh name
    self.emit(IrInstr::Assign { dest, src: value });
    self.local_vars[*local_index as usize] = dest;    // update reaching def
}

This gives correct SSA within straight-line code immediately.

Step 3 — Insert phi nodes at join points in ir/builder/core.rs

At push_control/pop_control (block and loop ends), for each local whose local_vars[i] differs across predecessors, insert a Phi instruction in the join block and update local_vars[i] to point to the phi's dest.

The recommended algorithm is Braun et al. 2013 ("Simple and Efficient Construction of Static Single Assignment Form"). It is well-suited to the existing streaming translation style:

  • No pre-pass to compute dominance frontiers
  • Phis are inserted lazily and pruned if they turn out to be trivial
  • Works block-by-block, fitting the current IrBuilder state machine

Step 4 — Update all optimizer passes

Remove the CSE invalidation workaround (lines 111–123 in local_cse.rs). Update const_prop, copy_prop, dead_instrs, and any new passes to handle (or skip) Phi nodes. Phi nodes are pure and eligible for CSE/const-prop if all sources are identical.

Step 5 — Lower phis before codegen

Add an SSA-destruction pass (trivial: each Phi { dest, srcs } becomes parallel assignments on each predecessor edge, then let dest = src in the join block). The existing result_var mechanism in ControlFrame is already doing this manually for block results — the new pass generalizes it.


Files Affected

File Change
crates/herkos/src/ir/types.rs Add IrInstr::Phi variant
crates/herkos/src/ir/builder/translate.rs Rename on LocalSet/LocalTee
crates/herkos/src/ir/builder/core.rs Phi insertion at join points
crates/herkos/src/optimizer/local_cse.rs Remove invalidation workaround
crates/herkos/src/optimizer/const_prop.rs Handle Phi nodes
crates/herkos/src/optimizer/copy_prop.rs Handle Phi nodes
crates/herkos/src/optimizer/dead_instrs.rs Handle Phi nodes
crates/herkos/src/optimizer/utils.rs instr_dest for Phi
crates/herkos/src/codegen/ Lower phis before emission

Non-Goals

  • This does not change the output semantics — transpiled Rust is unaffected
  • WASI, memory model, or runtime crate are untouched
  • No new dependencies needed (Braun et al. is implementable in ~150 lines of Rust)

References

  • Braun et al. (2013): Simple and Efficient Construction of Static Single Assignment Form — CC 2013. doi:10.1007/978-3-642-37051-9_6
  • Current workaround: crates/herkos/src/optimizer/local_cse.rs, lines 111–123
  • Local variable allocation: crates/herkos/src/ir/builder/core.rs, lines 380–401
  • LocalSet translation: crates/herkos/src/ir/builder/translate.rs, lines 71–86

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions