-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Issue: IR is not in strict SSA form — locals can be redefined within a block
Context
The IR builder (crates/herkos/src/ir/builder/) claims SSA form in its comments and doc-strings, but the write side of Wasm locals violates the single-definition rule. This surfaces as a correctness comment in local_cse.rs (lines 111–113) that works around the violation. It also limits all optimizer passes to block-local scope.
Root Cause
In ir/builder/core.rs (translate_function, lines 380–401), one VarId is pre-allocated per Wasm local and stored in local_vars[i]. That slot is reused for every write to the local.
In ir/builder/translate.rs, LocalSet and LocalTee always emit to the same destination VarId:
// LocalSet arm — same VarId every time local i is written
let dest = self.local_vars.get(*local_index as usize).copied();
self.emit(IrInstr::Assign { dest, src: value });If a loop body writes local.set $i twice, the same VarId is the destination of two instructions — breaking SSA's single-definition invariant.
LocalGet already does the right thing (allocates a fresh VarId per read, line 66–68), so reads are already renamed. Only writes are broken.
Impact
Current workaround
local_cse.rs invalidates stale CSE entries whenever a variable is redefined within a block (lines 120–123). This is correct but adds O(entries) overhead per definition and silently caps CSE to straight-line sequences with no re-definition.
Optimization ceiling
Every pass that could benefit from a single-definition guarantee falls back to conservative block-local analysis:
| Pass | Current scope | Possible with SSA |
|---|---|---|
| CSE | intra-block, with invalidation | global (across blocks) |
| Const propagation | intra-block | global through join points |
| Copy propagation | intra-block | global |
| Dead store elim | not yet implemented | trivially: def with no uses |
| Strength reduction | not yet implemented | trivially: pattern on defs |
Proposed Fix
Step 1 — Add IrInstr::Phi to ir/types.rs
/// SSA phi node: at a join point, select the reaching definition.
Phi {
dest: VarId,
/// (predecessor block, value from that predecessor)
srcs: Vec<(BlockId, VarId)>,
},Step 2 — Rename on LocalSet/LocalTee in ir/builder/translate.rs
Instead of reusing local_vars[i], allocate a fresh VarId and update the slot:
Operator::LocalSet { local_index } => {
let value = self.value_stack.pop()?;
let dest = self.new_var(); // fresh name
self.emit(IrInstr::Assign { dest, src: value });
self.local_vars[*local_index as usize] = dest; // update reaching def
}This gives correct SSA within straight-line code immediately.
Step 3 — Insert phi nodes at join points in ir/builder/core.rs
At push_control/pop_control (block and loop ends), for each local whose local_vars[i] differs across predecessors, insert a Phi instruction in the join block and update local_vars[i] to point to the phi's dest.
The recommended algorithm is Braun et al. 2013 ("Simple and Efficient Construction of Static Single Assignment Form"). It is well-suited to the existing streaming translation style:
- No pre-pass to compute dominance frontiers
- Phis are inserted lazily and pruned if they turn out to be trivial
- Works block-by-block, fitting the current
IrBuilderstate machine
Step 4 — Update all optimizer passes
Remove the CSE invalidation workaround (lines 111–123 in local_cse.rs). Update const_prop, copy_prop, dead_instrs, and any new passes to handle (or skip) Phi nodes. Phi nodes are pure and eligible for CSE/const-prop if all sources are identical.
Step 5 — Lower phis before codegen
Add an SSA-destruction pass (trivial: each Phi { dest, srcs } becomes parallel assignments on each predecessor edge, then let dest = src in the join block). The existing result_var mechanism in ControlFrame is already doing this manually for block results — the new pass generalizes it.
Files Affected
| File | Change |
|---|---|
crates/herkos/src/ir/types.rs |
Add IrInstr::Phi variant |
crates/herkos/src/ir/builder/translate.rs |
Rename on LocalSet/LocalTee |
crates/herkos/src/ir/builder/core.rs |
Phi insertion at join points |
crates/herkos/src/optimizer/local_cse.rs |
Remove invalidation workaround |
crates/herkos/src/optimizer/const_prop.rs |
Handle Phi nodes |
crates/herkos/src/optimizer/copy_prop.rs |
Handle Phi nodes |
crates/herkos/src/optimizer/dead_instrs.rs |
Handle Phi nodes |
crates/herkos/src/optimizer/utils.rs |
instr_dest for Phi |
crates/herkos/src/codegen/ |
Lower phis before emission |
Non-Goals
- This does not change the output semantics — transpiled Rust is unaffected
- WASI, memory model, or runtime crate are untouched
- No new dependencies needed (Braun et al. is implementable in ~150 lines of Rust)
References
- Braun et al. (2013): Simple and Efficient Construction of Static Single Assignment Form — CC 2013. doi:10.1007/978-3-642-37051-9_6
- Current workaround:
crates/herkos/src/optimizer/local_cse.rs, lines 111–123 - Local variable allocation:
crates/herkos/src/ir/builder/core.rs, lines 380–401 LocalSettranslation:crates/herkos/src/ir/builder/translate.rs, lines 71–86