Skip to content

Revert label propagation: restore DisjointSetsAsm for correctness#137

Merged
ekg merged 1 commit intorust-2from
revert-label-prop
Mar 27, 2026
Merged

Revert label propagation: restore DisjointSetsAsm for correctness#137
ekg merged 1 commit intorust-2from
revert-label-prop

Conversation

@ekg
Copy link
Copy Markdown
Collaborator

@ekg ekg commented Mar 27, 2026

Summary

The Shiloach-Vishkin label propagation introduced in PR #135 has a race condition that causes incorrect equivalence classes on larger inputs. Specifically, parent[max(la,lb)] = min(la,lb) with Relaxed ordering can link unrelated positions when la/lb are stale, causing base mismatches in write_graph_chunk.

This reverts to DisjointSetsAsm (lock-free CAS-based union-find) for correctness while keeping all other optimizations from #135:

  • Spanning tree BFS for fast discovery
  • Flood-fill orphan recovery
  • Per-sequence iitree queries
  • Adjacency list for spanning tree lookup

The bug

Base mismatch in disjoint set 34548: expected 'G' (from first member),
got fwd='T' rev='A' at global_pos=39029, seq_id=1.

Reproduced on USP14 region (chr18:68526-280967, 527 sequences, 212kb).

Test results

Region Before (label prop) After (DisjointSetsAsm)
MHC (466 seqs) 2m17s, correct 4m12s, correct
USP14 (527 seqs) CRASH 20m00s, correct

The Shiloach-Vishkin label propagation had a race condition where
parent[max(la,lb)] = min(la,lb) could link unrelated positions when
la/lb were stale. This caused base mismatches in write_graph_chunk
on larger inputs (e.g., USP14 region with 527 sequences, 212kb).

Revert to DisjointSetsAsm (lock-free CAS-based union-find) which
is proven correct. Keep all other optimizations: spanning tree BFS,
orphan recovery, per-sequence iitree queries, adjacency list.

MHC (466 seqs): 4m12s, correct (2727 nodes, 52047bp)
USP14 (527 seqs): 19m60s, correct (10199 nodes, 236011bp) — was crashing
@ekg ekg merged commit b65a7e0 into rust-2 Mar 27, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant