Refactor clustering to use compressed memberships#456
ocg-goodfire wants to merge 4 commits into `dev` from …
Conversation
@claude Please review this. Mostly laser-focus on correctness - this should be an interface-preserving efficiency improvement.
**Claude finished @ocg-goodfire's task — Code Review Completed**
**Code Review Summary**

This is a solid efficiency improvement that successfully introduces hybrid sparse/bitset compressed memberships while preserving the existing dense clustering interface. The refactor follows good engineering practices with clean separation of concerns.

✅ **Strengths**
- Well-designed core data structure
- Excellent interface preservation
- Smart orchestration
- Efficient memory management

🔍 **Areas to Verify**
- Bitwise operations correctness
- Intersection counting across representations
- Union operations

📊 **Performance Characteristics**
- Memory efficiency gains
- Computational complexity

🎯 **Correctness Assessment**
- **Mathematics:** All set operations (intersection, union) are implemented correctly, with proper sorted-array merging and bit manipulation.
- **Edge cases:** Code handles empty sets, single elements, and memory-optimized representation switching properly.
- **Interface contracts:** All function signatures and return types preserved. Error conditions and assertions maintained.
- **Multi-batch handling:** The …

🔬 **Minor Suggestions**

⚡ **Final Assessment**

This PR successfully delivers what it promises: an interface-preserving efficiency improvement. The compressed membership approach is mathematically sound, well-implemented, and properly integrated. The hybrid sparse/bitset design is clever and should provide significant memory savings for typical clustering workloads.

**Recommendation:** ✅ Approve. This is high-quality code that maintains correctness while delivering meaningful performance improvements.
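The hybrid sparse/bitset design praised above can be sketched as follows. This is an illustrative stand-in, not the PR's actual code: the class name, the `SPARSE_LIMIT` cutoff, and the promotion rule are all assumptions; it only shows the general technique of storing small sets as sorted index lists and large sets as bitmasks, with intersection counting across both representations.

```python
# Hypothetical sketch of a hybrid sparse/bitset membership set.
# SPARSE_LIMIT and all names are illustrative, not the PR's API.

SPARSE_LIMIT = 4  # assumed cutoff for switching to a bitset


class Membership:
    """Small sets: sorted index list. Large sets: Python-int bitmask."""

    def __init__(self, indices):
        indices = sorted(set(indices))
        if len(indices) <= SPARSE_LIMIT:
            self.kind, self.data = "sparse", indices
        else:
            bits = 0
            for i in indices:
                bits |= 1 << i
            self.kind, self.data = "bitset", bits

    def _as_bits(self):
        if self.kind == "bitset":
            return self.data
        bits = 0
        for i in self.data:
            bits |= 1 << i
        return bits

    def intersection_count(self, other):
        if self.kind == "sparse" and other.kind == "sparse":
            # classic sorted-array merge: O(len(a) + len(b))
            a, b = self.data, other.data
            i = j = count = 0
            while i < len(a) and j < len(b):
                if a[i] == b[j]:
                    count += 1
                    i += 1
                    j += 1
                elif a[i] < b[j]:
                    i += 1
                else:
                    j += 1
            return count
        # mixed or bitset/bitset: AND the masks and popcount
        return bin(self._as_bits() & other._as_bits()).count("1")


a = Membership([1, 2, 5])           # stays sparse
b = Membership([0, 1, 2, 3, 5, 7])  # promoted to a bitset
print(a.intersection_count(b))      # → 3
```

The design choice this illustrates: sorted-array merging is fast and compact for small groups, while the bitmask AND-plus-popcount path keeps large-group intersections cheap regardless of set size.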
@claude thanks for your review. Can you please review the 2 new commits since? Again, mostly concerned with correctness.
**Claude finished @ocg-goodfire's task — Review of 2 New Commits Since Previous Review**
**Summary**

Both commits look mathematically sound and correctly implemented. The key optimization introduces a …

**Commit Analysis**

1. …
@claude sorry, one more: how's that last commit look?
**Claude finished @ocg-goodfire's task — Review of Latest Commit**

[entirely by codex 5.4]
Summary
- … `n_samples`, and keep the old dense path as a fallback when `activation_threshold` is `None`
- … `numba` kernel over a cached sample-by-component CSR matrix

What Changed
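A pure-Python stand-in for the row-oriented overlap kernel over a cached sample-by-component CSR matrix might look like the sketch below. The real kernel is `numba`-compiled; the function name, the CSR arrays, and the group-assignment input here are illustrative assumptions used only to show the row-oriented counting pattern.

```python
# Illustrative stand-in for a row-oriented group-overlap kernel over a
# CSR sample-by-component matrix (the PR's real kernel is numba-compiled).
from collections import Counter
from itertools import combinations


def group_overlaps(indptr, indices, groups):
    """Count the samples in which each pair of groups is jointly active.

    indptr/indices: CSR layout of a boolean sample-by-component matrix.
    groups: current component -> group assignment.
    """
    overlaps = Counter()
    for row in range(len(indptr) - 1):
        # map this sample's active components to their current groups
        active = sorted({groups[c] for c in indices[indptr[row]:indptr[row + 1]]})
        for g1, g2 in combinations(active, 2):
            overlaps[(g1, g2)] += 1
    return overlaps


# 3 samples x 4 components; components 0,1 in group 0, components 2,3 in group 1
indptr = [0, 2, 4, 5]
indices = [0, 2, 1, 3, 0]  # active components, sample by sample
groups = [0, 0, 1, 1]
print(group_overlaps(indptr, indices, groups))  # groups 0 and 1 co-occur in 2 samples
```

The point of iterating rows rather than memberships is that each sample is touched once per update, instead of intersecting the merged group's membership against every other group.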
The original exact thresholded clustering path materialized dense sample-by-group state and then recomputed merged-group overlaps with Python-level set operations. That made memory scale badly with token count, and once memory was fixed, the per-merge overlap update became the main runtime bottleneck.
This PR changes the exact thresholded path in two stages:
1. … `X.T @ X` … `n_samples` … `compute_merge_costs()` runs there
2. … `numba`-compiled row-oriented kernel instead of doing Python-level membership intersections against every current group

Semantics are still exact for the thresholded path. The new row-oriented overlap update computes the same counts as the old direct membership-intersection path; it just does it with compiled array code over the original component activity matrix and current group assignments.
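The `X.T @ X` step relies on a standard identity: for a boolean sample-by-component matrix `X`, entry `(i, j)` of `X.T @ X` is the number of samples in which components `i` and `j` are both active. A minimal numpy illustration (the data here is made up):

```python
import numpy as np

# 3 samples x 3 components; 1 means the component is active in that sample
X = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]], dtype=np.int64)

overlap = X.T @ X  # overlap[i, j] = co-activation count of components i and j
print(overlap[0, 1])  # components 0 and 1 are co-active in 1 sample
print(overlap[0, 0])  # the diagonal gives each component's own activity count
```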
Performance Notes
Jose-like synthetic benchmark, matching the old real-run scale after dead filtering:
- `500,000` sampled tokens
- `9,981` alive components
- `l0 ~= 200`

Measured progression:
- `1.90 s/iter` …
- `0.66 s/iter` …
- `numba` overlap update: about `0.095 s/iter`

Current Jose-like exact runtime at that scale:
- `2.5 s` …
- `75 s` …
- `0.095 s/iter` over `~9100` iterations: about `16 minutes` merge-side total at `500k` sampled tokens

That means the exact merge kernel is now roughly:
- `~20x` faster than the original compressed path
- `~6.9x` faster than the previous branch state before the row-oriented update

On the memory side, the compressed membership representation already reduced token-scaling memory dramatically versus the old dense thresholded path, so the current branch materially improves both:
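The quoted speedup factors check out against the per-iteration timings listed above:

```python
# Sanity-check the speedup figures from the quoted per-iteration timings.
original = 1.90   # s/iter, original compressed path
previous = 0.66   # s/iter, branch state before the row-oriented update
current = 0.095   # s/iter, numba overlap update

print(round(original / current, 1))  # → 20.0 (~20x)
print(round(previous / current, 1))  # → 6.9 (~6.9x)
```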
Validation
- `python3 -m compileall spd/clustering`
- `uv run pytest tests/clustering/test_merge_integration.py tests/clustering/test_filter_dead_components.py tests/clustering/test_merge_pair_samplers.py`
- `uv run pytest tests/clustering/test_merge_integration.py -q`