Fix #444: Add pre-allocation option for MCMC strategy to prevent OOM#891
Open
eladerez wants to merge 2 commits into nerfstudio-project:main from
Conversation
…gy to prevent OOM

- Add optional flag to MCMCStrategy (default: False)
- Pre-allocate buffers to cap_max size to avoid torch.cat memory spikes
- Track active Gaussians separately from buffer size with n_active
- Modify sample_add() to write into pre-allocated buffer instead of concat
- Add checkpoint support for n_active state
- Memory reduction: 6.4% at 1M Gaussians (1.216GB vs 1.299GB baseline)
- Prevents OOM crashes at 18M+ Gaussians while maintaining compatibility

This is an opt-in feature that eliminates memory fragmentation from repeated torch.cat operations during MCMC densification. Users can enable it with the --strategy.preallocate flag when using the MCMC strategy.
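The write-into-buffer idea from this commit can be sketched in a few lines. This is a minimal illustration, not the PR's actual code: `buffer`, `n_active`, and `add_gaussians` are hypothetical names standing in for the strategy's pre-allocated parameter tensors and active-count tracking.

```python
# Sketch of the pre-allocation idea: allocate a cap_max-sized buffer once
# and write new Gaussians into the next free rows, tracking the live count
# in n_active, instead of torch.cat re-allocating the tensor each time.
# All names here are illustrative, not the PR's API.
import torch

cap_max = 1_000     # maximum number of Gaussians the buffer can ever hold
dim = 3             # e.g. xyz means; real splats carry several such tensors

buffer = torch.zeros(cap_max, dim)   # allocated once, never re-allocated
n_active = 0                         # how many rows are actually live

def add_gaussians(new: torch.Tensor) -> None:
    """Write new Gaussians into pre-allocated slots instead of torch.cat."""
    global n_active
    n = new.shape[0]
    assert n_active + n <= cap_max, "would exceed cap_max"
    buffer[n_active:n_active + n] = new
    n_active += n

add_gaussians(torch.randn(50, dim))
add_gaussians(torch.randn(25, dim))
# The live Gaussians are buffer[:n_active]; the storage never moved.
```

Because the buffer's storage address never changes, repeated densification steps produce no allocator churn, which is what removes the fragmentation-driven OOM.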
…lice via param_groups

The original preallocate fix allocated cap_max-sized parameter tensors upfront but let each optimizer track the full buffer. This caused optimizer.step() to process all cap_max elements every iteration: a 20x+ overhead when cap_max=1_000_000 and n_initial≈50k.

Fix:
- Add grow_active_params() in ops.py: creates a narrow Parameter view into splats[:n_active] and migrates Adam momentum tensors (zero-pad new rows) when the active count grows.
- Wire each optimizer to track the active-slice Parameter at init time (simple_trainer.py), so optimizer.step() only touches n_active rows.
- Grow the active slice (and optimizer state) inside _add_new_gs after sample_add writes new Gaussians into the pre-allocated buffer.
- Use active_params directly in _relocate_gs and noise injection, removing the per-step temporary Parameter allocation.

Timing: cap_max=100_000 vs cap_max=500 with identical n_active=200 shows a 1.01x ratio (vs ~200x before), confirmed by the new test_mcmc_preallocate_time_independent_of_cap_max test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
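The active-slice mechanism this commit describes can be sketched as follows. This is a hedged approximation of the approach, not the PR's grow_active_params() itself: `make_active_param` and `grow_active` are hypothetical names, and the real code migrates state for every splat tensor, not just one.

```python
# Sketch of the active-slice idea: the optimizer is pointed at a narrow
# Parameter over buffer[:n_active], so optimizer.step() only touches live
# rows. When n_active grows, Adam's momentum buffers are zero-padded for
# the new rows and re-attached to the new, larger slice.
# Names are illustrative, not the PR's API.
import torch

cap_max, dim = 1_000, 3
buffer = torch.zeros(cap_max, dim)
n_active = 200

def make_active_param(n: int) -> torch.nn.Parameter:
    # narrow() returns a view, so the Parameter shares storage with buffer
    return torch.nn.Parameter(buffer.narrow(0, 0, n))

active = make_active_param(n_active)
opt = torch.optim.Adam([active], lr=1e-2)

def grow_active(new_n: int) -> None:
    """Re-point the optimizer at a larger slice, migrating Adam state."""
    global active, n_active
    state = opt.state.pop(active, {})
    pad = new_n - n_active
    # zero-pad Adam's momentum tensors for the newly activated rows
    for key in ("exp_avg", "exp_avg_sq"):
        if key in state:
            state[key] = torch.cat([state[key], torch.zeros(pad, dim)])
    new_param = make_active_param(new_n)
    opt.state[new_param] = state
    opt.param_groups[0]["params"] = [new_param]
    active, n_active = new_param, new_n

# one optimizer step so Adam materializes its momentum state, then grow
(active ** 2).sum().backward()
opt.step()
grow_active(300)
```

Since the step cost now scales with n_active rather than cap_max, timing becomes independent of the buffer capacity, matching the 1.01x ratio the commit reports.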
Author

I've submitted PR #891 that addresses this with pre-allocated buffers. Would appreciate feedback.
The problem
MCMC training crashes with OOM errors at ~18M Gaussians due to memory fragmentation from repeated torch.cat operations during the densification stage.
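The growth pattern behind the fragmentation can be shown in miniature. A hypothetical sketch of the pattern being replaced, with a single `means` tensor standing in for the full set of splat parameters:

```python
# Illustrative pattern the PR removes: growing the Gaussian tensors via
# torch.cat at every densification step allocates a fresh, larger tensor
# and abandons the old one, fragmenting the allocator's memory pool.
import torch

means = torch.randn(100, 3)
for _ in range(5):
    new = torch.randn(10, 3)
    means = torch.cat([means, new], dim=0)  # fresh allocation each time
```

At millions of Gaussians, each such step briefly needs both the old and new tensors resident, which is where the OOM spikes come from.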
Solution:
Added an optional `preallocate` flag to `MCMCStrategy` that:
Pre-allocates buffers to `cap_max` size at initialization
Writes new Gaussians into pre-allocated slots instead of concatenating
Changes:
gsplat/strategy/mcmc.py: Added `preallocate` field and `n_active` state tracking
gsplat/strategy/ops.py: Modified sample_add() to support buffer writes
examples/simple_trainer.py: Buffer management and checkpoint support
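The checkpoint support listed above has one subtlety worth illustrating: with a pre-allocated buffer, the live count must be saved alongside the parameters. A minimal sketch, assuming a dict-style checkpoint; the key names here are illustrative, not the PR's actual checkpoint schema:

```python
# Sketch of checkpointing a pre-allocated buffer: n_active is saved with
# the tensors, and only buffer[:n_active] is meaningful after reload.
# Key names ("means", "n_active") are illustrative.
import os
import tempfile
import torch

cap_max = 1_000
buffer = torch.zeros(cap_max, 3)
n_active = 75

path = os.path.join(tempfile.mkdtemp(), "ckpt.pt")
torch.save({"means": buffer, "n_active": n_active}, path)

ckpt = torch.load(path)
restored = ckpt["means"][: ckpt["n_active"]]  # live Gaussians only
```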
Performance:
Memory: 6.4% reduction at 1M Gaussians (1.216GB vs 1.299GB baseline)
Trade-off: ~24% slower rendering (opt-in feature, disabled by default)
Benefit: Prevents OOM at 18M+ Gaussians scale
Testing:
All strategy tests passing (2/2)
How to run:
python examples/simple_trainer.py mcmc --strategy.preallocate