Add coefficient scheduling (warmup + anneal) to importance minimality loss by Antovigo · Pull Request #439 · goodfire-ai/spd

Antovigo · 2026-03-12T05:48:00Z

Description

Add coefficient scheduling to the importance minimality loss. Four new config fields on ImportanceMinimalityLossConfig:

coeff_peak_multiplier: multiplier applied to the loss coeff at the peak of the schedule
coeff_anneal_start_frac / coeff_anneal_end_frac: fractions of training between which the multiplier is linearly annealed from its peak value back to 1.0
coeff_warmup_frac: fraction of training over which the loss coefficient is linearly ramped from 0 to the peak value (coeff*coeff_peak_multiplier)
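
As a rough illustration, the four fields could combine into a single piecewise-linear schedule along the following lines. This is a sketch, not the code in this PR: the function name `scheduled_coeff`, the `frac` argument (fraction of training completed), the hold phase between warmup and anneal, and the default values are all assumptions.

```python
def scheduled_coeff(
    coeff: float,
    frac: float,
    coeff_peak_multiplier: float = 1.0,
    coeff_warmup_frac: float = 0.0,
    coeff_anneal_start_frac: float = 1.0,
    coeff_anneal_end_frac: float = 1.0,
) -> float:
    """Hypothetical helper: effective loss coefficient at training fraction `frac` in [0, 1]."""
    peak = coeff * coeff_peak_multiplier

    # Warmup: ramp linearly from 0 to the peak value.
    if coeff_warmup_frac > 0 and frac < coeff_warmup_frac:
        return peak * frac / coeff_warmup_frac

    # Hold at the peak until annealing starts (assumed behavior).
    if frac < coeff_anneal_start_frac:
        return peak

    # After annealing ends, the multiplier is back to 1.0.
    if frac >= coeff_anneal_end_frac:
        return coeff

    # Anneal: interpolate the multiplier linearly from its peak to 1.0.
    t = (frac - coeff_anneal_start_frac) / (coeff_anneal_end_frac - coeff_anneal_start_frac)
    multiplier = coeff_peak_multiplier + t * (1.0 - coeff_peak_multiplier)
    return coeff * multiplier
```

With the assumed defaults (multiplier 1.0, no warmup, anneal window collapsed at the end of training) the function returns `coeff` at every step, which is one way the new fields could leave existing behavior unchanged.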

Also adds a config validator ensuring scheduling fractions are ordered correctly (including for the existing p_anneal_* fields), and moves the p_anneal ordering assertion from the loss function into the validator for consistency.
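
An ordering check of this kind might look roughly like the sketch below. It uses a plain dataclass rather than the project's actual config class, and the class name `SchedulingFracs`, the helper `_check_ordered`, the full `p_anneal_*` field names, the defaults, and the exact ordering rules (in particular, warmup ending before annealing starts) are assumptions for illustration only.

```python
from dataclasses import dataclass


def _check_ordered(names: list[str], values: list[float]) -> None:
    """Raise ValueError unless 0 <= values[0] <= ... <= values[-1] <= 1."""
    bounded = [0.0, *values, 1.0]
    if any(a > b for a, b in zip(bounded, bounded[1:])):
        raise ValueError(f"expected 0 <= {' <= '.join(names)} <= 1, got {values}")


@dataclass
class SchedulingFracs:
    """Hypothetical stand-in for the scheduling fields of the loss config."""

    coeff_warmup_frac: float = 0.0
    coeff_anneal_start_frac: float = 1.0
    coeff_anneal_end_frac: float = 1.0
    p_anneal_start_frac: float = 1.0  # assumed name for an existing p_anneal_* field
    p_anneal_end_frac: float = 1.0  # assumed name for an existing p_anneal_* field

    def __post_init__(self) -> None:
        # Validate at construction time so a bad config fails before training starts.
        _check_ordered(
            ["coeff_warmup_frac", "coeff_anneal_start_frac", "coeff_anneal_end_frac"],
            [self.coeff_warmup_frac, self.coeff_anneal_start_frac, self.coeff_anneal_end_frac],
        )
        _check_ordered(
            ["p_anneal_start_frac", "p_anneal_end_frac"],
            [self.p_anneal_start_frac, self.p_anneal_end_frac],
        )
```

Checking in the config rather than inside the loss function means an invalid schedule is rejected once, when the config is built, instead of on the first loss evaluation.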

Related Issue

NA

Motivation and Context

Allows experimenting with non-constant importance minimality loss weighting — e.g. starting with a stronger sparsity pressure and relaxing it, or warming up the loss gradually.

How Has This Been Tested?

Tested on a pile_llama transformer and on the resid_mlp2 toy model.

Does this PR introduce a breaking change?

No. The default values for all new fields preserve existing behavior.

Antovigo added 2 commits March 11, 2026 15:58

Add coefficient scheduling (warmup + anneal) to importance minimality loss

f003774

Pass coeff scheduling params to importance_minimality_loss_per_element in optim_cis

2115721

Antovigo wants to merge 2 commits into dev from feature/impmin_scheduling
