
feat: add MiniMax M2 (MiniMaxM2ForCausalLM) support #19

Open
scottgl9 wants to merge 1 commit into CerebrasResearch:main from scottgl9:feat/minimax-m2-support

Conversation

@scottgl9

Summary

Adds MODEL_ATTRS entry and observer config for MiniMaxM2ForCausalLM, enabling REAP expert pruning on MiniMax M2.x models (M2, M2.1, M2.5, M2.7).

Cerebras has already published REAP-pruned MiniMax M2 and M2.5 checkpoints on HuggingFace (MiniMax-M2.5-REAP-139B-A10B, MiniMax-M2.5-REAP-172B-A10B), but the corresponding model support was not upstreamed. This patch enables the community to reproduce those results and apply REAP to MiniMax M2.7.

Changes

src/reap/model_util.py

Adds MiniMaxM2ForCausalLM to MODEL_ATTRS:

  • moe_block: "block_sparse_moe" — the MoE attribute on MiniMaxM2DecoderLayer
  • gate_proj: "w1", up_proj: "w3", down_proj: "w2" (MiniMaxM2MLP uses non-standard weight names)
  • router: "gate", num_experts: "num_local_experts"
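A minimal sketch of what the entry described above could look like. The attribute values (`block_sparse_moe`, `w1`/`w3`/`w2`, `gate`, `num_local_experts`) come from this PR; the surrounding dict layout of `MODEL_ATTRS` is an assumption about `src/reap/model_util.py`, not its actual structure:

```python
# Hypothetical sketch of the MODEL_ATTRS entry; only the attribute
# names are taken from the PR, the dict shape is assumed.
MODEL_ATTRS = {
    "MiniMaxM2ForCausalLM": {
        "moe_block": "block_sparse_moe",    # MoE attribute on MiniMaxM2DecoderLayer
        "gate_proj": "w1",                  # non-standard expert weight names
        "up_proj": "w3",
        "down_proj": "w2",
        "router": "gate",
        "num_experts": "num_local_experts",
    },
}

attrs = MODEL_ATTRS["MiniMaxM2ForCausalLM"]
print(attrs["moe_block"], attrs["gate_proj"], attrs["up_proj"], attrs["down_proj"])
```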

src/reap/observer.py

Adds MiniMaxM2MoEObserverHookConfig hooking onto MiniMaxM2SparseMoeBlock. MiniMaxM2Experts exposes .num_experts and .top_k directly, matching the base class defaults — no attribute overrides needed.
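As a rough illustration of the "base-class defaults suffice" point, the config could amount to little more than naming the target block. The base-class and field names below are assumptions about `src/reap/observer.py`, not its real API; only `MiniMaxM2SparseMoeBlock`, `.num_experts`, and `.top_k` come from the PR:

```python
# Hypothetical sketch: the base class and its field names are assumed;
# the point is that MiniMaxM2Experts already exposes .num_experts and
# .top_k under the default attribute names, so only the target MoE
# block class needs to be specified.
from dataclasses import dataclass

@dataclass
class MoETransformerObserverConfig:
    moe_block_class: str = ""
    num_experts_attr: str = "num_experts"  # default matches MiniMaxM2Experts
    top_k_attr: str = "top_k"              # default matches MiniMaxM2Experts

@dataclass
class MiniMaxM2MoEObserverHookConfig(MoETransformerObserverConfig):
    moe_block_class: str = "MiniMaxM2SparseMoeBlock"

cfg = MiniMaxM2MoEObserverHookConfig()
print(cfg.moe_block_class, cfg.num_experts_attr, cfg.top_k_attr)
```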

Verification

Weight name mapping confirmed against:

  • HF modeling_minimax_m2.py (MiniMaxM2MLP class)
  • SGLang minimax_m2.py (ckpt_gate_proj_name=w1, ckpt_down_proj_name=w2, ckpt_up_proj_name=w3)
  • yujiepan/minimax-m2.5-tiny-random model structure printout

Applies to all MiniMax M2 variants sharing the MiniMaxM2ForCausalLM architecture: M2, M2.1, M2.5, M2.7.
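The mapping check can be sketched offline with stand-in objects: the expert MLP attribute names (`w1`/`w3`/`w2`) mirror the PR, while `FakeExpertMLP` is a hypothetical stand-in for the real HF module, used only to show that the mapped attributes resolve:

```python
# Offline sketch of the verification idea: mimic the MiniMaxM2MLP
# attribute layout with a plain object and check that every role in
# the mapping resolves via getattr. FakeExpertMLP is a stand-in, not
# the real HF module.
class FakeExpertMLP:
    def __init__(self):
        self.w1 = "gate_proj weight"  # MiniMaxM2 name for gate_proj
        self.w3 = "up_proj weight"    # MiniMaxM2 name for up_proj
        self.w2 = "down_proj weight"  # MiniMaxM2 name for down_proj

mapping = {"gate_proj": "w1", "up_proj": "w3", "down_proj": "w2"}
mlp = FakeExpertMLP()
resolved = {role: getattr(mlp, attr) for role, attr in mapping.items()}
print(resolved)
```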

Add MODEL_ATTRS entry and observer config for MiniMaxM2ForCausalLM,
enabling REAP expert pruning on MiniMax M2.x models (M2, M2.1, M2.5, M2.7).

MiniMaxM2MLP uses non-standard weight names (w1/w2/w3) instead of
gate_proj/up_proj/down_proj. The MoE block is accessed via
block_sparse_moe on MiniMaxM2DecoderLayer, and the router is gate.

Weight name mapping confirmed against:
- HF modeling_minimax_m2.py (MiniMaxM2MLP class)
- SGLang minimax_m2.py (ckpt_gate_proj_name=w1, ckpt_down_proj_name=w2,
  ckpt_up_proj_name=w3)
- yujiepan/minimax-m2.5-tiny-random model structure printout

MiniMaxM2Experts exposes .num_experts and .top_k directly, matching the
MoETransformerObserverConfig base class defaults — no overrides needed.

Note: Cerebras has published MiniMax-M2 and M2.5 REAP checkpoints on
HuggingFace but the corresponding model support was not upstreamed.
This patch enables the community to reproduce those results.
