Add expert choice routing mode to MoEFeedForward #8

Open
ahmedtaha100 wants to merge 1 commit into google-deepmind:main from ahmedtaha100:expert_choice_routing
@ahmedtaha100

Add expert-choice routing mode for MoEFeedForward

Adds expert-choice routing (Zhou et al., 2022), in which each expert selects its top-C tokens, providing natural load balancing without auxiliary losses.
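For readers unfamiliar with the scheme, the selection step can be sketched roughly as follows. This is a minimal NumPy illustration, not the code in this PR: the function name `expert_choice_route`, the argument `router_logits`, and the softmax-over-experts scoring are assumptions made for the example.

```python
import numpy as np


def expert_choice_route(router_logits, capacity):
    """Each expert picks its `capacity` highest-scoring tokens.

    router_logits: [num_tokens, num_experts] array (hypothetical name).
    Returns (token_idx, gates), both shaped [num_experts, capacity].
    """
    # Softmax over the expert axis gives token-to-expert affinities.
    shifted = router_logits - router_logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Transpose so that each row is one expert's view of all tokens.
    scores = probs.T  # [num_experts, num_tokens]
    # Indices of the top-`capacity` tokens per expert, highest score first.
    token_idx = np.argsort(-scores, axis=-1)[:, :capacity]
    gates = np.take_along_axis(scores, token_idx, axis=-1)
    return token_idx, gates


rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 4))  # 8 tokens, 4 experts
token_idx, gates = expert_choice_route(logits, capacity=4)
```

Every expert processes exactly `capacity` tokens, which is why the load is balanced by construction. A given token may be picked by several experts or by none.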

Changes

  • model_lib.py: routing_mode field, _apply_expert_choice_moe(), restructured apply() with early routing-mode branch
  • config_lib.py: routing_mode in BaseExperimentConfig, lm_moe_test and lm_moe_expert_choice_test configs
  • model_lib_test.py: simple_expert_choice_moe() reference impl, forward/gradient equivalence tests

How to test

# Expert-choice MoE local test
python -m simply.main \
    --experiment_config lm_moe_expert_choice_test \
    --experiment_dir /tmp/moe_ec_test --alsologtostderr

# All MoE unit tests (including existing + new)
pytest simply/model_lib_test.py::MoETest -v

Design decisions

  • Follows _apply_dense_moe dispatch pattern (einsums, not sparse GMM)
  • Capacity: C = num_experts_per_token × num_tokens / num_experts
  • lbl_loss is skipped when routing_mode='expert_choice' (load is balanced by construction, so the auxiliary loss is unnecessary)
  • routing_mode is validated; unknown values raise ValueError
  • Token-choice path is unchanged: all 7 existing simple_moe() equivalence tests pass with the same tolerances as before
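The design points above can be illustrated with a small NumPy sketch. It is not taken from the PR: the helper names `expert_capacity`, `check_routing_mode`, and `dispatch_and_combine` are hypothetical, the `'token_choice'` value name is an assumption (only `'expert_choice'` appears in the description), and floor division for the capacity is also an assumption.

```python
import numpy as np

# Only 'expert_choice' is named in the PR; 'token_choice' is an assumed name.
ROUTING_MODES = ("token_choice", "expert_choice")


def expert_capacity(num_tokens, num_experts, num_experts_per_token):
    # C = num_experts_per_token * num_tokens / num_experts.
    # Floor division is an assumption for the non-divisible case.
    return (num_experts_per_token * num_tokens) // num_experts


def check_routing_mode(routing_mode):
    # Unknown values raise ValueError, as the description states.
    if routing_mode not in ROUTING_MODES:
        raise ValueError(f"Unknown routing_mode: {routing_mode!r}")
    return routing_mode


def dispatch_and_combine(x, token_idx, gates, expert_fn):
    """Dense einsum dispatch: gather tokens per expert, then scatter back.

    x: [num_tokens, dim]; token_idx, gates: [num_experts, capacity].
    expert_fn maps [num_experts, capacity, dim] to the same shape.
    """
    num_tokens = x.shape[0]
    # One-hot dispatch mask of shape [num_experts, capacity, num_tokens].
    dispatch = np.eye(num_tokens, dtype=x.dtype)[token_idx]
    expert_in = np.einsum("ect,td->ecd", dispatch, x)
    expert_out = expert_fn(expert_in)
    # Weight each expert output by its gate and sum back per token.
    return np.einsum("ect,ec,ecd->td", dispatch, gates, expert_out)


x = np.arange(8, dtype=np.float64).reshape(4, 2)  # 4 tokens, dim 2
token_idx = np.array([[0, 1], [1, 2]])            # 2 experts, capacity 2
gates = np.ones((2, 2))
y = dispatch_and_combine(x, token_idx, gates, expert_fn=lambda h: h)
```

With identity experts and unit gates, token 1 (chosen by both experts) is counted twice and token 3 (chosen by neither) comes out as zeros, which shows how unselected tokens are dropped from the MoE output in this kind of dense dispatch.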

@google-cla

google-cla bot commented Feb 13, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@ahmedtaha100
Author

I just applied for the Google CLA.
