
[feat] gpt-oss support v1 #585

Draft
guapisolo wants to merge 5 commits into radixark:main from guapisolo:feat/gpt_oss_fp4

Conversation


guapisolo (Collaborator) commented Feb 11, 2026

Exists:

  • SGLang full gpt-oss inference support
  • Megatron gpt-oss training support (bf16, mxfp4)

Not covered in this PR:

  • fp4 ckpt conversion, which affects:
    • Ckpt load/save. The current mbridge does not support native low-precision ckpt conversion, and Megatron-Bridge cannot do the ckpt conversion directly either.
    • KL loss.
  • Megatron-Bridge needs further investigation.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @guapisolo, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces foundational support for GPT-OSS models within the system, specifically integrating them for SGLang inference and Megatron training. A key enhancement is the introduction of a configurable pad_token_id, which provides greater flexibility in handling token padding. Additionally, new scripts have been added to streamline the setup and execution of the gpt-oss-20b model, facilitating its use in various training and evaluation scenarios.

Highlights

  • GPT-OSS Support: Introduced initial support for GPT-OSS models, enabling both SGLang inference and Megatron training capabilities.
  • Configurable Padding Token ID: Implemented a configurable pad_token_id argument, allowing users to specify the padding token ID during training and inference processes.
  • New Model Scripts: Added dedicated shell scripts for configuring and running the gpt-oss-20b model, including specific architecture and training parameters.
Changelog
  • miles/backends/fsdp_utils/actor.py
    • Updated calls to get_batch within _compute_log_prob and _train_core to pass the newly introduced pad_token_id argument.
  • miles/backends/megatron_utils/model.py
    • Updated calls to get_batch within forward_step to pass the new pad_token_id argument.
  • miles/backends/training_utils/data.py
    • Modified the get_batch function to accept pad_token_id as a parameter.
    • Removed the hardcoded pad_token_id = 0 assignment within the get_batch function.
  • miles/utils/arguments.py
    • Added a new command-line argument --pad-token-id to allow configuration of the padding token ID.
  • scripts/models/gpt-oss-20b.sh
    • Added a new script defining the model architecture arguments for the gpt-oss-20b model.
  • scripts/run-gpt-oss-20b.sh
    • Added a new script to orchestrate the training run for the gpt-oss-20b model, including various performance, optimizer, and SGLang arguments.
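
A usage sketch for the new flag follows; everything here other than --pad-token-id itself, and the old default of 0 removed from get_batch, is an assumed script shape, not code from this PR:

# Hypothetical run-script excerpt: pass the pad token id explicitly.
# 0 reproduces the value that get_batch previously hardcoded.
TRAIN_ARGS=(
   --pad-token-id 0   # set to the actual pad token id of the model's tokenizer
)
python3 train.py "${TRAIN_ARGS[@]}"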
Activity
  • No human activity has occurred on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


gemini-code-assist (bot) left a comment


Code Review

This pull request introduces support for gpt-oss models by making pad_token_id configurable and adding new scripts for gpt-oss-20b. A high-severity security vulnerability was found in scripts/run-gpt-oss-20b.sh due to the exposure of the WandB API key as a command-line argument; it is strongly recommended to use environment variables for secrets. Furthermore, the script scripts/run-gpt-oss-20b.sh requires improvements in process cleanup and handling of undefined variables for robustness.

--use-wandb
--wandb-project miles-mgt-oss
--wandb-group "20b-bf16"
--wandb-key ${WANDB_API_KEY}

Severity: high (security)

The script passes the WandB API key as a command-line argument: --wandb-key ${WANDB_API_KEY}. This is a high-risk practice because command-line arguments are often visible to other users on the same system via process listing commands (e.g., ps aux). Exposing the API key in this manner could lead to unauthorized access to the WandB project, potentially allowing an attacker to view or manipulate experiment data. The WandB API key should be passed through an environment variable, which is more secure than a command-line argument. The wandb library automatically reads the WANDB_API_KEY environment variable. Remove the --wandb-key argument from the script and ensure the WANDB_API_KEY environment variable is set in the execution environment.
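
A minimal sketch of the recommended fix, assuming a plain bash launcher (the surrounding lines are illustrative, not the script's actual contents):

# Provide the key via the environment, e.g. from a CI secret store:
export WANDB_API_KEY="<set outside the repo; never commit it>"

WANDB_ARGS=(
   --use-wandb
   --wandb-project miles-mgt-oss
   --wandb-group "20b-bf16"
   # --wandb-key removed: the wandb client reads WANDB_API_KEY on its own
)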

Comment on lines +4 to +11
pkill -9 sglang
sleep 3
ray stop --force
pkill -9 ray
pkill -9 python
sleep 3
pkill -9 ray
pkill -9 python

Severity: high

The process cleanup logic at the start of the script is overly aggressive and contains redundancies. Using pkill -9 python is particularly risky as it can terminate unrelated Python processes on the system. It's safer to rely on ray stop --force for Ray processes and be more specific with other pkill commands. The repeated commands are also unnecessary.

Suggested change:

- pkill -9 sglang
- sleep 3
- ray stop --force
- pkill -9 ray
- pkill -9 python
- sleep 3
- pkill -9 ray
- pkill -9 python
+ # for rerun the task
+ ray stop --force
+ pkill -f sglang
+ sleep 3
+ pkill -9 -f sglang

# Must use --qkv-format bshd for the fused backend to work with this model's attention pattern.
--qkv-format bshd
--attention-backend fused
)

Severity: high

The EVAL_ARGS array is used in the ray job submit command but is not defined in the script. This will cause evaluation-related arguments to be missed. You should define EVAL_ARGS, even if it's empty, to prevent potential errors and improve script clarity.

Suggested change:

- )
+ )
+ EVAL_ARGS=(
+     # Add evaluation arguments here, e.g.:
+     # --eval-interval 100
+ )
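
For context on the failure mode (the command shape below is an assumption about how the script consumes the array): in bash, expanding an undefined array is not an error unless set -u is active; it simply contributes nothing, so evaluation arguments would be dropped silently rather than loudly.

unset EVAL_ARGS
echo ray job submit -- python3 train.py "${EVAL_ARGS[@]}"   # array silently expands to nothing
EVAL_ARGS=(--eval-interval 100)
echo ray job submit -- python3 train.py "${EVAL_ARGS[@]}"   # now the flag is forwarded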

set -ex

# will prevent ray from buffering stdout/stderr
export PYTHONBUFFERED=16

Severity: medium

The environment variable PYTHONBUFFERED is not a standard way to control Python's output buffering. The correct variable for this purpose is PYTHONUNBUFFERED. Setting it to 1 will disable output buffering, which is generally desired for logging in distributed environments like Ray.

Suggested change:

- export PYTHONBUFFERED=16
+ export PYTHONUNBUFFERED=1
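
A quick way to observe the difference (an illustrative one-liner, not part of the PR): when stdout is a pipe, Python block-buffers, so the first command's lines all appear at once at exit, while with PYTHONUNBUFFERED=1 they stream one per second.

python3 -c 'import time
for i in range(3): print(i); time.sleep(1)' | cat

PYTHONUNBUFFERED=1 python3 -c 'import time
for i in range(3): print(i); time.sleep(1)' | cat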

\"env_vars\": {
\"PYTHONPATH\": \"/root/Megatron-LM/\",
\"CUDA_DEVICE_MAX_CONNECTIONS\": \"1\",
\"NCCL_NVLS_ENABLE\": \"${HAS_NVLINK}\"

Severity: medium

The HAS_NVLINK variable is used without a default value. If this variable is not set in the execution environment, it will expand to an empty string, potentially causing an invalid configuration for NCCL_NVLS_ENABLE. It's safer to provide a default value.

Suggested change:

- \"NCCL_NVLS_ENABLE\": \"${HAS_NVLINK}\"
+ \"NCCL_NVLS_ENABLE\": \"${HAS_NVLINK:-0}\"
