[CI] Reorg test file and fix moonlight oom by guapisolo · Pull Request #593 · radixark/miles

guapisolo · 2026-02-13T00:45:31Z

TODO: Replace the moonlight-16B-A3B test with GLM-4.7-flash after the new sglang docker bump and the #566 merge. GLM-4.7-flash is a 30B model with MLA and MTP.

Update: the new dev docker has been bumped. #566 is still WIP.

gemini-code-assist · 2026-02-13T00:45:58Z

Summary of Changes

Hello @guapisolo, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on refining the continuous integration (CI) test infrastructure by eliminating a significant number of redundant test files and updating configurations for existing tests. The changes aim to improve the efficiency and relevance of the CI pipeline, ensuring that tests reflect current model and hardware setups, and reducing unnecessary overhead. This cleanup is part of an ongoing effort to maintain a robust and optimized testing environment.

Highlights

Test File Deletion: Numerous redundant or outdated end-to-end test files have been removed from the tests/e2e/image and tests/e2e/precision directories, streamlining the test suite.
GPU Configuration Updates: Adjustments were made to GPU allocation parameters (actor-num-gpus-per-node and rollout-num-gpus) in test_qwen2.5_0.5B_gsm8k_async.py for improved resource utilization.
Model Parallelism and Memory Optimization: The test_moonlight_16B_A3B.py and test_moonlight_16B_A3B_r3.py files received updates to their tensor model parallel size and SGLang memory fraction settings, likely optimizing performance for specific hardware configurations.

Changelog

tests/e2e/image/test_mimo_7B_mtp_only_grad.py
- Removed.
tests/e2e/image/test_moonlight_16B_A3B.py
- Removed.
tests/e2e/image/test_quick_start_glm4_9B.py
- Removed.
tests/e2e/image/test_qwen2.5_0.5B_gsm8k.py
- Removed.
tests/e2e/image/test_qwen2.5_0.5B_gsm8k_async.py
- Removed.
tests/e2e/image/test_qwen2.5_0.5B_gsm8k_async_short.py
- Removed.
tests/e2e/image/test_qwen2.5_0.5B_gsm8k_short.py
- Removed.
tests/e2e/image/test_qwen3_0.6B_fsdp_colocated_2xGPU.py
- Removed.
tests/e2e/image/test_qwen3_0.6B_fsdp_distributed.py
- Removed.
tests/e2e/image/test_qwen3_0.6B_megatron_fsdp_align.py
- Removed.
tests/e2e/image/test_qwen3_0.6B_parallel_check.py
- Removed.
tests/e2e/image/test_qwen3_30B_A3B.py
- Removed.
tests/e2e/image/test_qwen3_4B_ckpt.py
- Removed.
tests/e2e/image/test_qwen3_4B_fsdp_true_on_policy.py
- Removed.
tests/e2e/image/test_qwen3_4B_ppo.py
- Removed.
tests/e2e/image/test_qwen3_vl_4B_fsdp.py
- Removed.
tests/e2e/long/test_qwen2.5_0.5B_gsm8k_async.py
- Updated actor-num-gpus-per-node from 4 to 2 and rollout-num-gpus from 4 to 6 when FEW_GPU is true.
tests/e2e/megatron/test_moonlight_16B_A3B.py
- Updated tensor-model-parallel-size from 2 to 4.
- Adjusted sglang-mem-fraction-static to be conditional on TIGHT_HOST_MEMORY.
tests/e2e/megatron/test_moonlight_16B_A3B_r3.py
- Updated tensor-model-parallel-size from 2 to 4.
- Adjusted sglang-mem-fraction-static to be conditional on TIGHT_HOST_MEMORY.
tests/e2e/precision/test_qwen3_0.6B_megatron_fsdp_align.py
- Removed.
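
The GPU reallocation listed for tests/e2e/long/test_qwen2.5_0.5B_gsm8k_async.py can be sketched as follows. This is only an illustration: the flag names and the before/after values come from the changelog entry above, while the helper `gpu_split_args` is hypothetical and is not taken from the test file.

```python
def gpu_split_args(actor_gpus_per_node: int, rollout_gpus: int) -> str:
    """Format the two GPU-allocation flags named in the changelog.

    Hypothetical helper for illustration; the real test builds its
    argument string in its own way.
    """
    return (
        f"--actor-num-gpus-per-node {actor_gpus_per_node} "
        f"--rollout-num-gpus {rollout_gpus} "
    )


# Values for the FEW_GPU case, as stated in the changelog entry.
before = gpu_split_args(actor_gpus_per_node=4, rollout_gpus=4)
after = gpu_split_args(actor_gpus_per_node=2, rollout_gpus=6)

print(before)
print(after)
```

If the actor runs on a single node (an assumption, not stated in the PR), both splits add up to 8 GPUs, so the change shifts two GPUs from training to rollout rather than changing the total.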

Ignored Files

Ignored by pattern: .github/workflows/** (2)
- .github/workflows/pr-test.yml
- .github/workflows/pr-test.yml.j2

Activity

guapisolo initiated this pull request to delete redundant test files and reorganize the CI setup.
The author suggested replacing the moonlight-16B-A3B test with GLM-4.7-flash following a new SGLang docker bump and a merge in the radixark/miles repository.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request primarily reorganizes e2e test files and addresses an out-of-memory (OOM) issue with the moonlight model tests. The changes to fix the OOM, such as increasing tensor parallelism and adjusting sglang's static memory fraction, are appropriate. The GPU reallocation in the qwen async test is also a reasonable configuration adjustment. However, a large number of test files have been removed without corresponding additions in this PR. While this is labeled as a reorganization, the absence of the new files raises concerns about a potential loss of test coverage. It's important to ensure these tests are properly relocated or their functionality is otherwise preserved. I've also left a couple of comments regarding a potentially confusing variable name used for GPU memory configuration.

gemini-code-assist · 2026-02-13T00:48:25Z

tests/e2e/megatron/test_moonlight_16B_A3B.py


    sglang_args = (
-        "--rollout-num-gpus-per-engine 2 " "--sglang-mem-fraction-static 0.8 " "--sglang-max-running-requests 512 "
+        "--rollout-num-gpus-per-engine 2 " f"--sglang-mem-fraction-static {0.7 if TIGHT_HOST_MEMORY else 0.8} " "--sglang-max-running-requests 512 "


The variable TIGHT_HOST_MEMORY is used to configure --sglang-mem-fraction-static, which controls GPU memory allocation. This is confusing because the variable name implies it's related to host (CPU) memory, while the parameter affects device (GPU) memory. Other tests in this repository use a variable named TIGHT_DEVICE_MEMORY for this purpose. For clarity and consistency, it would be better to use a variable that accurately reflects its purpose, like TIGHT_DEVICE_MEMORY. This would involve changing the variable definition on line 5 as well.
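
A minimal sketch of the suggested rename, assuming the flag is read from an environment variable. How the flag is actually defined in the test file is not shown in this PR, so `env_flag` and `build_sglang_args` are hypothetical names; only the argument string and the 0.7 / 0.8 values come from the diff above.

```python
import os


def env_flag(name: str, default: str = "0") -> bool:
    """Interpret an environment variable as a boolean (hypothetical helper)."""
    return os.environ.get(name, default).strip().lower() in {"1", "true", "yes"}


def build_sglang_args(tight_device_memory: bool) -> str:
    """Build the SGLang argument string from the diff above."""
    # 0.7 and 0.8 are the two values in the diff; the lower fraction is
    # selected when GPU (device) memory is tight.
    mem_fraction = 0.7 if tight_device_memory else 0.8
    return (
        "--rollout-num-gpus-per-engine 2 "
        f"--sglang-mem-fraction-static {mem_fraction} "
        "--sglang-max-running-requests 512 "
    )


# Assumption: the flag comes from the environment; the real definition may differ.
TIGHT_DEVICE_MEMORY = env_flag("TIGHT_DEVICE_MEMORY")
sglang_args = build_sglang_args(TIGHT_DEVICE_MEMORY)
```

Passing the flag as a parameter keeps the name tied to what it controls (device memory) and makes both branches easy to check without touching the environment.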

gemini-code-assist · 2026-02-13T00:48:25Z

tests/e2e/megatron/test_moonlight_16B_A3B_r3.py


    sglang_args = (
-        "--rollout-num-gpus-per-engine 2 " "--sglang-mem-fraction-static 0.8 " "--sglang-max-running-requests 512 "
+        "--rollout-num-gpus-per-engine 2 " f"--sglang-mem-fraction-static {0.7 if TIGHT_HOST_MEMORY else 0.8} " "--sglang-max-running-requests 512 "


The variable TIGHT_HOST_MEMORY is used to configure --sglang-mem-fraction-static, which controls GPU memory allocation. This is confusing because the variable name implies it's related to host (CPU) memory, while the parameter affects device (GPU) memory. Other tests in this repository use a variable named TIGHT_DEVICE_MEMORY for this purpose. For clarity and consistency, it would be better to use a variable that accurately reflects its purpose, like TIGHT_DEVICE_MEMORY. This would involve changing the variable definition on line 5 as well.

yushengsu-thu

It looks good, but I have some advice.

Comments on deleting CI under image tags:

For the CI under image tags, I planned to run all tests under those images to validate the dev Docker, even though this may leave duplicate CI in different tags.

If we adopt the PR above:

We may need to update/add tags in the Docker release workflow (it seems this has not been added yet - #573).
I think it’s better to add/update these tags every time we release a new Docker image.

guapisolo requested a review from yushengsu-thu as a code owner February 13, 2026 00:45

guapisolo changed the title ~~[CI] Delete redundant test file and reorg~~ [CI] Reorg test file and fix moonlight oom Feb 13, 2026

gemini-code-assist bot reviewed Feb 13, 2026

guapisolo added 2 commits February 15, 2026 10:23

1st

2f8c8ac

fix 2

8c50455

guapisolo force-pushed the ci/reorg branch from cc47d4b to 8c50455 Compare February 15, 2026 18:23

guapisolo added the run-ci-image label Feb 15, 2026

yushengsu-thu reviewed Feb 16, 2026

yushengsu-thu approved these changes Feb 16, 2026

guapisolo added 2 commits February 16, 2026 05:47

better ci-image impl

bd78fd0

format

8ef732e

guapisolo merged commit eb8d974 into radixark:main Feb 17, 2026
58 of 60 checks passed

guapisolo merged 4 commits into radixark:main from guapisolo:ci/reorg

2 participants
