Add initial Qwen3.5-4B support and SGLang text-runtime patch #39
Open
ShenAC-SAC wants to merge 6 commits into Gen-Verse:main from
Summary
This PR adds initial Qwen3.5 support for the personal OpenClaw training paths. The goal is to wire Qwen3.5 into the current OpenClaw/SLIME/Megatron stack and verify that the model can reach a real startup smoke path in the existing training/runtime flow.
Main Changes
Added Qwen3.5 integration across the current stack:
- slime/slime_plugins/models/qwen3_5.py
- slime/slime_plugins/mbridge/qwen3_5.py
- slime/slime/backends/megatron_utils/megatron_to_hf/qwen3_5.py

Added or updated Qwen3.5 launch scripts:
- slime/scripts/models/qwen3.5-4B.sh
- openclaw-combine/run_qwen35_4b_openclaw_combine.sh
- openclaw-rl/run_qwen35_4b_openclaw_rl.sh
- openclaw-opd/run_qwen35_4b_openclaw_opd.sh

Added repo-side SGLang compatibility work for Qwen3.5 rollout serving:
- slime/slime/backends/sglang_utils/qwen3_5.py
- slime/slime/backends/sglang_utils/sglang_engine.py

Added a low-resource debug-rollout startup fix for PRM-based methods:
- slime/slime/ray/placement_group.py

Dependency / Runtime Notes
Qwen3.5 support does not work with the current repo pin transformers==4.57.1. During isolated validation, Qwen3.5 required a newer Transformers build that provides transformers.models.qwen3_5. The validation environment used:
- transformers==5.2.0

For runtime validation, I also used a newer SGLang-related stack in Docker:
- sglang==0.5.9
- flashinfer-python==0.6.6
- sgl-kernel==0.3.21

For the reduced-GPU combine validation path, the environment also needed:
- numpy<2 (Megatron currently rejects NumPy 2.x)
- pylatexenc
- wandb

I have not updated the repo-wide dependency pin in this PR yet, because I wanted to first confirm the Qwen3.5 path can actually run through the current training/runtime stack before proposing a broader dependency change.
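For reference, the validation environment above can be captured as a pip requirements fragment. This reflects my local setup only, not a proposed repo-wide pin:

```
transformers==5.2.0
sglang==0.5.9
flashinfer-python==0.6.6
sgl-kernel==0.3.21
numpy<2
pylatexenc
wandb
```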
Validation Completed
Completed checks:
- transformers supports qwen3_5
- Qwen3.5-4B openclaw-combine startup smoke with PRM enabled

Single-GPU serving smoke result:
- /health_generate returned 200
- /model_info returned 200
- /generate returned 200

Reduced-GPU combine smoke result:
- --debug-rollout-only startup path
- debug_rollout_only can still reserve PRM GPUs
- language_only -> encoder_urls validation
- /health_generate, /server_info, and /model_info checks
- "policy server ready" / "model is ready" banner

This smoke run was intentionally bounded with a timeout after the stack became ready; it did not stop because of a Qwen3.5 startup crash.
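The single-GPU serving smoke can be scripted. Below is a minimal sketch (the helper name `smoke_check` and the port are my own examples, not part of this PR) that GETs the read-only endpoints and collects status codes; the `/generate` check additionally needs a POST with a prompt payload, which is omitted here:

```python
import urllib.request

def smoke_check(base_url, paths=("/health_generate", "/model_info")):
    """GET each endpoint under base_url and return {path: HTTP status}.

    Hypothetical helper for a serving smoke test; raises on connection
    failure or non-2xx/3xx responses.
    """
    results = {}
    for path in paths:
        req = urllib.request.Request(base_url + path, method="GET")
        with urllib.request.urlopen(req, timeout=10) as resp:
            results[path] = resp.status
    return results
```

Pointing it at an already-running rollout server, e.g. `smoke_check("http://127.0.0.1:30000")`, should report 200 for both paths when the stack is healthy.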
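The debug_rollout_only fix in slime/slime/ray/placement_group.py is only described at a high level above. A hedged sketch of the intended behavior (function and argument names are illustrative, not the actual code) is:

```python
def build_bundles(num_rollout_gpus, num_prm_gpus, debug_rollout_only=False):
    """Sketch: build Ray placement-group bundles for rollout and PRM GPUs.

    Illustrative only. The point of the fix: PRM bundles are reserved even
    when --debug-rollout-only is set, so PRM-based methods can still start
    up on a low-resource debug run.
    """
    bundles = [{"GPU": 1} for _ in range(num_rollout_gpus)]
    if num_prm_gpus > 0:
        # Previously a debug_rollout_only guard skipped these bundles,
        # which broke PRM startup in debug-rollout mode.
        bundles += [{"GPU": 1} for _ in range(num_prm_gpus)]
    return bundles
```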
Remaining Notes
Still not completed in this PR validation:
- openclaw-test was not run here. It depends on a separately running OpenClaw gateway plus an external user-model endpoint/token, which was outside the repo-local Qwen3.5 training validation setup used for this PR.

Why This PR Is Ready For Review
At this point the main repo-side goal is met:
Follow-up work such as broader dependency pinning or more exhaustive training validation can still be done in subsequent PRs if preferred.