feat: add Weaver SDK as third RL training backend #28
Open
void-main wants to merge 4 commits into aiming-lab:main from
Conversation
Integrate nex-agi/weaver as a new RL backend alongside Tinker and MinT. A compatibility adapter (weaver_compat.py) wraps Weaver's synchronous API into the async interface expected by MetaClaw's trainer, so upper-layer code (trainer.py, api_server.py, data_formatter.py) requires no changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
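The sync-to-async wrapping described in this commit could look roughly like the sketch below. It is a minimal illustration, not the contents of weaver_compat.py: `FakeSyncTrainingClient`, `AsyncTrainingClientAdapter`, and the `wait=True` keyword on the fake client are hypothetical stand-ins, and the real Weaver signatures may differ.

```python
import asyncio
import time
from types import SimpleNamespace


class FakeSyncTrainingClient:
    """Hypothetical stand-in for a synchronous SDK client (not Weaver's real API)."""

    def forward_backward(self, batch, wait=True):
        time.sleep(0.01)  # simulate a blocking network round trip
        return {"loss": float(len(batch)), "waited": wait}


class AsyncTrainingClientAdapter:
    """Exposes *_async() methods by running the sync calls in a worker thread."""

    def __init__(self, inner):
        self._inner = inner

    async def forward_backward_async(self, batch):
        # asyncio.to_thread runs the blocking call in the default thread pool,
        # so the event loop stays free for other coroutines.
        raw = await asyncio.to_thread(self._inner.forward_backward, batch, wait=True)
        # Wrap the dict so callers can use attribute access (result.loss).
        return SimpleNamespace(**raw)


async def main():
    adapter = AsyncTrainingClientAdapter(FakeSyncTrainingClient())
    return await adapter.forward_backward_async([1, 2, 3])


result = asyncio.run(main())
print(result.loss, result.waited)
```

Because the adapter only adds `*_async()` methods around the existing sync calls, code that already awaits those methods does not need to know which backend is underneath.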
Add Weaver configuration examples, install instructions, and acknowledgment links alongside the existing Tinker and MinT entries across all 13 README files. Also remove the rl-weaver optional dependency group from pyproject.toml: the Weaver SDK is installed separately, matching the MinT pattern. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The trainer stores `service_client` as a local variable, so the wrapper (and its inner Weaver ServiceClient) could be GC'd while the training and sampling clients still rely on its connection, causing "ServiceClient is not connected" errors. Fix: propagate the inner ServiceClient reference through the wrapper chain (_TrainingClientWrapper → _SamplingClientWrapper) so it stays alive. Also remove `__del__`, which could close the connection prematurely. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
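The keep-alive fix can be illustrated with a small, self-contained sketch. All class names here are hypothetical stand-ins rather than the PR's actual wrappers; the point is only that each wrapper holds a strong reference to the connection owner and hands it on to the next wrapper in the chain.

```python
import gc
import weakref


class FakeServiceClient:
    """Hypothetical stand-in for the object that owns the connection."""

    def __init__(self):
        self.connected = True


class SamplingClientWrapper:
    def __init__(self, service_client):
        # Strong reference: the connection owner lives as long as this wrapper.
        self._service_client = service_client


class TrainingClientWrapper:
    def __init__(self, service_client):
        self._service_client = service_client

    def create_sampling_client(self):
        # Propagate the same reference down the wrapper chain.
        return SamplingClientWrapper(self._service_client)


def build_training_client():
    service = FakeServiceClient()      # local variable, as in the trainer
    probe = weakref.ref(service)       # lets us observe when it is collected
    return TrainingClientWrapper(service), probe


trainer, probe = build_training_client()
sampler = trainer.create_sampling_client()

del trainer
gc.collect()
alive_via_sampler = probe() is not None   # sampler still holds the reference

del sampler
gc.collect()
released = probe() is None                # nothing references it any more
```

Dropping `__del__` fits the same reasoning: once lifetime is tied to ordinary references, an explicit `close()` is a more predictable place to tear the connection down than a finalizer.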
Weaver's importance_sampling loss function requires an explicit loss_mask field in loss_fn_inputs. Tinker rejects unknown keys, so MetaClaw's data_formatter omits it and encodes the information in the advantages (0.0 for masked positions). The compat layer now derives loss_mask from advantages before calling forward_backward. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
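A minimal sketch of that derivation follows, assuming `loss_fn_inputs` is a plain dict and `advantages` is a flat list of floats; the real compat layer presumably works on tensor-like data, and the helper names `derive_loss_mask` and `add_loss_mask` are hypothetical.

```python
def derive_loss_mask(advantages):
    """Return 1.0 where the advantage is non-zero and 0.0 where it is zero."""
    return [0.0 if a == 0.0 else 1.0 for a in advantages]


def add_loss_mask(loss_fn_inputs):
    """Return a copy of loss_fn_inputs with a loss_mask derived from advantages.

    An existing loss_mask is left untouched, and the caller's dict is not mutated.
    """
    if "loss_mask" in loss_fn_inputs:
        return loss_fn_inputs
    patched = dict(loss_fn_inputs)
    patched["loss_mask"] = derive_loss_mask(loss_fn_inputs["advantages"])
    return patched


inputs = {"advantages": [0.0, 0.5, -1.2, 0.0]}
patched = add_loss_mask(inputs)
print(patched["loss_mask"])
```

One consequence of this encoding is that a position whose advantage happens to be exactly 0.0 is treated as masked as well, since the mask is recovered from the advantages alone.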

What is Weaver?
Weaver is a cloud-based RL training platform by NexAGI that supports LoRA fine-tuning, similar in scope to Tinker and MinT. It provides synchronous training and sampling APIs with session-managed connections. Like MinT, the `nex-weaver` SDK is installed separately; MetaClaw does not bundle it as a default dependency.

Summary
- Add the Weaver SDK (nex-agi/weaver) as a third RL training backend alongside Tinker and MinT
- Add a `metaclaw/weaver_compat.py` adapter that wraps Weaver's synchronous API into the async `*_async()` interface expected by MetaClaw, so no changes are needed in trainer.py, api_server.py, data_formatter.py, or rollout.py
- Extend `sdk_backend.py` with auto-detection via the `WEAVER_API_KEY` / `WEAVER_BASE_URL` env vars or a URL containing "weaver"

Key adapter behaviors
- `EncodedTextChunk` → alias for Weaver's `ModelInputChunk`
- `TensorData.from_torch()` → delegates to Weaver's `from_array()`
- `*_async()` methods → `asyncio.to_thread(inner.sync_method, wait=True)`
- Responses wrapped in `SimpleNamespace` for attribute access (`response.sequences[0].tokens`)
- Loss functions `ppo` and `cispo` emit a warning log
- `ServiceClient` auto-calls `connect()` and provides `close()` for cleanup

Test plan
- `tests/test_sdk_backend.py` (7 existing + 5 new)
  - `test_resolve_sdk_backend_explicit_weaver`: explicit backend=weaver resolves correctly
  - `test_auto_detects_weaver_from_env`: WEAVER_API_KEY triggers auto-detection
  - `test_auto_detects_weaver_from_url`: URL containing "weaver" triggers auto-detection
  - `test_explicit_weaver_requires_sdk`: clear error when nex-weaver is not installed
  - `test_weaver_env_order`: WEAVER_* env vars have correct priority

🤖 Generated with Claude Code
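The backend auto-detection exercised by these tests could be sketched as follows. This is an illustration under stated assumptions, not the logic in `sdk_backend.py`: the function name `resolve_backend_sketch`, the precedence of env vars over the URL check, and the `"tinker"` fallback are all hypothetical.

```python
import os


def resolve_backend_sketch(explicit=None, base_url=None, env=None):
    """Pick a backend name from an explicit setting, env vars, or the URL.

    Assumed precedence (not confirmed by the PR): explicit > env vars > URL.
    """
    env = os.environ if env is None else env
    if explicit:
        return explicit.lower()
    # Either Weaver env var is treated as a signal to use the Weaver backend.
    if env.get("WEAVER_API_KEY") or env.get("WEAVER_BASE_URL"):
        return "weaver"
    # Fall back to a substring match on the service URL.
    if base_url and "weaver" in base_url.lower():
        return "weaver"
    return "tinker"  # assumed default, for illustration only


# Passing env explicitly keeps the examples independent of the real environment.
from_env = resolve_backend_sketch(env={"WEAVER_API_KEY": "dummy-key"})
from_url = resolve_backend_sketch(base_url="https://weaver.example.invalid/v1", env={})
fallback = resolve_backend_sketch(env={})
```

Accepting `env` as a parameter is a design choice for testability: each case can be checked with a plain dict instead of patching `os.environ`.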