[docs] add CRD references to README and workflow guides#155
Open
yuanzhi-zhu wants to merge 1 commit into X-GenGroup:main
CRD was merged in X-GenGroup#121 but the algorithm wasn't listed in the user-facing docs alongside the other algorithms. This commit:

- README.md: add CRD row to the supported-algorithms table; add CRD to the Algorithms doc-pointer description.
- examples/README.md: add 'crd' to the algorithm enum.
- guidance/workflow.md: add CRD rows to the trajectory-policy and optimization-strategy tables; add CRD to the decoupled-sampling and fresh-timestep prose bullets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Updates user-facing documentation to include the previously-merged CRD (Centered Reward Distillation) algorithm alongside existing trainers, ensuring the README, examples guide, and workflow guide are consistent with the codebase’s supported algorithms.
Changes:
- Add CRD to the top-level README “Supported Algorithms” table and the guidance document pointer description.
- Add `crd` to the `examples/` directory-structure “algorithm” enumeration.
- Add CRD to the workflow guide’s sampling/trajectory and optimization strategy tables + related explanatory bullets.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| README.md | Adds CRD to supported algorithms and to the Algorithms guidance description. |
| examples/README.md | Updates the documented examples/{algorithm}/... enum to include crd. |
| guidance/workflow.md | Documents CRD behavior in sampling/trajectory and optimization sections. |
| | AWM | awm | [Advantage Weighted Matching](https://arxiv.org/abs/2509.25050) | |
| | DGPO | dgpo | [DGPO](https://arxiv.org/abs/2510.08425) | |
| | GRPO-Guard | grpo-guard | [GRPO-Guard](https://arxiv.org/abs/2510.22319) | |
| | CRD | crd | [Centered Reward Distillation](https://arxiv.org/abs/2603.14128) ([Blog (chinese)](https://mp.weixin.qq.com/s/fpTi7PPi3APSNJQ2kXN3Dw))| |