llm-rlhf

Here are 2 public repositories matching this topic...

[NeurIPS 2025 Spotlight] LLM post-training suite — featuring ReasonFlux, ReasonFlux-PRM, and ReasonFlux-Coder.

Implements reinforcement learning training for LLMs such as GPT-2, LLaMA, and BLOOM.

lora, reward, trl, llm, rlhf, trlx, llm-rlhf
