viratzzs/alignment-without-rewards

TODO

  • eval scripts: mix the validation and test sets instead of keeping them separate
  • run evals on base model
  • OPD script
  • OPD run
  • GRPO script
  • GRPO run
  • run evals on rl model
  • OPSD script
  • OPSD run
