Skip to content

feat(train_reward_model): add chatml formatting and aggregation of more statistics#21

Open
maxreciprocate wants to merge 2 commits intomainfrom
update-reward-trainer
Open

feat(train_reward_model): add chatml formatting and aggregation of more statistics#21
maxreciprocate wants to merge 2 commits intomainfrom
update-reward-trainer

Conversation

@maxreciprocate
Copy link
Collaborator

No description provided.

for consistency, formatting has to happen either through tokenizer's
`apply_chat_format` or throught ahead of time formatting in the dataset
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant