Hi,
We are having trouble reproducing the DeepSeek-R1 traces. When we generate our own DeepSeek-R1 traces and train on them, performance is noticeably lower than when we train on the released traces from Hugging Face, even though the downstream setup is the same in both cases.
Could you share a bit more detail on how the DeepSeek-R1 traces were generated?
Specifically:
- decoding parameters (e.g., temperature, top-p, max tokens)
- exact model / variant
- provider or inference setup used
Any pointers would be really appreciated. Thanks again for the great work!