DeepSeek-R1 trace generation details

Hi,

We are having trouble reproducing the DeepSeek-R1 traces. When we generate our own DeepSeek-R1 traces and train on them, performance is noticeably lower than when we train using the released traces on Hugging Face, even with the same downstream setup.

Could you share a bit more detail on how the DeepSeek-R1 traces were generated?
Specifically:
- decoding parameters (e.g., temperature, top-p, max tokens)
- exact model / variant
- provider or inference setup used

Any pointers would be really appreciated. Thanks again for the great work!

DeepSeek-R1 trace generation details #130

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

DeepSeek-R1 trace generation details #130

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions