Question: temperature/top_p for training data generation

Hi team, thanks for open sourcing DFlash!

I have a question about the dataset generation pipeline for training. What temperature and top_p settings do you use when regenerating responses from the target/reference models for annotation or synthetic data creation? Is there an officially recommended value?

Are there best practices or trade-offs to consider when choosing sampling parameters, so that the distribution of the generated data stays well aligned with the supervised training objective?

(From the code and docs, I can see that temperature=0.0 is used for inference and benchmarking, but I could not find details for the training set generation stage. If this has already been discussed or documented, please point me to the relevant resource!)
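
To make the question concrete, here is a minimal sketch of how I understand the two knobs to reshape the distribution that responses are sampled from. This is plain Python for illustration only, not DFlash code: the helper name `sampling_distribution` and the toy logits are my own assumptions, and real inference engines may apply the filters in a different order.

```python
import math

def sampling_distribution(logits, temperature=1.0, top_p=1.0):
    """Return the token probabilities a sampler would draw from.

    Hypothetical helper for illustration only; not part of DFlash.
    """
    if temperature == 0.0:
        # Greedy decoding: all probability mass goes to the argmax token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Nucleus (top-p) filtering: keep the smallest set of most likely
    # tokens whose cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the kept tokens; everything else gets zero mass.
    kept_mass = sum(probs[i] for i in kept)
    return [probs[i] / kept_mass if i in kept else 0.0 for i in range(len(probs))]

# Toy logits over a four-token vocabulary (made-up values).
logits = [2.0, 1.0, 0.5, -1.0]
greedy = sampling_distribution(logits, temperature=0.0)
full = sampling_distribution(logits, temperature=1.0, top_p=1.0)
nucleus = sampling_distribution(logits, temperature=1.0, top_p=0.9)
```

With temperature=0.0 the regenerated data contains only the target model's single most likely continuation, while temperature=1.0 with top_p=1.0 samples from its full distribution, and top_p below 1.0 drops the low-probability tail. My question is which of these regimes the training data was generated under.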

Thanks in advance!

Question: temperature/top_p for training data generation #33

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Question: temperature/top_p for training data generation #33

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions