Open
Description
Hi, thanks for the nice work.
I'm trying to reproduce the paper's results, but I noticed that the hyperparameters provided in this repository (in the pretraining script and config.json) differ slightly from those in the paper (e.g., learning rate, gradient accumulation steps). Which version should be used to reproduce the paper's results, and which hyperparameters were used to train the checkpoint you provide?
Thanks for reading!