Thanks for your excellent work!!
I am fine-tuning on a single GPU, and the run outputs bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt and mp_rank_00_model_states.pt. How should I load these weights?
Can I simply replace the value of the command-line parameter --checkpoint checkpoints/harmon_1.5b.pth with mp_rank_00_model_states.pt?
I also tried converting the weights with zero_to_fp32.py, but the resulting state dict does not match the model structure on Hugging Face. How should I load it?
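A common cause of the mismatch after running zero_to_fp32.py is a wrapper prefix (e.g. added by DeepSpeed or DDP) on every key of the consolidated state dict. The sketch below is a minimal, hedged illustration of stripping such a prefix before loading; the "module." prefix, the checkpoint path, and the strict=False choice are assumptions, not confirmed behavior of this repo.

```python
# Hypothetical sketch: after converting the ZeRO shards offline with the
# real DeepSpeed helper script, e.g.
#   python zero_to_fp32.py checkpoints/ pytorch_model.bin
# remap the keys so they match the Hugging Face model definition.

def strip_prefix(state_dict, prefix="module."):
    """Remove an assumed wrapper prefix from every matching key."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

if __name__ == "__main__":
    # Dummy state dict standing in for torch.load("pytorch_model.bin");
    # values would normally be tensors.
    sd = {"module.encoder.weight": 0, "module.decoder.bias": 1}
    cleaned = strip_prefix(sd)
    print(sorted(cleaned))
    # With real weights one would then call, e.g.:
    #   model.load_state_dict(cleaned, strict=False)  # assumption: partial match is acceptable
```

Comparing sorted(cleaned.keys()) against sorted(model.state_dict().keys()) is a quick way to see exactly which names differ before deciding how to remap them.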