How to load the fine-tuned model #4

@NROwind

Description

Thanks for your excellent work!!

I am fine-tuning on a single GPU, and the training run outputs bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt and mp_rank_00_model_states.pt. How should I load the model weights?

Can I simply point the command-line parameter --checkpoint checkpoints/harmon_1.5b.pth at mp_rank_00_model_states.pt instead?
Alternatively, I used zero_to_fp32.py to convert the model weights, but the output does not match the model structure on Hugging Face. How should I load it?
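For reference, a minimal sketch of what extracting the weights could look like, assuming the standard DeepSpeed checkpoint layout: mp_rank_00_model_states.pt is a pickled dict whose "module" entry holds the model state_dict, while the *_optim_states.pt shards hold optimizer state and are not needed for inference. The tensor names, the "module." prefix handling, and the output filename here are illustrative assumptions, not the repo's confirmed format:

```python
import torch

# Stand-in for: ds_ckpt = torch.load("mp_rank_00_model_states.pt", map_location="cpu")
# (a toy dict mimicking the usual DeepSpeed model-states layout)
ds_ckpt = {
    "module": {
        "module.llm.embed.weight": torch.zeros(4, 8),  # toy tensor, hypothetical key
    },
    "optimizer": None,  # engine/optimizer bookkeeping, ignored for inference
}

# The model weights live under the "module" key.
state_dict = ds_ckpt["module"]

# Strip a leading "module." prefix left by the DeepSpeed/DDP wrapper so the
# keys match the plain module names the repo's loader expects (assumption).
state_dict = {k.removeprefix("module."): v for k, v in state_dict.items()}

# Save in a plain .pth so it can be passed to --checkpoint (assumption).
torch.save(state_dict, "harmon_finetuned.pth")
```

Whether the resulting keys line up with the Hugging Face module names would still need to be checked against the repo's own loading code.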
