How to load the fine-tuned model #4

@NROwind

Description

Thanks for your excellent work!!

I am fine-tuning on a single GPU, and the training run outputs bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt and mp_rank_00_model_states.pt. How should I load the model weights?

Can I simply point the command-line parameter --checkpoint checkpoints/harmon_1.5b.pth at mp_rank_00_model_states.pt instead?
Alternatively, I used zero_to_fp32.py to convert the model weights, but the output does not match the model structure on Hugging Face. How should I load it?
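For reference, a minimal sketch of what extracting the weights could look like, assuming the standard DeepSpeed checkpoint layout: mp_rank_00_model_states.pt is a pickled dict whose "module" entry holds the model state_dict, while the *_optim_states.pt shards hold optimizer state and are not needed for inference. The tensor names, the "module." prefix handling, and the output filename here are illustrative assumptions, not the repo's confirmed format:

```python
import torch

# Stand-in for: ds_ckpt = torch.load("mp_rank_00_model_states.pt", map_location="cpu")
# (a toy dict mimicking the usual DeepSpeed model-states layout)
ds_ckpt = {
    "module": {
        "module.llm.embed.weight": torch.zeros(4, 8),  # toy tensor, hypothetical key
    },
    "optimizer": None,  # engine/optimizer bookkeeping, ignored for inference
}

# The model weights live under the "module" key.
state_dict = ds_ckpt["module"]

# Strip a leading "module." prefix left by the DeepSpeed/DDP wrapper so the
# keys match the plain module names the repo's loader expects (assumption).
state_dict = {k.removeprefix("module."): v for k, v in state_dict.items()}

# Save in a plain .pth so it can be passed to --checkpoint (assumption).
torch.save(state_dict, "harmon_finetuned.pth")
```

Whether the resulting keys line up with the Hugging Face module names would still need to be checked against the repo's own loading code.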
