Script to push the trained models and tokenizer to HF #45

Merged
chandanms merged 4 commits into simple-stories:dev from chandanms:feature/push_model_to_hf
Sep 15, 2025

@chandanms
Collaborator

Description

Adds a script to push the trained models and their tokenizer to HF. The tokenizer is now saved during training, and the script converts that saved tokenizer to an HF-compatible format before pushing it.

Related Issue

Closes #42

Motivation and Context

Makes it easier to upload the model and tokenizer to HF. The accompanying tokenizer is saved during training, so its path no longer has to be supplied separately at upload time.
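
For illustration, the lookup of the saved tokenizer could be sketched as below. This is a minimal sketch and not the PR's actual code: it assumes the tokenizer is written as tokenizer.json directly inside the training output directory, and the TOKENIZER_FILENAME constant is a hypothetical name introduced here.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: training writes the tokenizer under this file name.
# TOKENIZER_FILENAME is a hypothetical constant, not taken from the PR.
TOKENIZER_FILENAME = "tokenizer.json"


def find_saved_tokenizer(output_dir: Path) -> Path | None:
    """Return the tokenizer file saved in the training output directory, or None."""
    candidate = Path(output_dir) / TOKENIZER_FILENAME
    # Only report the file if it actually exists; callers decide on any fallback.
    return candidate if candidate.is_file() else None
```

Returning None rather than raising leaves the fallback decision (for example, a default tokenizer) to the caller.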

How Has This Been Tested?

Added a test case covering conversion of the tokenizer to an HF-compatible format.

Does this PR introduce a breaking change?

No

@chandanms chandanms marked this pull request as ready for review August 17, 2025 18:48
@chandanms chandanms requested a review from danbraunai August 17, 2025 18:48
@chandanms chandanms changed the title Script to push the trained models and tokenizer along to HF Script to push the trained models and tokenizer to HF Aug 17, 2025
Collaborator

@danbraunai danbraunai left a comment

The changes look good, though I thought you'd want to update the push_to_hf.py script so that it also uploads all the nice information that you previously added manually, notably the README that you see at e.g. https://huggingface.co/SimpleStories/SimpleStories-1.25M.

You can do that in a new PR, though you probably want to just do it here.

"""Try to resolve a tokenizer file path from the final_config.yaml next to the checkpoint.
def find_saved_tokenizer(output_dir: Path) -> Path | None:
"""Find the saved tokenizer in the training output directory."""
# Look for tokenizer.json saved during training

unnecessary comment

return None

# As a last resort, if the file name matches a known tokenizer in the repo, use it
# If not found, return default tokenizer

Unnecessary

@chandanms chandanms merged commit 22012d1 into simple-stories:dev Sep 15, 2025
1 check passed

2 participants