AnonymousMorris/puny_llm
mini_llm

Toy project inspired by nanochat and the modded speedrun efforts, where people tried to speedrun training a GPT-2-level model with modern improvements.

This is just for my own learning.

Layout

  • mini_llm/config.py: model, training, sampling, and W&B config defaults
  • mini_llm/data.py: dataset bootstrap, tokenizer, and batch sampling
  • mini_llm/model.py: GPT blocks, attention, forward pass, generation
  • mini_llm/train.py: device selection, optimizer setup, loss estimation, training loop, and W&B logging
  • mini_llm/cli.py: runnable entrypoint that reads config defaults and trains/samples
  • main.py: thin entrypoint
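To make the batch-sampling role of mini_llm/data.py concrete, here is a minimal sketch of next-token-prediction batching over a flat token stream. The function name and signature are illustrative, not the repo's actual API, and the real implementation likely returns tensors rather than lists:

```python
import random

def get_batch(tokens, block_size, batch_size, rng=None):
    """Sample `batch_size` contiguous windows of `block_size` tokens.

    Inputs are the windows; targets are the same windows shifted right by
    one position, the standard next-token-prediction setup.
    (Hypothetical helper for illustration only.)
    """
    rng = rng or random.Random()
    xs, ys = [], []
    for _ in range(batch_size):
        # Pick a start index so the target window stays in bounds.
        i = rng.randrange(len(tokens) - block_size)
        xs.append(tokens[i : i + block_size])
        ys.append(tokens[i + 1 : i + block_size + 1])
    return xs, ys

tokens = list(range(1000))
x, y = get_batch(tokens, block_size=8, batch_size=4, rng=random.Random(0))
```

Each target row is the corresponding input row shifted by one token, so the model learns to predict position t+1 from positions up to t.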

Run

uv run python main.py

Configs and hyperparameters live in mini_llm/config.py.

For Weights & Biases logging, enable TRAIN_CONFIG.wandb.enabled and set the project, run name, or mode there as needed.
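As a rough sketch of what that config might look like, here is a dataclass-style layout with a nested wandb section. Only TRAIN_CONFIG, wandb.enabled, and the project/run-name/mode fields come from the text above; every other field name and default is an assumption:

```python
from dataclasses import dataclass, field

@dataclass
class WandbConfig:
    enabled: bool = False        # flip to True to turn on W&B logging
    project: str = "mini_llm"    # assumed default project name
    run_name: str = "baseline"   # assumed default run name
    mode: str = "online"         # e.g. "online", "offline", "disabled"

@dataclass
class TrainConfig:
    # Illustrative training knobs; the real defaults live in config.py.
    learning_rate: float = 3e-4
    max_iters: int = 5000
    wandb: WandbConfig = field(default_factory=WandbConfig)

TRAIN_CONFIG = TrainConfig()
TRAIN_CONFIG.wandb.enabled = True  # enable logging as described above
```

With this shape, the training loop can simply check TRAIN_CONFIG.wandb.enabled before initializing a W&B run.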

About

Learning project for training an LLM.
