From bca1c2167b5dfebd95e3530ec9bcd50d9ff693a9 Mon Sep 17 00:00:00 2001
From: chandanms
Date: Fri, 15 Aug 2025 22:27:30 +0200
Subject: [PATCH] Accidentally added the README to the wrong folder. Fixed now

---
 README.md                      | 22 ++++++-------
 simple_stories_train/README.md | 58 ----------------------------------
 2 files changed, 11 insertions(+), 69 deletions(-)
 delete mode 100644 simple_stories_train/README.md

diff --git a/README.md b/README.md
index cda30a6..24a4919 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,9 @@
 # simple_stories_train
 
-Project for training small LMs. Designed for training on SimpleStories, an extension of
-[TinyStories](https://arxiv.org/abs/2305.07759).
+Training framework for small language models using SimpleStories, a large-scale synthetic dataset of over 2 million short stories in simple language.
 
-
-- Training script is based on the efficeint [train_gpt2.py](https://github.com/karpathy/llm.c/blob/master/train_gpt2.py) in [llm.c](https://github.com/karpathy/llm.c) (licensed
-  under MIT ((c) 2024 Andrei Karpathy))
-- Some model architecture implementations are based on
-  [TransformerLens](https://github.com/TransformerLensOrg/TransformerLens) (licensed under
-  MIT ((c) 2022 TransformerLensOrg)).
+**Paper:** [Parameterized Synthetic Text Generation with SimpleStories](https://arxiv.org/abs/2504.09184)
+**Models & Dataset:** [🤗 SimpleStories on Hugging Face](https://huggingface.co/SimpleStories)
 
 ## Installation
 
@@ -37,8 +32,8 @@ make test-all # Run all tests
 ## Usage
 
 ### Training a model
-```
-python train_llama.py [PATH/TO/CONFIG.yaml] [--key1 value1 --key2 value2 ...]
+```bash
+python -m simple_stories_train.train [PATH/TO/CONFIG.yaml] [--key1 value1 --key2 value2 ...]
 ```
 where
 - `PATH/TO/CONFIG.yaml` contains the training config. If no path is provided, a default config will be used.
@@ -49,10 +44,15 @@ If running on CPU, you may need to set `--compile=False`.
 
 To run on multiple GPUs, use
 ```
-torchrun --standalone --nproc_per_node=N train_llama.py ...
+torchrun --standalone --nproc_per_node=N -m simple_stories_train.train ...
 ```
 where `N` is the number of GPUs to use.
 
 ### Logging with Weights & Biases
 To track training with Weights & Biases, you can set the WANDB_PROJECT and WANDB_API_KEY variables in
 `.env`. API keys can be obtained from your [Weights & Biases account settings](https://wandb.ai/settings).
+
+## Acknowledgments
+
+- Training script is based on the efficient [train_gpt2.py](https://github.com/karpathy/llm.c/blob/master/train_gpt2.py) in [llm.c](https://github.com/karpathy/llm.c) (licensed under MIT ((c) 2024 Andrej Karpathy))
+- Some model architecture implementations are based on [TransformerLens](https://github.com/TransformerLensOrg/TransformerLens) (licensed under MIT ((c) 2022 TransformerLensOrg))
diff --git a/simple_stories_train/README.md b/simple_stories_train/README.md
deleted file mode 100644
index 24a4919..0000000
--- a/simple_stories_train/README.md
+++ /dev/null
@@ -1,58 +0,0 @@
-# simple_stories_train
-
-Training framework for small language models using SimpleStories, a large-scale synthetic dataset of over 2 million short stories in simple language.
-
-**Paper:** [Parameterized Synthetic Text Generation with SimpleStories](https://arxiv.org/abs/2504.09184)
-**Models & Dataset:** [🤗 SimpleStories on Hugging Face](https://huggingface.co/SimpleStories)
-
-## Installation
-
-From the root of the repository, run one of
-
-```bash
-make install-dev # To install the package, dev requirements and pre-commit hooks
-make install # To just install the package (runs `pip install -e .`)
-```
-
-## Development
-
-Suggested extensions and settings for VSCode are provided in `.vscode/`. To use the suggested
-settings, copy `.vscode/settings-example.json` to `.vscode/settings.json`.
-
-There are various `make` commands that may be helpful
-
-```bash
-make check # Run pre-commit on all files (i.e. pyright, ruff linter, and ruff formatter)
-make type # Run pyright on all files
-make format # Run ruff linter and formatter on all files
-make test # Run tests that aren't marked `slow`
-make test-all # Run all tests
-```
-
-## Usage
-
-### Training a model
-```bash
-python -m simple_stories_train.train [PATH/TO/CONFIG.yaml] [--key1 value1 --key2 value2 ...]
-```
-where
-- `PATH/TO/CONFIG.yaml` contains the training config. If no path is provided, a default config will be used.
-- `--key1 value1 --key2 value2 ...` override values in the config. Note that if you wish to update a
-  nested value, you must use dotted notation (e.g. `--train_dataset_config.name my_dataset`).
-
-If running on CPU, you may need to set `--compile=False`.
-
-To run on multiple GPUs, use
-```
-torchrun --standalone --nproc_per_node=N -m simple_stories_train.train ...
-```
-where `N` is the number of GPUs to use.
-
-### Logging with Weights & Biases
-To track training with Weights & Biases, you can set the WANDB_PROJECT and WANDB_API_KEY variables in
-`.env`. API keys can be obtained from your [Weights & Biases account settings](https://wandb.ai/settings).
-
-## Acknowledgments
-
-- Training script is based on the efficient [train_gpt2.py](https://github.com/karpathy/llm.c/blob/master/train_gpt2.py) in [llm.c](https://github.com/karpathy/llm.c) (licensed under MIT ((c) 2024 Andrej Karpathy))
-- Some model architecture implementations are based on [TransformerLens](https://github.com/TransformerLensOrg/TransformerLens) (licensed under MIT ((c) 2022 TransformerLensOrg))