Letter-Forge symbolizes both the ancient craft of shaping written language
and the modern act of building models that learn linguistic structure from scratch.
A forge where letters learn to think.
Letter-Forge is a from-scratch implementation of Transformer architectures for character-level learning and language modeling.
It explores how attention, memory, and positional structure can emerge from simple sequences of letters, transforming raw symbols into learned meaning.
Letter-Forge is built to craft language understanding from the ground up.
It begins with a minimalist Transformer Encoder that learns counting and pattern recognition at the character level,
and extends to a full Transformer Language Model capable of predicting and generating text sequences.
Each component, from self-attention to positional encoding, is implemented manually to illustrate the inner mechanics of modern deep learning models.
| Component | Description |
|---|---|
| Custom Transformer Encoder | Built from first principles using PyTorch layers (Linear, Softmax, ReLU) — no off-the-shelf Transformer modules. |
| Self-Attention Mechanism | Implements single-head attention using learned queries, keys, and values; visualizes attention maps between character positions (see the sketch after this table). |
| Positional Encoding | Supports both learned and sinusoidal positional embeddings to inject order awareness into the model. |
| Transformer Language Model (LM) | Extends the encoder into a causal language model predicting the next character given context. |
| Visualization & Analysis | Generates heatmaps showing how the model “looks back” over previous symbols while learning structure. |
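The sketch below shows, in miniature, how these pieces fit together: a sinusoidal positional encoding and a single attention head built only from `Linear` layers and a softmax, with an optional causal mask for the language-modeling variant. It is an illustrative sketch, not the repository's exact code; the class names, the unbatched `[seq_len, d_model]` tensor layout, and the even `d_model` are assumptions.

```python
import math
import torch
import torch.nn as nn

class SinusoidalPositionalEncoding(nn.Module):
    """Fixed sin/cos encoding added to embeddings to inject order.
    Assumes an even d_model."""
    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        pos = torch.arange(max_len).unsqueeze(1).float()
        div = torch.exp(torch.arange(0, d_model, 2).float()
                        * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("pe", pe)

    def forward(self, x):                    # x: [seq_len, d_model]
        return x + self.pe[: x.size(0)]

class SingleHeadAttention(nn.Module):
    """One attention head from plain Linear layers; no nn.Transformer."""
    def __init__(self, d_model: int, causal: bool = False):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.causal = causal                 # True for the LM in part 2

    def forward(self, x):                    # x: [seq_len, d_model]
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
        if self.causal:                      # hide future positions
            future = torch.triu(torch.ones_like(scores), diagonal=1).bool()
            scores = scores.masked_fill(future, float("-inf"))
        weights = torch.softmax(scores, dim=-1)  # each row sums to 1
        return weights @ v, weights          # weights are what gets plotted
```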
```
letter-forge/
│
├── part-1_encoder/                  # Transformer Encoder (character-level)
│   ├── data/
│   │   ├── lettercounting-train.txt
│   │   └── lettercounting-dev.txt
│   ├── letter_counting.py           # Driver script
│   ├── transformer.py               # Core encoder + attention implementation
│   └── utils.py                     # Indexer & helper utilities
│
├── part-2_lm/                       # Transformer Language Model
│   ├── data/
│   │   ├── text8-100k.txt
│   │   ├── text8-dev.txt
│   │   └── text8-test.txt
│   ├── lm.py                        # Driver & evaluation (perplexity, sanity checks)
│   ├── transformer_lm.py            # LM model + training loop
│   └── utils.py
│
├── sandbox_utils/                   # Development & testing scripts
│   ├── data_pipeline_verifier.py
│   ├── attention_pe_module_test.py
│   ├── attention_validation_suite.py
│   └── repro_training_logger.py
│
├── artifacts/                       # Saved models & metadata
├── plots/                           # Attention heatmaps & visual outputs
└── README.md
```
```bash
# Clone the repository
git clone https://github.com/<your-username>/letter-forge.git
cd letter-forge

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate   # or .\venv\Scripts\activate on Windows

# Install dependencies
pip install torch numpy matplotlib
```

Train and test the character-level counting model:
```bash
cd part-1_encoder
python letter_counting.py
```

- Predicts how many times each letter has appeared before in a sequence (see the label sketch below).
- Visualizes attention maps showing how the model “looks back” over earlier tokens.
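For reference, here is one way the counting labels could be derived. This is a sketch of the task, not the repository's exact preprocessing; in particular, the cap on counts is an assumption, chosen so the labels form a small fixed set of classes.

```python
from collections import Counter

def prior_occurrence_counts(text: str, cap: int = 2) -> list[int]:
    """Label each position with how many times its character has
    already appeared earlier in the string (the cap is an assumption)."""
    seen: Counter = Counter()
    labels = []
    for ch in text:
        labels.append(min(seen[ch], cap))
        seen[ch] += 1
    return labels

print(prior_occurrence_counts("aabab"))  # -> [0, 1, 0, 2, 1]
```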
Sample Attention Visualization:
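The saved heatmaps under `plots/` are produced along these lines; a minimal matplotlib sketch, where the function name and save path are illustrative:

```python
import matplotlib.pyplot as plt
import torch

def plot_attention(weights: torch.Tensor, chars: str, path: str = "plots/attn.png"):
    """Render a [seq_len, seq_len] attention map: rows are queries,
    columns are the positions each query attends to."""
    fig, ax = plt.subplots()
    ax.imshow(weights.detach().cpu().numpy(), cmap="viridis")
    ax.set_xticks(range(len(chars)))
    ax.set_xticklabels(list(chars))
    ax.set_yticks(range(len(chars)))
    ax.set_yticklabels(list(chars))
    ax.set_xlabel("attended position (key)")
    ax.set_ylabel("query position")
    fig.savefig(path, bbox_inches="tight")
```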
Train a Transformer to predict the next character in a text sequence:
```bash
cd part-2_lm
python lm.py --model NEURAL
```

- Learns from the text8 dataset (100k character subset).
- Evaluates on perplexity and token-level likelihood (see the sketch below).
- Produces valid probability distributions for every step.
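Perplexity here is the exponential of the mean negative log-likelihood per character, and a "valid probability distribution" means each softmax row sums to 1. A minimal sketch of both checks, assuming the model has already produced next-character logits (the function names are illustrative, not `lm.py`'s actual interface):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def perplexity(logits: torch.Tensor, targets: torch.Tensor) -> float:
    """logits: [seq_len, vocab_size]; targets: [seq_len] of next-char ids.
    Perplexity = exp(mean negative log-likelihood per character)."""
    return float(torch.exp(F.cross_entropy(logits, targets)))

def is_valid_distribution(logits: torch.Tensor) -> bool:
    """Sanity check: every softmax row should sum to 1."""
    probs = F.softmax(logits, dim=-1)
    return torch.allclose(probs.sum(dim=-1), torch.ones(probs.size(0)))
```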
| Model | Task | Metric | Result |
|---|---|---|---|
| Transformer Encoder | Character counting (BEFORE) | Accuracy | 98.3% |
| Transformer Encoder | Character counting (BEFOREAFTER) | Accuracy | 97–99% (tuned) |
| Transformer LM | Next-char prediction (text8) | Perplexity | ≤ 7 (target) |
| Attention Visualization | Pattern detection | Qualitative | Highlights same-character attention clusters |
The model successfully learns to identify prior occurrences of letters and extends this ability to generate context-aware text sequences.
- Self-Attention = Contextual Memory: Each token attends to relevant predecessors, forming a learned memory of prior occurrences.
- Positional Encoding Enables Order Awareness: Without positional information, the model treats input as a bag of symbols; with it, order emerges (a quick check follows this list).
- From Counting to Composition: The same underlying structure that counts characters can generate language — showing the continuum from perception to composition.
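The "bag of symbols" point can be checked directly: without positional encoding, self-attention is permutation-equivariant, so shuffling the input merely shuffles the output. A quick test, reusing the `SingleHeadAttention` sketch from earlier:

```python
import torch

attn = SingleHeadAttention(d_model=16)   # sketch class from the table above
x = torch.randn(7, 16)                   # 7 characters, no positions added
perm = torch.randperm(7)
out, _ = attn(x)
out_shuffled, _ = attn(x[perm])
# Outputs are identical up to the same shuffle: order never mattered.
assert torch.allclose(out[perm], out_shuffled, atol=1e-6)
```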
- Checkpoints: `artifacts/model_*.pt` (see the loading sketch below)
- Metadata: `artifacts/run_meta.json`
- Plots: `plots/*.png` (attention heatmaps, loss curves, etc.)
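Loading a saved run might look like the sketch below; the module import, constructor arguments, metadata keys, and checkpoint filename are hypothetical and should be adapted to the actual files under `artifacts/`.

```python
import json
import torch
from transformer_lm import TransformerLM     # hypothetical import

with open("artifacts/run_meta.json") as f:
    meta = json.load(f)                       # hyperparameters of the run

model = TransformerLM(**meta["model_args"])   # hypothetical metadata key
state = torch.load("artifacts/model_lm.pt", map_location="cpu")  # hypothetical filename
model.load_state_dict(state)                  # assumes a saved state_dict
model.eval()
```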
| Script | Purpose |
|---|---|
| `data_pipeline_verifier.py` | Validates dataset shapes and the preprocessing pipeline. |
| `attention_pe_module_test.py` | Unit-tests the positional encoding and attention modules. |
| `attention_validation_suite.py` | Verifies attention masks and tensor consistency. |
| `repro_training_logger.py` | Reproducible 3-epoch experiment logger (saves artifacts). |
Letter-Forge is built on a simple idea:
that language understanding isn’t magic: it’s forged through repeated interaction between memory, order, and meaning.
By crafting each layer manually, we can see how modern intelligence emerges, one letter at a time.
d-senyaka
AI & Data Science Developer · Deep Learning Enthusiast · Language Technology Researcher
This project is released under the MIT License.
You are free to use, modify, and distribute it with attribution.
- Inspired by open research in attention mechanisms and neural sequence modeling.
- Crafted with curiosity, patience, and an appreciation for both language and logic.
“To forge a mind of letters is to understand the art of attention.”

