LLM_from_Scratch

This repository documents my journey of building a Large Language Model (LLM) from scratch.

Daily Progress

Day 1: Understanding LLMs & Revisiting Fundamentals

Day 2: Tokenization & Preprocessing

Day 3: Input-Target Pairs & Token Embedding

Day 4: Basics of Attention Mechanism

  • Read Attention Is All You Need
  • Explored simplified, self-, causal, and multi-head attention, and why RNNs fail (a sketch of scaled dot-product attention follows this list)
  • Traced the history of RNNs, LSTMs, and Transformers
  • Learned about Bahdanau attention
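
A minimal sketch of the scaled dot-product attention from Attention Is All You Need, for reference. This is an illustrative PyTorch snippet, not code from this repo's notebooks:

```python
import torch

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d)
    d = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / d ** 0.5  # pairwise similarity, scaled by sqrt(d)
    weights = torch.softmax(scores, dim=-1)      # each row of attention weights sums to 1
    return weights @ v                           # weighted sum of value vectors
```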

Day 5: Attention Mechanism from Scratch

  • Implemented a simplified attention mechanism with non-trainable weights from scratch → SimplifiedAttention.ipynb

  • Implemented a self-attention mechanism with trainable query, key, and value weight matrices from scratch → SelfAttention.ipynb

  • Implemented a causal attention mechanism with dropout from scratch (see the sketch after this list) → CasualAttention.ipynb
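
A compact sketch of the trainable self-attention and causal-attention pieces above, assuming PyTorch; the class and parameter names here are illustrative and may differ from the notebook code:

```python
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    """Single-head self-attention with trainable Q/K/V weights, a causal mask, and dropout."""
    def __init__(self, d_in, d_out, context_length, dropout=0.1):
        super().__init__()
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)
        self.dropout = nn.Dropout(dropout)
        # upper-triangular mask blocks attention to future tokens
        self.register_buffer(
            "mask", torch.triu(torch.ones(context_length, context_length), diagonal=1)
        )

    def forward(self, x):  # x: (batch, num_tokens, d_in)
        num_tokens = x.shape[1]
        q, k, v = self.W_query(x), self.W_key(x), self.W_value(x)
        scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5
        scores = scores.masked_fill(
            self.mask.bool()[:num_tokens, :num_tokens], float("-inf")
        )
        weights = self.dropout(torch.softmax(scores, dim=-1))
        return weights @ v
```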

Day 6: Multihead Attention Mechanism from Scratch

  • Implemented a multi-head attention mechanism from scratch using a simple implementation → Multihead.ipynb

  • Implemented a multi-head attention mechanism from scratch with weight splits in a single class (no wrapper class); see the sketch below → Multihead.ipynb
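
A sketch of the single-class, weight-split variant, again assuming PyTorch with illustrative names rather than the exact notebook code: queries, keys, and values are projected once and then split across heads.

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Multi-head causal attention in one class: Q/K/V are projected once, then split into heads."""
    def __init__(self, d_in, d_out, context_length, num_heads, dropout=0.1):
        super().__init__()
        assert d_out % num_heads == 0, "d_out must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = d_out // num_heads
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)
        self.out_proj = nn.Linear(d_out, d_out)  # recombines the heads' outputs
        self.dropout = nn.Dropout(dropout)
        self.register_buffer(
            "mask", torch.triu(torch.ones(context_length, context_length), diagonal=1)
        )

    def forward(self, x):  # x: (batch, num_tokens, d_in)
        b, num_tokens, _ = x.shape

        def split(t):  # (b, n, d_out) -> (b, num_heads, n, head_dim)
            return t.view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)

        q, k, v = split(self.W_query(x)), split(self.W_key(x)), split(self.W_value(x))
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        scores = scores.masked_fill(
            self.mask.bool()[:num_tokens, :num_tokens], float("-inf")
        )
        weights = self.dropout(torch.softmax(scores, dim=-1))
        context = (weights @ v).transpose(1, 2).reshape(b, num_tokens, -1)  # merge heads
        return self.out_proj(context)
```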

Day 7: GPT-2 Core Components

Day 8: Transformer Block & Training

Day 9: Evaluation and Sampling

Day 10: Weight Loading

Day 11: Classification Fine-Tuning

Day 12: Instruction Fine-Tuning
