Welcome to my GitHub repository, where I've embarked on a journey to deepen my understanding of transformers, specifically the architecture introduced in the groundbreaking paper "Attention Is All You Need". By taking a "learn by doing" approach, I've constructed a transformer from scratch, which has allowed me to thoroughly grasp concepts such as encoders, decoders, and the intricate mechanics of the attention mechanism.
This hands-on project has been immensely beneficial in solidifying my knowledge and skills pertaining to the transformer architecture.
This section outlines the file structure of the program and provides descriptions for each component of the repository:
- `config.py`: Contains global parameters that configure various aspects of the transformer model.
- `data/`: This directory holds the raw datasets used to train and evaluate the transformer model.
- `data_cleansing.ipynb`: A Jupyter notebook for performing data cleansing operations to prepare the data for model training.
- `data_loader.py`: The dataset loader, responsible for loading data and preparing it in a format suitable for the transformer.
- `dataset_splitter.py`: Utilized for splitting the dataset into training, validation, and test sets in a systematic manner.
- `model.py`: Includes all the components of the transformer model, such as layers, attention mechanisms, and connection blocks (a minimal attention sketch follows this list).
- `pltrain.py`: This script is used for training the transformer model. It harnesses the power of PyTorch Lightning to streamline the training process.
- `predict.ipynb`: A Jupyter notebook used to run predictions with the trained model. It demonstrates the model's ability to generate outputs given new inputs.
- `requirements.txt`: Lists all the Python dependencies required to run the model. Ensure you install these packages before trying to run the model.
- `tokenizer.py`: Responsible for loading a tokenizer that is used to convert text into tokens which the model can understand.
- `tokenizer_trainer.py`: This script is used for training a tokenizer on your dataset, specifically using Byte Pair Encoding (BPE) for token splitting (a tokenizer-training sketch follows this list).
- `utils.py`: Defines various utility functions that are used across the repository's scripts and notebooks.
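For reference, here is a minimal sketch of the scaled dot-product attention at the heart of the attention mechanisms in `model.py`, following the formula from the paper; the function name and signature are illustrative, not taken from the repo:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, as in the paper.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions (e.g. padding, or future tokens in the decoder)
        # are set to -inf so they receive zero attention weight after softmax.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = scores.softmax(dim=-1)
    return weights @ v, weights
```

Multi-head attention then applies this operation in parallel over several learned linear projections of the queries, keys, and values, and concatenates the results.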
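And here is one common way to train a BPE tokenizer with the Hugging Face `tokenizers` library, which `tokenizer_trainer.py` may resemble; the corpus path and special tokens below are assumptions, so adjust them to your dataset:

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Build an empty BPE tokenizer and split on whitespace before learning merges.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# The special tokens are assumptions; match whatever the training code expects.
trainer = BpeTrainer(special_tokens=["[UNK]", "[PAD]", "[SOS]", "[EOS]"])

# "data/corpus.txt" is a placeholder path into the data/ directory.
tokenizer.train(files=["data/corpus.txt"], trainer=trainer)
tokenizer.save("tokenizer.json")
```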
To get started with this transformer model:
- Clone the repository to your local machine.
- Install the dependencies listed in `requirements.txt`.
- Prepare your dataset and place it in the `data/` directory.
- Train your tokenizer with `tokenizer_trainer.py`.
- Execute `pltrain.py` to start the training process.
- Use `predict.ipynb` to validate the performance of your trained model (a greedy-decoding sketch follows this list).
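If you want to script predictions outside the notebook, a minimal greedy-decoding loop looks roughly like the following; the `encode`/`decode` method names are hypothetical stand-ins for whatever interface `model.py` actually exposes:

```python
import torch

@torch.no_grad()
def greedy_decode(model, src, sos_id, eos_id, max_len=128):
    """Autoregressively decode by always taking the most likely next token."""
    memory = model.encode(src)  # hypothetical encoder entry point
    ys = torch.full((src.size(0), 1), sos_id, dtype=torch.long, device=src.device)
    for _ in range(max_len):
        logits = model.decode(ys, memory)  # hypothetical decoder entry point
        next_token = logits[:, -1].argmax(dim=-1, keepdim=True)
        ys = torch.cat([ys, next_token], dim=1)
        if (next_token == eos_id).all():  # stop once every sequence emits EOS
            break
    return ys
```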
Happy coding and learning! 😊