This project implements a simple Transformer model using PyTorch for language translation.
Features:
• Encoder–decoder Transformer with multi-head attention.
• Positional embeddings and masking (padding + autoregressive).
• Synthetic dataset with digits and letters.
• Training loop and greedy decoding prediction.
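As a rough sketch of the padding and autoregressive masking mentioned above (function name, shapes, and the padding id are illustrative assumptions, not the project's actual code):

```python
import torch

PAD = 0  # assumed padding-token id


def make_masks(src, tgt):
    """Build a padding mask for the source and a causal mask for the target.

    src: (batch, src_len) token ids; tgt: (batch, tgt_len) token ids.
    """
    # True where the source token is padding and should be ignored
    src_pad_mask = src == PAD  # (batch, src_len)
    tgt_len = tgt.size(1)
    # Upper-triangular True entries block attention to future positions
    causal_mask = torch.triu(
        torch.ones(tgt_len, tgt_len, dtype=torch.bool), diagonal=1
    )  # (tgt_len, tgt_len)
    return src_pad_mask, causal_mask


src = torch.tensor([[5, 7, 0, 0]])  # last two positions are padding
tgt = torch.tensor([[1, 9, 4]])
pad_mask, causal = make_masks(src, tgt)
```

This mask convention (boolean, True = masked out) matches what `torch.nn.Transformer` expects for its `src_key_padding_mask` and `tgt_mask` arguments.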
Install dependencies:
pip install torch numpy
Train the model:
python main.py
Make predictions:
Greedy-decoding prediction runs in main.py after training.
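The greedy-decoding step can be sketched as below; the `model` interface, the special-token ids, and the toy stand-in module are hypothetical, not the actual main.py code:

```python
import torch

BOS, EOS = 1, 2  # assumed special-token ids


def greedy_decode(model, src, max_len=20):
    """Autoregressively pick the highest-probability next token at each step."""
    ys = torch.tensor([[BOS]])  # start the target with the BOS token
    for _ in range(max_len - 1):
        logits = model(src, ys)  # assumed shape: (1, tgt_len, vocab)
        next_tok = logits[0, -1].argmax().item()
        ys = torch.cat([ys, torch.tensor([[next_tok]])], dim=1)
        if next_tok == EOS:  # stop once EOS is produced
            break
    return ys.squeeze(0).tolist()


# Toy "model" that always predicts EOS, just to exercise the loop
class AlwaysEOS(torch.nn.Module):
    def forward(self, src, tgt):
        logits = torch.zeros(1, tgt.size(1), 5)
        logits[..., EOS] = 1.0
        return logits


out = greedy_decode(AlwaysEOS(), torch.tensor([[3, 4]]))
```

Greedy decoding simply takes the argmax at every step; beam search would keep the top-k candidate sequences instead, at higher cost.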
Possible extensions:
• Support larger vocabularies and longer sequences.
• Experiment with more Transformer layers and heads.
• Integrate with real datasets for practical tasks.