This repository contains a Neural Machine Translation (NMT) project implemented in Python. The project includes a Jupyter Notebook (NMTranslation.ipynb) that demonstrates the end-to-end process of building a French-to-English translation model using sequence-to-sequence modeling techniques. The model includes Luong attention and scaled dot-product attention mechanisms.
DATASET INFO
- Dataset: Tatoeba French-to-English sentence pairs
- Number of sentence pairs: 400,000+
config.py - Centralized configuration for hyperparameters, file paths, and device settings
train.py - Main script that loads data, builds tokenizers, trains both models, and saves them
inference.py - Provides translation functions and loads saved models for inference
eval.py - Evaluates model performance with BLEU scores for the seq2seq and attention models
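As a rough illustration of what config.py centralizes, the sketch below shows the kind of hyperparameters, paths, and device selection a file like this typically holds. All names and values here are hypothetical, not the repository's actual settings.

```python
# Hypothetical config sketch; the real config.py may use different names/values.
EMBEDDING_DIM = 256
HIDDEN_DIM = 512
BATCH_SIZE = 64
NUM_EPOCHS = 10
LEARNING_RATE = 1e-3

# Illustrative file paths for the dataset and saved checkpoints
DATA_PATH = "data/fra-eng.txt"
MODEL_DIR = "saved_models/"

# Prefer GPU when PyTorch is installed and CUDA is available; otherwise use CPU
try:
    import torch
    DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    DEVICE = "cpu"
```

Keeping these in one module lets train.py, inference.py, and eval.py import the same values instead of duplicating them.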
Models Directory
attention.py - Implements the Luong attention mechanism for the attention-based decoder
encoder.py - Contains both EncoderNoAttention and EncoderWithAttention classes
decoder.py - Contains both DecoderNoAttention and DecoderWithAttention classes
seq2seq.py - Wraps encoder-decoder pairs into complete Seq2Seq models
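To make the attention mechanism concrete, here is a minimal NumPy sketch of Luong-style (dot) attention: the decoder state is scored against each encoder state, the scores are softmax-normalized, and the weights form a context vector. The repository's attention.py works on batched PyTorch tensors and may use a different scoring variant; this is only an illustration of the idea.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def luong_dot_attention(decoder_hidden, encoder_outputs):
    """Luong-style (dot) attention sketch.

    decoder_hidden:  (hidden_dim,)         current decoder state
    encoder_outputs: (src_len, hidden_dim) one encoder state per source token
    Returns the context vector and the attention weights.
    """
    # Alignment scores: dot product of decoder state with each encoder state
    scores = encoder_outputs @ decoder_hidden   # (src_len,)
    weights = softmax(scores)                   # (src_len,), sums to 1
    context = weights @ encoder_outputs         # (hidden_dim,)
    return context, weights

# Toy example: 3 source positions, hidden size 4; the decoder state is
# strongly aligned with source position 0, so most weight lands there.
enc = np.eye(3, 4)
dec = np.array([10.0, 0.0, 0.0, 0.0])
context, weights = luong_dot_attention(dec, enc)
```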
Utils Directory
tokenizer.py - Custom tokenizer class for converting text to sequences and vice versa
preprocessing.py - Functions for Unicode normalization, sentence preprocessing, and dataset processing
dataset.py - Masked cross-entropy loss and DataLoader creation utilities
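The masked cross-entropy loss in dataset.py exists because target sentences are padded to a common length, and the padding tokens should not contribute to the loss. A minimal NumPy sketch of the idea (the actual function in dataset.py operates on PyTorch tensors and its signature may differ):

```python
import numpy as np

def masked_cross_entropy(log_probs, targets, pad_id=0):
    """Cross-entropy averaged only over non-padding target positions.

    log_probs: (seq_len, vocab_size) log-probabilities from the decoder
    targets:   (seq_len,)            gold token ids; pad_id marks padding
    """
    mask = targets != pad_id
    # Negative log-likelihood of each gold token
    nll = -log_probs[np.arange(len(targets)), targets]
    # Average over real tokens only, so padding does not dilute the loss
    return (nll * mask).sum() / mask.sum()

# Toy example: vocab of 3, uniform predictions, last position is padding.
log_probs = np.log(np.full((3, 3), 1.0 / 3.0))
targets = np.array([1, 2, 0])  # token id 0 is the pad token here
loss = masked_cross_entropy(log_probs, targets)
```

With uniform predictions the per-token loss is log(3), and masking ensures the padded third position does not change the average.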