This project demonstrates the use of deep learning models to recognize American Sign Language (ASL) letters. It was created as a cumulative project for the Deep Learning with PyTorch course at Fanshawe College, showcasing the application of advanced neural network architectures in solving real-world problems.
- Model Comparison: Compare ResNet18, ResNet50, and a custom convolutional neural network (CNN).
- Real-Time Predictions: Get predictions for ASL letters with confidence scores displayed as a bar chart.
- Custom Image Upload: Test the models with your own ASL letter images.
- Dataset Visualization: View samples from the Sign Language MNIST dataset.
- Interactive Dashboard: A user-friendly Gradio interface for seamless interaction.
Ensure you have the following installed:
- Python 3.8 or later
- Required libraries: `torch`, `torchvision`, `gradio`, `pandas`, `numpy`, `plotly`, `Pillow`
Clone the repository and install the dependencies using pip:

```bash
git clone https://github.com/yourusername/sign-language-recognition.git
cd sign-language-recognition
pip install -r requirements.txt
```
1. Download the Dataset
   - Download the Sign Language MNIST dataset.
   - Extract the dataset and place the CSV files (`sign_mnist_train.csv`, `sign_mnist_test.csv`) in the `Extracted_SignLanguageMNIST` folder.
2. Train the Models
   - If you haven't already trained the models, use the training script provided in the repository to train and save the models (ResNet18, ResNet50, and the custom CNN).
3. Launch the Dashboard
   - Run the `app_signlanageMNIST.ipynb` notebook to launch the dashboard (a minimal sketch of the launch code follows these steps).
4. Access the Dashboard
   - Open the Gradio app in your browser (usually at `http://127.0.0.1:7860/`).
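The notebook's exact launch code isn't reproduced here; the sketch below shows a comparable minimal Gradio interface. The `classify` stub and its random confidences are placeholders (real inference would run the selected PyTorch model on the uploaded image):

```python
# Minimal sketch of a Gradio dashboard launch; the predictor is a stub,
# not the notebook's actual inference code.
import gradio as gr
import numpy as np

LETTERS = list("ABCDEFGHIKLMNOPQRSTUVWXY")  # 24 static letters; J and Z require motion

def classify(image: np.ndarray, model_name: str) -> dict:
    # Placeholder: a real implementation would resize `image` to 28x28
    # grayscale, normalize it, and run it through the selected network.
    scores = np.random.dirichlet(np.ones(len(LETTERS)))
    return {letter: float(s) for letter, s in zip(LETTERS, scores)}

demo = gr.Interface(
    fn=classify,
    inputs=[
        gr.Image(type="numpy", label="ASL letter image"),
        gr.Dropdown(["ResNet18", "ResNet50", "Custom CNN"], label="Model"),
    ],
    outputs=gr.Label(num_top_classes=5, label="Predicted letter"),
    title="Sign Language MNIST Dashboard",
)
demo.launch()  # serves at http://127.0.0.1:7860/ by default
```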
```
sign-language-recognition/
│
├── Extracted_SignLanguageMNIST/
│   ├── sign_mnist_train.csv
│   └── sign_mnist_test.csv
│
├── saved_models/
│   ├── trained_resnet18.pth
│   ├── trained_resnet50.pth
│   └── trained_custom.pth
│
├── main.py              # Main script to launch the dashboard
├── README.md            # Project documentation
└── requirements.txt     # List of dependencies
```
- Model Inference: The selected model processes the input image to predict the ASL letter.
- Confidence Visualization: Confidence scores for all classes are displayed as a bar chart (a sketch of both steps follows this list).
- Real-Time Updates: The dashboard updates predictions as you interact with it, providing an intuitive user experience.
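For illustration, here is a minimal sketch of the inference-plus-visualization step. It assumes a loaded PyTorch model and a preprocessed 1x28x28 tensor; the function and variable names are illustrative, not the repository's exact code:

```python
# Sketch: softmax confidences from a trained model, plotted with Plotly.
import torch
import torch.nn.functional as F
import plotly.graph_objects as go

LETTERS = list("ABCDEFGHIKLMNOPQRSTUVWXY")  # 24 static ASL letters

@torch.no_grad()
def predict_with_confidence(model: torch.nn.Module, x: torch.Tensor):
    """x: a (1, 1, 28, 28) normalized grayscale tensor."""
    model.eval()
    probs = F.softmax(model(x), dim=1).squeeze(0)         # per-class probabilities
    fig = go.Figure(go.Bar(x=LETTERS, y=probs.tolist()))  # confidence bar chart
    fig.update_layout(xaxis_title="Letter", yaxis_title="Confidence")
    return LETTERS[int(probs.argmax())], fig
```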
The models are trained on the Sign Language MNIST dataset. This dataset contains 28x28 grayscale images of ASL letters, excluding J and Z, as they involve motion.
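For reference, the CSVs store each image as a label followed by 784 flattened pixel values. A minimal loading sketch (the exact preprocessing in the training script may differ):

```python
# Sketch: load a Sign Language MNIST CSV into PyTorch tensors.
import pandas as pd
import torch

def load_split(csv_path: str):
    df = pd.read_csv(csv_path)
    labels = torch.tensor(df["label"].values, dtype=torch.long)
    # Reshape the 784 pixel columns into 1x28x28 images, scaled to [0, 1].
    pixels = torch.tensor(df.drop(columns=["label"]).values, dtype=torch.float32)
    images = pixels.view(-1, 1, 28, 28) / 255.0
    return images, labels

# Usage:
# train_x, train_y = load_split("Extracted_SignLanguageMNIST/sign_mnist_train.csv")
```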
- ResNet18: A deep residual network with 18 layers, designed to handle vanishing gradients effectively through residual connections (see the adaptation sketch after this list).
- ResNet50: A deeper, 50-layer variant of ResNet, suitable for large-scale image recognition tasks.
- Custom CNN: A lightweight convolutional neural network tailored to the ASL recognition task, built from convolutional layers, pooling layers, and fully connected layers (a sketch also follows).
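The repository's exact model setup isn't shown here; the following sketch illustrates one common way to adapt torchvision's ResNets to single-channel 28x28 input with 24 output classes. The specific layer replacements are assumptions, not the repo's code:

```python
# Sketch: adapt a torchvision ResNet to 1-channel grayscale input, 24 classes.
import torch.nn as nn
from torchvision import models

def make_resnet(depth: int = 18, num_classes: int = 24) -> nn.Module:
    model = models.resnet18() if depth == 18 else models.resnet50()
    # Accept single-channel grayscale images instead of 3-channel RGB.
    model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    # Replace the ImageNet classifier head with a 24-way ASL head.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```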
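And a rough illustration of the custom architecture described above; the layer sizes here are assumptions, not the trained model's exact configuration:

```python
# Sketch of a lightweight CNN for 28x28 grayscale ASL images.
import torch.nn as nn

class CustomCNN(nn.Module):
    def __init__(self, num_classes: int = 24):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 28x28 -> 14x14
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 14x14 -> 7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```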
This project was created as a cumulative project for the Deep Learning with PyTorch course at Fanshawe College. It demonstrates the application of deep learning techniques in accessibility-focused technology, providing a foundation for further research and development in sign language recognition.
- Add support for real-time camera input to recognize ASL letters.
- Implement heatmap visualizations (e.g., Grad-CAM) to interpret model predictions.
- Train models on larger datasets for improved accuracy.
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Commit your changes.
- Submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
For more information or to contribute to this project, please reach out:
- Author: Paige Berrigan
- GitHub: @paigeberrigan
- Email: paige@interweavemediagroup.ca