A repository for Group 4's Applied Machine Learning project.
Sign language is a vital form of communication for individuals who are deaf or hard of hearing, but the diversity and complexity of sign languages pose significant challenges to automatic recognition. With over 300 sign languages in use worldwide, each combining unique hand motions, facial expressions, and body movements, building robust and accurate sign language recognition (SLR) systems is a demanding task. SLR systems aim to bridge the communication gap between hearing and deaf or hard-of-hearing individuals, enabling easier social interaction and inclusion. Recent advances in machine learning, particularly in computer vision, have significantly improved the capabilities of SLR systems. Challenges remain, however, including accurately recognizing hand gestures under varying lighting conditions, occlusions, and hand shapes. Despite these obstacles, techniques such as Convolutional Neural Networks (CNNs) have shown substantial promise in improving SLR performance.
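As a concrete illustration of the kind of CNN referenced above, the sketch below defines a small image classifier. PyTorch is an assumption about the framework, and the layer sizes are illustrative only, not the project's actual architecture; it maps a 3x200x200 image to 29 class logits, matching the ASL Alphabet classes described below.

```python
# A minimal CNN sketch (illustrative, not the project's architecture).
import torch
import torch.nn as nn

class SmallSignCNN(nn.Module):
    def __init__(self, num_classes: int = 29):
        super().__init__()
        # Two conv blocks extract spatial features from the hand image.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Global pooling plus a linear layer produces one logit per class.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

logits = SmallSignCNN()(torch.randn(1, 3, 200, 200))
print(logits.shape)  # torch.Size([1, 29])
```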
The ASL Alphabet Dataset comprises images of hand gestures representing the letters of the American Sign Language (ASL) alphabet. It is organized into 29 folders, one per class: the 26 letters of the alphabet plus three additional classes, SPACE, DELETE, and NOTHING. The training set contains 87,000 images, each 200x200 pixels, with a well-balanced distribution across the 29 classes. The test set, by contrast, contains only 29 images, one per class, which encourages evaluating models on additional real-world images.
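Because the dataset uses a folder-per-class layout, it can be loaded directly with torchvision's `ImageFolder`, as in the sketch below. The directory path `asl_alphabet_train/` is an assumption about where the archive is extracted.

```python
# A minimal sketch of loading the ASL Alphabet training set with torchvision.
# ImageFolder maps each of the 29 class folders (A-Z, SPACE, DELETE, NOTHING)
# to an integer label automatically.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((200, 200)),  # images are already 200x200; resize is a safeguard
    transforms.ToTensor(),
])

train_set = datasets.ImageFolder("asl_alphabet_train/", transform=transform)  # path is an assumption
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

print(len(train_set))     # expected: 87000
print(train_set.classes)  # 29 class names derived from the folder names
```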
The ASL Hand Gesture Digits Dataset focuses on recognizing numerical gestures, with images of hand signs for the digits 0 through 9. Each image is grayscale at 400x400 pixels, providing high-quality input for computer vision tasks. The training set contains 570 images (57 per class) and the test set contains 130 images (13 per class), so both splits are evenly balanced across all ten digit classes.
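Assuming the same folder-per-class layout (directories `0` through `9` under a path such as `asl_digits_train/`, both assumptions), the digits data can be loaded the same way; the sketch below adds a grayscale transform to preserve the single-channel 400x400 input described above.

```python
# A hedged sketch of loading the digits dataset with torchvision.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),  # keep a single channel
    transforms.ToTensor(),                        # yields a 1x400x400 tensor in [0, 1]
])

train_set = datasets.ImageFolder("asl_digits_train/", transform=transform)  # path is an assumption
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
```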
The Word-Level American Sign Language (WLASL) dataset is the largest video-based dataset available for ASL recognition, featuring 2,000 common words used in everyday ASL. The inclusion of video data makes this dataset essential for extending recognition from static images to dynamic gestures, allowing models to capture the temporal aspect of sign language.
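To feed that temporal information to a model, each clip must be turned into a sequence of frames. The sketch below samples a fixed number of evenly spaced frames from one video with OpenCV, producing a `(T, H, W, C)` array; the file path and frame count are placeholders, and the `.mp4` format is an assumption about how the clips are stored.

```python
# A minimal sketch of frame sampling for a WLASL video clip.
import cv2
import numpy as np

def sample_frames(video_path: str, num_frames: int = 16) -> np.ndarray:
    """Read num_frames evenly spaced RGB frames from a video file."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num_frames, dtype=int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if ok:
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # OpenCV reads BGR
    cap.release()
    return np.stack(frames)

clip = sample_frames("wlasl/videos/example.mp4")  # hypothetical path
print(clip.shape)  # e.g. (16, H, W, 3)
```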