Real-Time Sign Language Detection

Overview

This project detects sign language numbers (0-9) from hand gestures in real time. It captures frames from a webcam, uses MediaPipe to estimate the hand landmarks, and feeds the landmark positions to a trained machine learning model (Random Forest) that predicts the corresponding number.

Sign Language Numbers (0-9)

[Image: hand signs for the numbers 0-9]

Demo Video

[Video: real-time detection demo]

Requirements

Make sure you have the following dependencies installed:

MediaPipe: 0.10.14
OpenCV (cv2): 4.10.0
Scikit-learn: 1.5.2

You can install them via pip:

pip install mediapipe==0.10.14 opencv-python==4.10.0.84 scikit-learn==1.5.2
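
After installing, it can help to confirm that the environment matches the pinned versions. The snippet below is a minimal sketch using only the standard library; the `PINNED` table and the `check_versions` helper are hypothetical names introduced here, not part of the repository. OpenCV wheels on PyPI carry a fourth build component, so the check compares by prefix.

```python
from importlib import metadata

# Versions listed in the Requirements section, keyed by PyPI distribution name
# (opencv-python is the distribution that provides the cv2 module).
PINNED = {
    "mediapipe": "0.10.14",
    "opencv-python": "4.10.0",
    "scikit-learn": "1.5.2",
}


def check_versions(pinned=PINNED):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Accept an exact match or a longer version with the same prefix,
        # e.g. an opencv-python build such as 4.10.0.84 for the pin 4.10.0.
        ok = installed is not None and (
            installed == wanted or installed.startswith(wanted + ".")
        )
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        print(f"{name}: installed={installed} matches_pin={ok}")
```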

Dataset

I collected the dataset with my webcam, capturing 1000 images for each class (numbers 0-9) with the collect_images notebook. I then processed these images with the create_dataset notebook, which extracts the x and y coordinates of the hand landmarks estimated by MediaPipe. Each landmark array was labeled with its number, and the result was saved as a pickle file.
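
The extraction step could look roughly like the sketch below. It is not the code from create_dataset; it assumes one sub-folder per class (for example `data/0`, `data/1`, ...), keeps only the first detected hand, and stores the result under the hypothetical keys `"data"` and `"labels"`. The function names are illustrative. MediaPipe and OpenCV are imported lazily so the small `flatten_landmarks` helper can be used without them.

```python
import os
import pickle


def flatten_landmarks(points):
    """Turn [(x0, y0), (x1, y1), ...] into [x0, y0, x1, y1, ...]."""
    features = []
    for x, y in points:
        features.extend([float(x), float(y)])
    return features


def extract_hand_points(image_path, hands):
    """Return the (x, y) landmarks of the first detected hand, or None."""
    import cv2  # lazy import: only needed when reading real images

    image = cv2.imread(image_path)
    if image is None:  # unreadable or non-image file
        return None
    # MediaPipe expects RGB input, while OpenCV loads images as BGR.
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    hand = results.multi_hand_landmarks[0]
    return [(lm.x, lm.y) for lm in hand.landmark]


def build_dataset(data_dir, out_path):
    """Walk data_dir/<label>/<image> and pickle the features and labels."""
    import mediapipe as mp  # lazy import, see note above

    data, labels = [], []
    with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
        for label in sorted(os.listdir(data_dir)):
            class_dir = os.path.join(data_dir, label)
            if not os.path.isdir(class_dir):
                continue
            for name in sorted(os.listdir(class_dir)):
                points = extract_hand_points(os.path.join(class_dir, name), hands)
                if points is not None:  # skip images with no detected hand
                    data.append(flatten_landmarks(points))
                    labels.append(label)
    with open(out_path, "wb") as f:
        pickle.dump({"data": data, "labels": labels}, f)
    return len(data)
```

MediaPipe reports 21 landmarks per hand, so each sample in this sketch is a vector of 42 values.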

One of the main challenges in building the dataset was getting accurate predictions across different hand orientations, distances from the webcam, and positions in the frame. I addressed this by capturing images under varied conditions, but the model's accuracy could still improve with more frames taken at different distances and positions.
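
A complementary technique, not described in this README and shown here only as an assumption, is to normalize the landmarks before training so that position in the frame and distance from the camera matter less. The hypothetical `normalize_points` helper below shifts the hand to the origin of its bounding box and divides by the longer side of that box. It does not compensate for rotation, so varied orientations would still need to be covered by the data.

```python
def normalize_points(points):
    """Translate landmarks to their bounding-box origin and scale by its longer side."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    # Longer side of the bounding box; guard against a degenerate (single-point) hand.
    scale = max(max(xs) - min_x, max(ys) - min_y)
    if scale == 0:
        scale = 1.0
    return [((x - min_x) / scale, (y - min_y) / scale) for x, y in points]


# The same hand shape, moved and shrunk in the frame, maps to the same features.
base = [(0.2, 0.2), (0.4, 0.3), (0.3, 0.6)]
moved = [(x * 0.5 + 0.3, y * 0.5 + 0.1) for x, y in base]
```

If a step like this is used, the same normalization has to be applied both when building the dataset and at prediction time.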

The complete dataset is around 6 GB, so it is not uploaded here, but you can collect your own dataset with the collect_images and create_dataset notebooks.

Model

The model is a Random Forest classifier, which achieved over 99% accuracy. It was trained with the train_classifier notebook on the coordinates extracted from the hand landmarks, and the trained model is saved as a .p file for later predictions.
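
As a rough illustration of the training step, the sketch below fits a scikit-learn `RandomForestClassifier` on a held-out split and reports test accuracy. The 80/20 split, the default hyperparameters, the `"model"` pickle key, and the function names are assumptions for this example, not values taken from train_classifier. It also assumes every sample has the same length (one hand per image).

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_model(data, labels, seed=0):
    """Fit a Random Forest and return (model, accuracy on a held-out 20% split)."""
    X = np.asarray(data, dtype=float)
    y = np.asarray(labels)
    # Stratify so every class appears in both the training and the test split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, shuffle=True, stratify=y, random_state=seed
    )
    model = RandomForestClassifier(random_state=seed)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


def save_model(model, path="model.p"):
    """Pickle the trained model (file name and dict key are illustrative)."""
    with open(path, "wb") as f:
        pickle.dump({"model": model}, f)
```

Measuring accuracy on frames the model has not seen, ideally from a separate recording session, gives a more realistic picture than a random split of near-identical consecutive frames.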

Usage

To use the system, run the main.ipynb notebook. It captures video from your webcam, uses MediaPipe for hand pose estimation, and passes the extracted landmark coordinates to the saved Random Forest model. The model predicts the number from the hand gesture, and the result is displayed on a box drawn around the hand.

Prediction runs in real time: each frame shows a number from 0 to 9 based on your hand gesture and its position in that frame.
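
The loop described above can be sketched as follows. This is a simplified stand-in for main.ipynb, not its actual contents: the model path `model.p`, the `"model"` pickle key, the window title, and the helper names are assumptions. The feature vector must be built exactly as it was for training.

```python
import pickle


def pixel_box(points, width, height, pad=10):
    """Convert normalized (x, y) landmarks to a padded pixel box clamped to the frame."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x1 = max(int(min(xs) * width) - pad, 0)
    y1 = max(int(min(ys) * height) - pad, 0)
    x2 = min(int(max(xs) * width) + pad, width - 1)
    y2 = min(int(max(ys) * height) + pad, height - 1)
    return x1, y1, x2, y2


def run(model_path="model.p", camera_index=0):
    """Show webcam frames with the predicted number; press q to quit."""
    import cv2  # lazy imports: only needed when a camera is actually used
    import mediapipe as mp

    with open(model_path, "rb") as f:
        model = pickle.load(f)["model"]

    cap = cv2.VideoCapture(camera_index)
    with mp.solutions.hands.Hands(max_num_hands=1) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            height, width = frame.shape[:2]
            # MediaPipe expects RGB input, while OpenCV delivers BGR frames.
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                hand = results.multi_hand_landmarks[0]
                points = [(lm.x, lm.y) for lm in hand.landmark]
                features = [value for point in points for value in point]
                label = str(model.predict([features])[0])
                x1, y1, x2, y2 = pixel_box(points, width, height)
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 0), 2)
                cv2.putText(frame, label, (x1, max(y1 - 10, 20)),
                            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 0), 2)
            cv2.imshow("Sign language numbers", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    cap.release()
    cv2.destroyAllWindows()
```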

License

This project is licensed under the MIT License. Feel free to use it in your projects or contribute to improve it.
