A computer vision project for detecting hands and recognizing hand gestures using YOLO.
This project implements a two-phase approach:
- Phase 1: Hand detection - Detect hands in images
- Phase 2: Gesture recognition - Classify specific hand gestures (👌, 👍, ✌️, etc.)
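At inference time the two phases chain together: Phase 1 produces hand bounding boxes, each box is cropped out of the frame, and Phase 2 classifies the crop. A minimal sketch of that hand-off (the helper and the `detector`/`classifier` call signatures are assumptions for illustration, not the project's actual API):

```python
def clamp_box(box, width, height):
    """Clamp an (x1, y1, x2, y2) box to the image bounds."""
    x1, y1, x2, y2 = box
    return (max(0, int(x1)), max(0, int(y1)),
            min(width, int(x2)), min(height, int(y2)))

def detect_then_classify(frame, detector, classifier):
    """Phase 1: find hands; Phase 2: classify each cropped hand."""
    h, w = frame.shape[:2]
    results = []
    for box in detector(frame):              # hypothetical: yields (x1, y1, x2, y2)
        x1, y1, x2, y2 = clamp_box(box, w, h)
        crop = frame[y1:y2, x1:x2]           # crop the detected hand region
        results.append((box, classifier(crop)))
    return results
```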
```bash
# Start interactive data collection
python collect_data.py

# Check collected data statistics
python collect_data.py --stats
```

Collect 50-100 images per gesture with varied:
- Hand positions and angles
- Distances from camera
- Lighting conditions
- Backgrounds
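The internals of `collect_data.py` aren't shown here; a minimal sketch of what such a collection loop typically looks like with OpenCV (the function names and key bindings are assumptions):

```python
import os

def image_path(root, gesture, index):
    """Build a data/raw/<gesture>/<gesture>_<index>.jpg-style path."""
    return os.path.join(root, gesture, f"{gesture}_{index:04d}.jpg")

def collect(gesture, root="data/raw", target=100):
    """Save webcam frames on SPACE until `target` images are captured."""
    import cv2  # opencv-python; imported lazily so the path helper has no deps
    os.makedirs(os.path.join(root, gesture), exist_ok=True)
    cap = cv2.VideoCapture(0)          # default webcam
    saved = 0
    while saved < target:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imshow("collect", frame)
        key = cv2.waitKey(1) & 0xFF
        if key == ord(" "):            # SPACE saves the current frame
            cv2.imwrite(image_path(root, gesture, saved), frame)
            saved += 1
        elif key == ord("q"):          # q quits early
            break
    cap.release()
    cv2.destroyAllWindows()
```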
```bash
# Phase 1: Train hand detector
python train_hand_detector.py hand --epochs 30

# Phase 2: Train gesture classifier (after hand detector is ready)
python train_hand_detector.py gesture --epochs 50

# Or train both phases
python train_hand_detector.py both
```

The project includes a ready-to-deploy Gradio app for Hugging Face Spaces:
- Create a new Space on Hugging Face
- Upload your trained models to the `models/` directory
- Copy the contents of `deployment/huggingface/` to your Space
- The app will automatically load your models
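The actual `deployment/huggingface/app.py` isn't reproduced here; a minimal sketch of a Space that loads the trained detector (model path from the layout below; the label formatter is an illustrative assumption):

```python
def format_label(gesture, confidence):
    """Readable label for the demo, e.g. 'thumbs_up (93%)'."""
    return f"{gesture} ({confidence:.0%})"

def build_app():
    """Wire the trained weights into a simple Gradio image interface."""
    import gradio as gr                    # lazy imports: heavy dependencies
    from ultralytics import YOLO
    detector = YOLO("models/hand_detector_v1.pt")

    def predict(image):
        result = detector(image)[0]        # result for the single input image
        return result.plot()[..., ::-1]    # annotated frame, BGR -> RGB

    return gr.Interface(fn=predict, inputs=gr.Image(type="numpy"),
                        outputs=gr.Image())

if __name__ == "__main__":
    build_app().launch()
```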
```
hand-sign-detection/
├── collect_data.py          # Webcam data collection tool
├── train_hand_detector.py   # Training pipeline
├── models/                  # Trained models (auto-created)
│   ├── hand_detector_v1.pt
│   └── gesture_classifier_v1.pt
├── data/                    # Training data
│   └── raw/                 # Collected images organized by gesture
│       ├── ok/
│       ├── thumbs_up/
│       ├── peace/
│       └── ...
├── deployment/
│   └── huggingface/         # Hugging Face deployment
│       ├── app.py           # Gradio interface
│       └── requirements.txt
└── claude.scratchpad.md     # Experiment tracking
```
- 👌 OK sign (`ok`)
- 👍 Thumbs up (`thumbs_up`)
- ✌️ Peace sign (`peace`)
- ✊ Fist (`fist`)
- 👉 Pointing (`point`)
- 🤘 Rock sign (`rock`)
- 👋 Wave (`wave`)
- ✋ Stop (`stop`)
- 🖐️ Open hand (`hand`)
- Background/No hand (`none`)
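The folder names above double as class labels, so it helps to keep them in one place to avoid typos between collection and training (a convention sketch, not the project's actual code):

```python
# Class labels matching the data/raw/ folder names above.
GESTURES = ["ok", "thumbs_up", "peace", "fist", "point",
            "rock", "wave", "stop", "hand", "none"]

def label_to_index(label):
    """Map a folder name to its class index for training."""
    return GESTURES.index(label)
```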
```bash
pip install -r requirements.txt
```

Main dependencies:
- `ultralytics` - YOLO implementation
- `opencv-python` - Image processing
- `gradio` - Web interface
- `torch` - Deep learning framework
Once deployed to Hugging Face, your demo will have:
- Live webcam input
- Image upload
- Real-time hand detection with bounding boxes
- Gesture classification with confidence scores
- Interactive demo interface
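Bounding boxes and confidence scores come straight off the YOLO result object; a sketch of pulling them out, assuming the ultralytics results API (the threshold helper is an illustrative assumption):

```python
def filter_by_confidence(pairs, threshold=0.5):
    """Keep (box, confidence) pairs at or above the threshold."""
    return [(box, conf) for box, conf in pairs if conf >= threshold]

def extract_detections(result, conf_threshold=0.5):
    """Pull (xyxy box, confidence) pairs off an ultralytics result."""
    pairs = [(box.xyxy[0].tolist(), float(box.conf[0]))
             for box in result.boxes]
    return filter_by_confidence(pairs, conf_threshold)
```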
- Data Quality: More diverse data > more epochs
- Balanced Dataset: Collect similar amounts for each gesture
- Include Negatives: Collect "none" class (no hands) to reduce false positives
- Test Incrementally: Train for a few epochs first to validate the approach
- Monitor Training: Watch for overfitting (val loss increasing)
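With ultralytics, most of these tips map directly to training arguments; a sketch of a validation-monitored run plus a simple overfitting check (the dataset YAML path and window size are assumptions):

```python
def is_overfitting(val_losses, window=3):
    """True if validation loss rose for `window` consecutive epochs."""
    if len(val_losses) < window + 1:
        return False
    recent = val_losses[-(window + 1):]
    return all(recent[i] < recent[i + 1] for i in range(window))

def train_with_monitoring(data_yaml="data/hand.yaml", epochs=30):
    """Short run with early stopping when validation metrics plateau."""
    from ultralytics import YOLO       # lazy import: heavy dependency
    model = YOLO("yolov8n.pt")         # small pretrained base model
    return model.train(data=data_yaml, epochs=epochs,
                       patience=10)    # stop if no val improvement
```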
- Add more gesture types
- Implement real-time video processing
- Add hand tracking (not just detection)
- Create mobile app version
- Add gesture sequence recognition
MIT