- Rutuja Ajay Kolte - [email protected]
- Mahek Ajay Salia - [email protected]
- Reshmika Sreenath Nambiar - [email protected]
- Prerna Jagesia - [email protected]
- Anuj Raghani
- Bhavya Sheth
- Owais Hetavkar
- Vedant Paranjape
The goal of our project was to train a machine learning model capable of classifying images of different hand gestures (such as a fist or a palm) and to use it for gesture detection and recognition.
We have used the Hand Gesture Recognition Database from Kaggle.
- First, the images are loaded from proj.zip
- Each image is resized and converted to grayscale. The images are stored in array X and their labels in array Y.
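The notebook's exact loading code is not reproduced in this write-up; a minimal sketch of this step, assuming proj.zip unpacks into the Kaggle dataset's per-gesture subfolders (named like 01_palm, 02_l, ..., 10_down), might look like this:
import os
import zipfile
import cv2
import numpy as np

with zipfile.ZipFile('proj.zip', 'r') as zf:
    zf.extractall('data')                                                  # Unpack the Kaggle dataset

X, Y = [], []
for root, dirs, files in os.walk('data'):
    for name in files:
        if not name.endswith('.png'):
            continue
        img = cv2.imread(os.path.join(root, name), cv2.IMREAD_GRAYSCALE)   # Load directly as grayscale
        img = cv2.resize(img, (320, 120))                                  # (width, height) -> 120x320 array
        X.append(img)
        # "01_palm" -> 1, ..., "09_c" -> 9, "10_down" -> 0, matching the gesture tuple used later
        Y.append(int(os.path.basename(root).split('_')[0]) % 10)
X = np.array(X).reshape(-1, 120, 320, 1)
Y = np.array(Y)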
- The model is constructed using TensorFlow and Keras.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
# Three convolution + max-pooling stages extract features from the 120x320 grayscale input
model.add(Conv2D(32, (5, 5), activation='relu', input_shape=(120, 320, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# Fully connected classifier head with one output per gesture class
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
- The model is then configured and trained.
# Sparse categorical cross-entropy matches the integer labels stored in Y
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# Note: the same arrays are reused as validation data, so the reported
# validation accuracy is measured on images the model has already seen
model.fit(X, Y, epochs=5, batch_size=64, verbose=2, validation_data=(X, Y))
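If a held-out validation set is preferred, a split along these lines (using scikit-learn, which is not part of the original notebook) could be used instead:
from sklearn.model_selection import train_test_split

# Hold out 20% of the data so validation accuracy reflects unseen images
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2, random_state=42)
model.fit(X_train, Y_train, epochs=5, batch_size=64, verbose=2, validation_data=(X_val, Y_val))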
- The model is then saved as an HDF5 file.
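Saving is a single Keras call; using the filename referenced in the next section, it would look like this:
model.save('GestureRecognition.h5')   # Writes the architecture and trained weights to an HDF5 file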
- The trained model is loaded from GestureRecognition.h5.
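Loading it back in the detection notebook is the mirror-image call:
from tensorflow.keras.models import load_model

model = load_model('GestureRecognition.h5')   # Restores the trained CNN for inference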
- An image is captured from the webcam.
- It is resized and converted to grayscale.
- The running average method is used for background subtraction.
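The loop below relies on a take_photo() helper defined elsewhere in the notebook; it is not reproduced in this write-up. A version adapted from Colab's standard camera-capture snippet, which opens the webcam in the browser and saves one frame to /content/photo.jpg, looks roughly like this:
from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode

def take_photo(filename='photo.jpg', quality=0.8):
    js = Javascript('''
      async function takePhoto(quality) {
        const div = document.createElement('div');
        const capture = document.createElement('button');
        capture.textContent = 'Capture';
        div.appendChild(capture);

        const video = document.createElement('video');
        video.style.display = 'block';
        const stream = await navigator.mediaDevices.getUserMedia({video: true});

        document.body.appendChild(div);
        div.appendChild(video);
        video.srcObject = stream;
        await video.play();

        // Resize the output area to fit the video element
        google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

        // Wait until the Capture button is clicked
        await new Promise((resolve) => capture.onclick = resolve);

        const canvas = document.createElement('canvas');
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        canvas.getContext('2d').drawImage(video, 0, 0);
        stream.getVideoTracks()[0].stop();
        div.remove();
        return canvas.toDataURL('image/jpeg', quality);
      }
      ''')
    display(js)
    data = eval_js('takePhoto({})'.format(quality))   # Runs the JS above and returns a data URL
    binary = b64decode(data.split(',')[1])            # Decodes the base64 JPEG payload
    with open(filename, 'wb') as f:
        f.write(binary)                               # Saves the frame to /content/photo.jpg
    return filename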
import cv2
import numpy as np
from google.colab.patches import cv2_imshow

filename = take_photo()                     # Capture an initial frame to seed the running average
img = cv2.imread('/content/photo.jpg')
averageValue1 = np.float32(img)

while True:
    try:
        filename = take_photo()                                 # Captures a new frame from the webcam
        img = cv2.imread('/content/photo.jpg')                  # Reads the picture taken from the webcam
        cv2.accumulateWeighted(img, averageValue1, 0.02)        # Updates the running average of the background
        resultingFrames1 = cv2.convertScaleAbs(averageValue1)   # Converts the accumulator back to an 8-bit image
        cv2_imshow(resultingFrames1)                            # Background estimated by the running average method
        m = cv2.subtract(img, resultingFrames1)                 # Foreground = original image minus estimated background
    except Exception as err:
        # Errors are thrown if the user has no webcam or does not grant the page permission to access it
        print(str(err))
- The model then predicts the gesture and prints it.
gesture = ("down", "palm", "l", "fist", "fist_moved", "thumb", "index", "ok", "palm_moved", "c")
prediction = model.predict(np.expand_dims(m, axis = 0)) # Makes predictions
ans = np.argmax(prediction[0])
print(prediction[0][ans]) # Prints probability of prediction
print(gesture[ans]) # Prints predicted gesture
- GitHub repo link: https://github.com/Rutuja-Kolte/CodeBrewers
- Drive link: Drive link here
Tools and technologies that we learnt and used in the project:
- Python
- OpenCV and CNNs
- Jupyter notebook
- Machine learning
- Clone the CodeBrewers repository
git clone https://github.com/Rutuja-Kolte/CodeBrewers
- Open Google Drive and create a folder named CodeBrewers.
- Upload all the files from the cloned CodeBrewers repository to this folder.
- Also add the dataset downloaded from Kaggle to the same folder and name it proj.zip (a Colab cell for mounting Drive and extracting the dataset is sketched after these instructions).
- Alternatively, clone the CodeBrewers repository
git clone https://github.com/Rutuja-Kolte/CodeBrewers
- Then go to the Drive link above, copy the folder, and save it to your own Drive.
- Right-click on the CodeBrewers.ipynb file in Google Drive.
- Click on "Open with" and choose Google Colab.
- Run the code.
- Alternatively, open CodeBrewers.ipynb from the CodeBrewers repository directly in Google Colab and run the code.
- Right-click on the GestureDetector.ipynb file in Google Drive.
- Click on "Open with" and choose Google Colab.
- Run the code.
- Alternatively, open GestureDetector.ipynb from the CodeBrewers repository directly in Google Colab and run the code.
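When the notebooks run in Colab, the Drive folder created during setup is typically mounted and proj.zip extracted with a cell along these lines (the paths are assumptions based on the folder name used above):
from google.colab import drive

drive.mount('/content/drive')   # Prompts for authorisation and mounts your Google Drive
!unzip -q /content/drive/MyDrive/CodeBrewers/proj.zip -d /content/   # Extracts the Kaggle dataset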
- Touchless user interfaces are an emerging technology related to gesture control. One type of touchless interface uses a smartphone's Bluetooth connectivity to activate a company's visitor management system, avoiding the need to touch a shared surface during the COVID-19 pandemic.
- Hand gesture recognition is valuable for sign language recognition and for building sign language interpreters for people with hearing or speech impairments.
- In crane operation, gestures could replace remote controls, making it easier to pick up and set down loads at difficult locations.
The project could be linked to a media player such as VLC, with gestures used to control playback, e.g. increasing or decreasing the volume or fast-forwarding and rewinding the video (see the sketch below). Gestures could also be used to control the mouse pointer instead of a physical mouse.
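As a rough illustration (not part of the current project), the predicted gesture could be mapped to media-player keyboard shortcuts with a library such as pyautogui; the key bindings below are hypothetical and would need to match the player's shortcut configuration:
import pyautogui

# Hypothetical gesture-to-hotkey mapping for a media player
actions = {
    'thumb': lambda: pyautogui.hotkey('ctrl', 'up'),    # volume up
    'down':  lambda: pyautogui.hotkey('ctrl', 'down'),  # volume down
    'l':     lambda: pyautogui.press('right'),          # seek forward
    'c':     lambda: pyautogui.press('left'),           # seek backward
    'palm':  lambda: pyautogui.press('space'),          # play / pause
}

predicted = gesture[ans]      # Predicted label from the detection code above
if predicted in actions:
    actions[predicted]()      # Send the corresponding keystroke to the focused player window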
Currently, the model cannot tell when no gesture is present; every frame is forced into one of the ten classes. This functionality could be added as well, for example by thresholding the prediction confidence, as sketched below.
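One simple way to approximate this, sketched below with an arbitrary threshold, is to reject low-confidence predictions; note that a softmax threshold alone is crude, since the model never saw a "no gesture" class during training.
CONFIDENCE_THRESHOLD = 0.8                                  # Arbitrary cutoff; would need tuning on real data

prediction = model.predict(gray.reshape(1, 120, 320, 1))    # Same preprocessed input as in the detection code above
ans = np.argmax(prediction[0])
if prediction[0][ans] < CONFIDENCE_THRESHOLD:
    print("no gesture detected")                            # Low-confidence predictions are rejected
else:
    print(gesture[ans])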
The project currently uses only static gestures. It could be extended to include dynamic gestures (swiping a fist to the right or left, moving a finger up and down, etc.).