Video Hand Gestures Recognition Using Depth Camera and Lightweight CNN

Published in : IEEE Sensors Journal (Volume: 22, Issue: 14, July 2022)
Authors : David González León, Jade Gröli, Sreenivasa Reddy Yeduri, Daniel Rossier, Romuald Mosqueron, Om Jee Pandey, Linga Reddy Cenkeramaddi
DOI : https://doi.org/10.1109/JSEN.2022.3181518
Summary Contributed by: Linga Reddy Cenkeramaddi (Author)

Hand gesture recognition has gained significant interest in the areas of Human-Computer Interaction, Psychology and Neuroscience, Sign Language Recognition and Translation, Anthropology and Linguistics, Software Engineering, Robotics and Artificial Intelligence (AI), Education, Medical and Rehabilitation Sciences, Music, and Performing Arts.

Many works have been proposed in the literature for hand gesture recognition. Traditional methods of hand gesture recognition often rely on RGB data, which can be limited in accurately capturing hand movements and positions due to challenges such as varying lighting conditions and occlusions. In contrast, depth cameras provide more reliable spatial information by capturing depth data, enabling more accurate and robust gesture recognition.

This paper introduces a methodology employing a convolutional neural network (CNN) model for video-based hand gesture recognition.

Video-based gesture recognition makes human-computer interaction more natural and flexible, enriching interactive experiences in teaching, gaming, and control systems. CNNs have shown remarkable success in various computer vision tasks, including image recognition, object detection, and segmentation.

However, deploying CNNs for real-time applications can be challenging due to computational constraints, especially in embedded systems or devices with limited processing power.

To address this challenge, the authors propose a lightweight deep CNN model designed explicitly for accurate, reliable, and robust hand gesture classification from video sequences. The architecture combines convolutional, max-pooling, batch normalization, flatten, dense, and dropout layers for efficient feature extraction, dimensionality reduction, and overfitting prevention. Performance was evaluated using accuracy, loss, and confusion matrices, with the model trained under different frame sampling methods.
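To make the evaluation metrics concrete, the following is a minimal sketch of how a confusion matrix and the overall accuracy can be computed from true and predicted class indices. It is not the authors' evaluation code: the function names `confusion_matrix` and `accuracy`, the three-class setup, and the label values are all made up for illustration.

```python
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> List[List[int]]:
    """Count predictions per class: rows are true classes, columns are predicted."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_cls, pred_cls in zip(y_true, y_pred):
        matrix[true_cls][pred_cls] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Overall accuracy: correct predictions (the diagonal) over all predictions."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Illustrative labels for three hypothetical classes (not data from the paper).
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 0, 1, 2, 2, 2, 2, 1]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(cm)
print(cm)   # one row per true class
print(acc)  # fraction of correctly classified samples
```

Off-diagonal entries show which classes are confused with each other, which a single accuracy figure hides.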

The authors created a dataset of videos for six computer mouse actions: scroll up, scroll down, scroll left, scroll right, zoom in, and zoom out. The dataset comprises 762 gesture sequences captured using an RGB-D camera with specific field-of-view dimensions, each recorded as both an RGB video and a depth video. A custom Python script was used to record and save the video sequences from both streams.

The performance of the CNN model is evaluated for different video frame lengths, with results reported as accuracy, loss, and confusion matrices. The proposed model achieves an overall accuracy of 99.18%, 99.04%, and 99.16% when using all frames, 1 of every 2 frames, and 1 of every 4 frames in the video, respectively. The paper also presents 10-fold cross-validation results for the different video frame lengths on both RGB and depth videos. Finally, the accuracy of the proposed lightweight CNN model is compared with that of state-of-the-art hand gesture classification models.
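The three frame settings (all frames, 1 of 2 frames, 1 of 4 frames) can be illustrated with a simple strided selection. This is a sketch under the assumption that sampling keeps every k-th frame starting from the first; the summary does not state the exact scheme, and the helper name `subsample_frames` is hypothetical.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def subsample_frames(frames: Sequence[Frame], step: int) -> List[Frame]:
    """Keep one frame out of every `step` frames, starting with the first."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(frames[::step])


# Stand-in for a decoded clip: frame indices instead of image arrays.
clip = list(range(12))

all_frames = subsample_frames(clip, 1)   # every frame
one_of_two = subsample_frames(clip, 2)   # 1 of 2 frames
one_of_four = subsample_frames(clip, 4)  # 1 of 4 frames

print(len(all_frames), len(one_of_two), len(one_of_four))
```

Fewer frames per clip means less input for the network to process, which is why the near-identical accuracies across the three settings matter for devices with limited processing power.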

Overall, the paper contributes to the advancement of video-based gesture recognition through the creation of a comprehensive dataset and the development of an efficient CNN model. The results demonstrate promising accuracy and reliability, paving the way for further applications in real-world scenarios requiring precise hand gesture recognition.
