Searching for just a few words should be enough to get started. If you need to make more complex queries, use the tips below to guide you.
Article type: Research Article
Authors: Rubin Bose, S.; * | Sathiesh Kumar, V.
Affiliations: Department of Electronics Engineering, Madras Institute of Technology Campus, Anna University, Chennai, India
Correspondence: [*] Corresponding author. S. Rubin Bose, Department of Electronics Engineering, Madras Institute of Technology Campus, Anna University, Chennai, India. E-mail: [email protected].
Abstract: The real-time perception of hand gestures in a deprived environment is a demanding machine vision task. The hand recognition operations are more strenuous with different illumination conditions and varying backgrounds. Robust recognition and classification are the vital steps to support effective human-machine interaction (HMI), virtual reality, etc. In this paper, the real-time hand action recognition is performed by using an optimized Deep Residual Network model. It incorporates a RetinaNet model for hand detection and a Depthwise Separable Convolutional (DSC) layer for precise hand gesture recognition. The proposed model overcomes the class imbalance problems encountered by the conventional single-stage hand action recognition algorithms. The integrated DSC layer reduces the computational parameters and enhances the recognition speed. The model utilizes a ResNet-101 CNN architecture as a Feature extractor. The model is trained and evaluated on the MITI-HD dataset and compared with the benchmark datasets (NUSHP-II, Senz-3D). The network achieved a higher Precision and Recall value for an IoU value of 0.5. It is realized that the RetinaNet-DSC model using ResNet-101 backbone network obtained higher Precision (99.21 %for AP0.5, 96.80%for AP0.75) for MITI-HD Dataset. Higher performance metrics are obtained for a value of γ= 2 and α= 0.25. The SGD with a momentum optimizer outperformed the other optimizers (Adam, RMSprop) for the datasets considered in the studies. The prediction time of the optimized deep residual network is 82 ms.
Keywords: Hand gesture recognition, RetinaNet, ResNet-101, CNN, human machine interaction
DOI: 10.3233/JIFS-210875
Journal: Journal of Intelligent & Fuzzy Systems, vol. 41, no. 6, pp. 6983-6997, 2021
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
[email protected]
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office [email protected]
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
如果您在出版方面需要帮助或有任何建, 件至: [email protected]