In this work, a real-time hand-gesture-recognition-based human-computer interface (HCI) is presented. The system consists of six stages: (1) hand detection, (2) gesture segmentation, (3) transfer learning with six pre-trained CNN models, (4) construction of an interactive human-machine interface, (5) development of a gesture-controlled virtual mouse, and (6) Kalman-filter estimation of the hand position, which improves the smoothness of the pointer motion. Six pre-trained convolutional neural network (CNN) models (VGG16, VGG19, ResNet50, ResNet101, Inception-V1, and MobileNet-V1) are used to classify hand-gesture images, and their performance is evaluated on three multi-class datasets (two public and one custom). Among the six models, Inception-V1 shows significantly better classification performance than the other five in terms of accuracy, precision, recall, and F-score. The gesture recognition system is further extended to control multimedia applications (e.g., the VLC player, an audio player, file management, and the 2D Super Mario Bros game) with customized gesture commands in real-time scenarios. The system reaches an average speed of 35 frames per second (fps), which meets the requirements of real-time use.
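Stage (6) above smooths the virtual-mouse pointer by estimating the hand position with a Kalman filter. The following is a minimal sketch of such a filter under a constant-velocity motion model; the noise parameters `q` and `r`, the class name, and the time step (derived from the reported ~35 fps) are illustrative assumptions, not values from the paper.

```python
import numpy as np

class PointerKalman:
    """Constant-velocity Kalman filter for smoothing a 2D pointer position.

    State is [x, y, vx, vy]; only the (x, y) position is observed.
    q and r are illustrative process/measurement noise levels.
    """

    def __init__(self, q=1e-2, r=1.0):
        dt = 1.0 / 35.0  # assumed frame interval at ~35 fps
        self.F = np.array([[1, 0, dt, 0],   # state transition
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],    # observe position only
                           [0, 1, 0, 0]], dtype=float)
        self.Q = q * np.eye(4)              # process noise covariance
        self.R = r * np.eye(2)              # measurement noise covariance
        self.x = np.zeros(4)                # state estimate
        self.P = np.eye(4)                  # estimate covariance

    def update(self, z):
        # Predict the next state from the motion model.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Correct with the detected hand position z = (x, y).
        y = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]                   # smoothed pointer position
```

Feeding each frame's (noisy) detected hand coordinate through `update` yields a filtered position whose jitter is damped by the motion model, which is what makes the pointer motion smoother.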