In this work, we present a human-computer interface (HCI) based on a real-time hand gesture recognition system. The system consists of six stages: (1) hand detection, (2) gesture segmentation, (3) gesture classification using five pre-trained convolutional neural network (CNN) models and a vision transformer (ViT), (4) construction of an interactive human-machine interface (HMI), (5) development of a gesture-controlled virtual mouse, and (6) use of a Kalman filter to estimate the hand position, which smooths the motion of the pointer. Five pre-trained CNN models (VGG16, VGG19, ResNet50, ResNet101, and Inception-V1) and a ViT are employed to classify hand gesture images. Two multi-class datasets (one public and one custom) are used to validate the models. Inception-V1 shows significantly better classification performance than the other four CNN models and the ViT in terms of accuracy, precision, recall, and F-score. We have also extended the system to control desktop applications (such as VLC player, an audio player, file management, and the 2D Super-Mario-Bros game) with customized gesture commands in real-time scenarios. The system runs at an average speed of 25 fps (frames per second), which meets the requirement for real-time operation. The average response time for each control command is on the order of milliseconds, which makes the system suitable for real-time use. This prototype will benefit physically disabled people interacting with desktop computers.
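To illustrate stage (6), the following is a minimal sketch of how a Kalman filter can smooth a noisy pointer position. It is not the implementation used in this work: the constant-velocity motion model, the independent treatment of the x and y axes, the class names `AxisKalman` and `PointerSmoother`, and the noise parameters `accel_std` and `meas_std` are all assumptions chosen for illustration. Only the 25 fps frame rate is taken from the text.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one pointer axis.

    State is [position, velocity]; only the position is measured.
    All noise parameters are illustrative assumptions, not values from the paper.
    """

    def __init__(self, dt=1 / 25, accel_std=500.0, meas_std=5.0):
        self.dt = dt                 # frame period, assuming 25 fps
        self.q = accel_std ** 2      # assumed acceleration-noise variance
        self.r = meas_std ** 2       # assumed measurement-noise variance (px^2)
        self.pos = None              # initialised from the first measurement
        self.vel = 0.0
        # State covariance P = [[p00, p01], [p10, p11]], started large.
        self.p00, self.p01, self.p10, self.p11 = 1e3, 0.0, 0.0, 1e3

    def step(self, z):
        """Feed one measured coordinate, return the filtered coordinate."""
        if self.pos is None:
            self.pos = float(z)
            return self.pos
        dt, q = self.dt, self.q

        # Predict: x' = F x, P' = F P F^T + Q, with F = [[1, dt], [0, 1]].
        pos = self.pos + dt * self.vel
        vel = self.vel
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + q * dt ** 4 / 4
        p01 = self.p01 + dt * self.p11 + q * dt ** 3 / 2
        p10 = self.p10 + dt * self.p11 + q * dt ** 3 / 2
        p11 = self.p11 + q * dt ** 2

        # Update with measurement z, using H = [1, 0].
        s = p00 + self.r             # innovation variance
        k0, k1 = p00 / s, p10 / s    # Kalman gain
        innov = z - pos
        self.pos = pos + k0 * innov
        self.vel = vel + k1 * innov
        self.p00 = (1 - k0) * p00
        self.p01 = (1 - k0) * p01
        self.p10 = p10 - k1 * p00
        self.p11 = p11 - k1 * p01
        return self.pos


class PointerSmoother:
    """Smooths a 2D pointer by filtering x and y independently."""

    def __init__(self, **kwargs):
        self.fx = AxisKalman(**kwargs)
        self.fy = AxisKalman(**kwargs)

    def update(self, x, y):
        return self.fx.step(x), self.fy.step(y)


if __name__ == "__main__":
    smoother = PointerSmoother()
    # Hypothetical detections: the hand jitters by +/-5 px around x = 100.
    for i in range(10):
        raw_x = 100 + (5 if i % 2 == 0 else -5)
        print(raw_x, smoother.update(raw_x, 200.0))
```

In such a setup, each detected hand position would be passed through `update` before moving the cursor. A larger `meas_std` relative to `accel_std` gives a smoother but more sluggish pointer, so both values would need tuning against the actual detector noise.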