Air-writing refers to virtually writing linguistic characters through hand gestures in three-dimensional space with six degrees of freedom. This paper proposes a generic video camera-aided convolutional neural network (CNN) based air-writing framework. Gestures are performed using a marker of fixed color in front of a generic video camera, followed by color-based segmentation to identify the marker and track the trajectory of its tip. A pre-trained CNN is then used to classify the gesture, and recognition accuracy is further improved through transfer learning on the newly acquired data. Because of the color-based segmentation, the performance of the system varies significantly with illumination conditions. Under relatively stable illumination, the system is able to recognize isolated unistroke numerals of multiple languages. The proposed framework achieves recognition rates of 97.7%, 95.4%, and 93.7% in person-independent evaluations on English, Bengali, and Devanagari numerals, respectively.
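To illustrate the color-based segmentation and trajectory-tracking step described above, the following is a minimal sketch assuming OpenCV in Python. The HSV thresholds, the minimum blob area, and the choice of the topmost contour point as the marker tip are illustrative assumptions, not the paper's exact procedure; the classification stage (pre-trained CNN with transfer learning) is omitted here.

```python
import cv2
import numpy as np

# Hypothetical HSV range for a fixed-color marker (e.g. a blue cap); tune for the actual marker.
LOWER_HSV = np.array([100, 150, 50])
UPPER_HSV = np.array([130, 255, 255])

def track_marker_tip(frame, trajectory):
    """Segment the marker by color and append the estimated tip position to the trajectory."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)
    # Remove small speckles before contour extraction.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        largest = max(contours, key=cv2.contourArea)
        if cv2.contourArea(largest) > 100:  # ignore tiny noise blobs (assumed threshold)
            # Take the topmost point of the largest blob as the marker tip.
            tip = tuple(largest[largest[:, :, 1].argmin()][0])
            trajectory.append(tip)
    return trajectory

if __name__ == "__main__":
    cap = cv2.VideoCapture(0)  # generic video camera
    trajectory = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        trajectory = track_marker_tip(frame, trajectory)
        # Draw the accumulated trajectory; this image would later be fed to the classifier.
        for i in range(1, len(trajectory)):
            cv2.line(frame, trajectory[i - 1], trajectory[i], (0, 255, 0), 2)
        cv2.imshow("air-writing", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```

In such a pipeline, the rendered trajectory image (or the normalized point sequence) would then be passed to the pre-trained CNN for numeral classification, with the model fine-tuned on newly acquired air-written samples.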