The Human-Machine Interaction (HMI) research field is an important topic in machine learning that has been deeply investigated thanks to the rise of computing power in recent years. For the first time, it is possible to use machine learning to classify images and/or videos instead of traditional computer vision algorithms. The aim of this project is to build a symbiosis between a convolutional neural network (CNN) [1] and a recurrent neural network (RNN) [2] to recognize cultural/anthropological Italian sign language gestures from videos. The CNN extracts important features that are later used by the RNN. With RNNs we are able to store temporal information inside the model, providing contextual information from previous frames to enhance prediction accuracy. Our novel approach uses different data augmentation techniques and regularization methods on RGB frames only, to avoid overfitting and achieve a small generalization error.
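To illustrate the CNN-to-RNN pattern described above, the following is a minimal PyTorch sketch, not the authors' exact architecture: a small CNN backbone produces per-frame features from RGB clips, and an LSTM aggregates them over time before a classification head. The class name `CNNRNNClassifier`, the layer sizes, and the choice of an LSTM are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CNNRNNClassifier(nn.Module):
    """Hypothetical sketch: frame-level CNN features fed to an RNN for gesture classification."""
    def __init__(self, num_classes, feat_dim=512, hidden_dim=256):
        super().__init__()
        # Small CNN backbone applied independently to each RGB frame (assumed sizes).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # LSTM aggregates the per-frame features over time (temporal context).
        self.rnn = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, clips):
        # clips: (batch, time, 3, H, W) tensor of RGB frames
        b, t, c, h, w = clips.shape
        feats = self.cnn(clips.view(b * t, c, h, w)).view(b, t, -1)
        _, (h_n, _) = self.rnn(feats)
        return self.head(h_n[-1])  # logits over gesture classes

# Usage example: a batch of 2 clips, 16 frames each, 112x112 RGB.
model = CNNRNNClassifier(num_classes=20)
logits = model(torch.randn(2, 16, 3, 112, 112))
```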