Language has always been one of humanity's defining characteristics. Visual Language Identification (VLI) is a relatively new, complex, and largely understudied field of research. In this paper, we present a preliminary study in which linguistic information is used as a soft biometric trait to enhance the performance of a visual (audio-free) identification system based on lip movement. We report a significant improvement in the identification performance of the proposed visual system when these data are integrated using a score-based fusion strategy. Both deep learning and machine learning methods are considered and evaluated. For experimental purposes, we created a dataset called laBial Articulation for the proBlem of the spokEn Language rEcognition (BABELE), which covers eight different languages. The dataset provides a collection of features, of which the spoken language is the most relevant; each sample is also manually labelled with the gender and age of the subject.
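To make the idea of score-based fusion concrete, the following is a minimal sketch rather than the method used in the paper. The abstract does not state the fusion rule, so the weighted sum, the min-max normalisation, the `weight` value, and all names (`fuse_scores`, `id_scores`, `lang_probs`, `enrolled_langs`) are illustrative assumptions, as are the toy numbers.

```python
import numpy as np


def min_max(scores):
    """Rescale raw matcher scores to [0, 1] so they are comparable."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


def fuse_scores(id_scores, lang_probs, enrolled_langs, weight=0.7):
    """Weighted-sum fusion (an assumed rule, not taken from the paper).

    id_scores      : one lip-based matching score per enrolled identity
    lang_probs     : predicted probability of each language for the probe
    enrolled_langs : language index known for each enrolled identity
    weight         : hypothetical weight given to the primary matcher
    """
    id_norm = min_max(id_scores)
    lang_probs = np.asarray(lang_probs, dtype=float)
    # Soft-biometric score: how likely the probe speaks each identity's language.
    soft = lang_probs[np.asarray(enrolled_langs)]
    return weight * id_norm + (1.0 - weight) * soft


# Toy example: three enrolled identities, two languages.
id_scores = [2.0, 2.1, 0.5]   # identity 1 narrowly wins on lip movement alone
lang_probs = [0.9, 0.1]       # the probe most likely speaks language 0
enrolled_langs = [0, 1, 0]    # identity 1 is enrolled as a language-1 speaker

fused = fuse_scores(id_scores, lang_probs, enrolled_langs)
best = int(np.argmax(fused))
```

In this toy case the lip-based scores alone would pick identity 1, while the fused scores favour identity 0, whose enrolled language agrees with the predicted one. This illustrates how a soft biometric trait can break near-ties in the primary matcher.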