Facial expressions are the most universal forms of body language and automatic facial expression recognition is one of the challenging tasks due to different uncertainties. However, it has been an active field of research for many years. Nevertheless, efficiency and performance are yet essential aspects for building robust systems. We proposed two models, EmoXNet which is an ensemble learning technique for learning convoluted facial representations, and EmoXNetLite which is a distillation technique that is useful for transferring the knowledge from our ensemble model to an efficient deep neural network using label-smoothen soft labels for able to effectively detect expressions in real-time. Both of the techniques performed quite well, where the ensemble model (EmoXNet) helped to achieve 85.07% test accuracy on FER2013 with FER+ annotations and 86.25% test accuracy on RAF-DB. Moreover, the distilled model (EmoXNetLite) showed 82.07% test accuracy on FER2013 with FER+ annotations and 81.78% test accuracy on RAF-DB.
翻译:显性表达式是身体语言最普遍的形式,而面部自动面部表达识别是不同不确定因素造成的一项具有挑战性的任务。然而,多年来,它一直是一项积极的研究领域。然而,效率和性能仍然是建设强健系统必不可少的方面。我们提出了两种模型:EmoXNet,这是学习复杂面部表情的混合学习技术;EmoXNetLite,是一种蒸馏技术,有助于将我们混合模型的知识传输到高效的深神经网络,使用标签-smoothen软标签,能够实时有效检测表情。这两种技术都表现得很好,在其中,联合模型(EmoXNet)帮助实现了FER201385.07%的测试精度,并配有FER+说明和RAF-DB的86.25%测试精度。此外,蒸馏模型(EmoXNetLite)用FER+图解和RAF-DDB的81.78%测试精度,显示FER2013试验精度为82.07%。