Facial expressions are the most common universal forms of body language. In the past few years, automatic facial expression recognition (FER) has been an active field of research. However, it is still a challenging task due to different uncertainties and complications. Nevertheless, efficiency and performance are yet essential aspects for building robust systems. We proposed two models, EmoXNet which is an ensemble learning technique for learning convoluted facial representations, and EmoXNetLite which is a distillation technique that is useful for transferring the knowledge from our ensemble model to an efficient deep neural network using label-smoothen soft labels for able to effectively detect expressions in real-time. Both of the techniques performed quite well, where the ensemble model (EmoXNet) helped to achieve 85.07% test accuracy on FER2013 with FER+ annotations and 86.25% test accuracy on RAF-DB. Moreover, the distilled model (EmoXNetLite) showed 82.07% test accuracy on FER2013 with FER+ annotations and 81.78% test accuracy on RAF-DB. Results show that our models seem to generalize well on new data and are learned to focus on relevant facial representations for expressions recognition.
翻译:显性表达方式是身体语言最常见的通用形式。 在过去几年中, 自动面部表达识别( FER) 是一个活跃的研究领域。 然而, 但由于各种不确定性和复杂因素, 自动面部表达识别( FER) 仍是一项艰巨的任务。 然而, 效率和性能仍然是建设强大系统的基本方面。 我们提出了两种模型: EmoXNet, 这是一种混合学习技术, 用于学习混杂面部表征; EmoXNetLite, 是一种蒸馏技术, 是一种蒸馏技术, 有助于将我们混合模型的知识从我们混合模型中传输到高效的深神经网络, 使用标签- Smoothen软标签, 能够实时有效检测表达。 这两种技术都表现得很好, 共同模型( EmoXNet) 帮助实现了FER+ 的85. 07%测试精度, 以及 RAF-D的86. 25% 测试精度。 此外, 蒸馏模型( EmoXNetLite) 显示, FER+ 的测试精度为82. 07% 测试精度, 在RAF- DB 上测试精确度。 结果显示, 我们的模型似乎显示, 似乎似乎似乎似乎注重于相关数据。