Facial expressions play a fundamental role in human communication. Indeed, they typically reveal the real emotional status of people beyond the spoken language. Moreover, the comprehension of human affect based on visual patterns is a key ingredient for any human-machine interaction system and, for such reasons, the task of Facial Expression Recognition (FER) draws both scientific and industrial interest. In recent years, Deep Learning techniques have reached very high performance on FER by exploiting different architectures and learning paradigms. In this context, we propose a multi-resolution approach to solve the FER task. We ground our intuition on the observation that face images are often acquired at different resolutions. Thus, directly taking this property into account while training a model can help achieve higher performance in recognizing facial expressions. To this aim, we use a ResNet-like architecture, equipped with Squeeze-and-Excitation blocks, trained on the Affect-in-the-Wild 2 dataset. Since no test set is available, we conduct tests and model selection on the validation set only, on which we achieve more than 90\% accuracy in classifying the seven expressions that the dataset comprises.
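To make the architectural ingredients mentioned above concrete, the sketch below shows a toy PyTorch implementation of a residual block augmented with a Squeeze-and-Excitation module, followed by a 7-way expression classifier. It is only an illustrative assumption of how such a model could be assembled: channel widths, depth, and the multi-resolution training pipeline are simplified and do not reproduce the exact network used in this work.

\begin{verbatim}
# Minimal sketch (assumed layer sizes, not the paper's exact configuration).
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global pooling, bottleneck MLP, sigmoid gates."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: B x C x 1 x 1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # re-scale feature maps channel-wise


class SEResidualBlock(nn.Module):
    """A basic ResNet-style block followed by SE channel re-weighting."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.se = SEBlock(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x + self.se(self.body(x)))


# 7 output classes, matching the seven expressions annotated in Aff-Wild2.
model = nn.Sequential(
    nn.Conv2d(3, 64, 7, stride=2, padding=3),
    SEResidualBlock(64),
    SEResidualBlock(64),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 7),
)

# Faces cropped at different resolutions can be resized to a common input size
# before being fed to the network (one way to expose it to multi-resolution data).
logits = model(torch.randn(2, 3, 112, 112))
print(logits.shape)  # torch.Size([2, 7])
\end{verbatim}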