Emotions play a central role in the social life of every human being, and their study, which represents a multidisciplinary subject, embraces a great variety of research fields. Especially concerning the latter, the analysis of facial expressions represents a very active research area due to its relevance to human-computer interaction applications. In such a context, Facial Expression Recognition (FER) is the task of recognizing expressions on human faces. Typically, face images are acquired by cameras that have, by nature, different characteristics, such as the output resolution. It has been already shown in the literature that Deep Learning models applied to face recognition experience a degradation in their performance when tested against multi-resolution scenarios. Since the FER task involves analyzing face images that can be acquired with heterogeneous sources, thus involving images with different quality, it is plausible to expect that resolution plays an important role in such a case too. Stemming from such a hypothesis, we prove the benefits of multi-resolution training for models tasked with recognizing facial expressions. Hence, we propose a two-step learning procedure, named MAFER, to train DCNNs to empower them to generate robust predictions across a wide range of resolutions. A relevant feature of MAFER is that it is task-agnostic, i.e., it can be used complementarily to other objective-related techniques. To assess the effectiveness of the proposed approach, we performed an extensive experimental campaign on publicly available datasets: \fer{}, \raf{}, and \oulu{}. For a multi-resolution context, we observe that with our approach, learning models improve upon the current SotA while reporting comparable results in fix-resolution contexts. Finally, we analyze the performance of our models and observe the higher discrimination power of deep features generated from them.
翻译:情感在每个人的社会生活中发挥着核心作用,他们的研究是一个多学科主题,它包含许多不同的研究领域。特别是对于后者,面部表达的分析是一个非常积极的研究领域,因为它与人体计算机互动应用有关。在这样的背景下,面部表现识别(FER)是承认人脸表情的任务。通常,脸部图像是由摄影机获得的,这些摄影机具有不同的性质特征,例如产出解析。文献已经显示,深层学习模型用于面对认知,在对照多分辨率假设进行测试时,其性能会退化。由于FER的任务涉及分析可以通过多种来源获得的面部图像,从而涉及不同质量的图像,因此有理由期待决议在这样的情况下也起到重要作用。根据这种假设,我们证明多分辨率培训负责识别面部表现的模型的好处。因此,我们提议了一种双步学习程序,名为MAFER,在DCNS上进行广泛的培训,以赋予它们能力,在各种分辨率情景下产生更强的预测。我们用不同质量的图像分析,最后使用MAGER的功能是公开评估我们当前运动的一个相关的结果。