Human emotions are conveyed through both basic and compound facial expressions. However, current research on facial expression recognition (FER) focuses mainly on basic expressions and therefore fails to address the diversity of human emotions in practical scenarios. Meanwhile, existing work on compound FER relies heavily on abundant labeled compound expression training data, which are often laboriously collected under the guidance of psychology professionals. In this paper, we study compound FER in the cross-domain few-shot learning setting, where only a few images of novel classes from the target domain are required as a reference. In particular, we aim to identify unseen compound expressions with a model trained on easily accessible basic expression datasets. To alleviate the problem of limited base classes in our FER task, we propose a novel Emotion Guided Similarity Network (EGS-Net), consisting of an emotion branch and a similarity branch, based on a two-stage learning framework. Specifically, in the first stage, the similarity branch is jointly trained with the emotion branch in a multi-task fashion. The emotion branch acts as a regularizer that prevents the similarity branch from overfitting to the sampled base classes, which overlap heavily across different episodes. In the second stage, the emotion branch and the similarity branch play a "two-student game" to alternately learn from each other, thereby further improving the inference ability of the similarity branch on unseen compound expressions. Experimental results on both in-the-lab and in-the-wild compound expression datasets demonstrate the superiority of our proposed method over several state-of-the-art methods.
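The overlap problem and the stage-one multi-task objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: the six-class label set, the 5-way episode size, the use of cross-entropy for both branches, and the `weight` hyperparameter are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math
import random

# Hypothetical label set: the number and names of base classes are assumptions.
BASIC_CLASSES = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def sample_episode_classes(classes, n_way, rng):
    """Pick the classes for one N-way episode, without replacement."""
    return set(rng.sample(classes, n_way))


def min_shared_classes(num_classes, n_way):
    """Lower bound on the classes shared by any two N-way episodes (pigeonhole)."""
    return max(0, 2 * n_way - num_classes)


def cross_entropy(logits, target):
    """Cross-entropy of one example from raw logits, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def joint_loss(similarity_logits, episode_target,
               emotion_logits, emotion_target, weight=1.0):
    """Multi-task loss sketch: episodic term plus a weighted emotion term.

    `weight` is an assumed hyperparameter, not a value from the paper.
    """
    episodic = cross_entropy(similarity_logits, episode_target)
    emotion = cross_entropy(emotion_logits, emotion_target)
    return episodic + weight * emotion


rng = random.Random(0)
ep_a = sample_episode_classes(BASIC_CLASSES, 5, rng)
ep_b = sample_episode_classes(BASIC_CLASSES, 5, rng)
shared = len(ep_a & ep_b)  # always at least min_shared_classes(6, 5)
```

With only six base classes, any two 5-way episodes must share at least four classes, which is why an episodic learner trained on basic expressions alone sees nearly the same task repeatedly. The added emotion term gives the shared features a second objective that does not depend on how the episode was sampled.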