Facial expression recognition is a challenging computer vision task. The main difficulties are class imbalance, which arises from data collection, and uncertainty, which arises from inherent noise such as ambiguous facial expressions and inconsistent labels. Existing research, however, has addressed either class imbalance or uncertainty, and has overlooked how to handle the two problems together. In this paper, we therefore propose a framework based on ResNet and attention that addresses both. We assign a weight to each class. Through this penalty mechanism, the model pays more attention to classes with few samples during training, and the resulting drop in model accuracy is offset by a Convolutional Block Attention Module (CBAM). Meanwhile, the backbone network also learns an uncertain feature for each sample. By mixing these uncertain features between samples, the model better learns the features that are useful for classification, thereby suppressing uncertainty. Experiments show that our method outperforms most baseline methods in accuracy on facial expression datasets (e.g., AffectNet, RAF-DB) and also handles class imbalance well.
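
The abstract does not give the weighting formula or the mixing rule, so the following is only a minimal sketch of what per-class weighting and feature mixing could look like. It assumes inverse-frequency class weights and a mixup-style convex combination of feature vectors; the function names `inverse_frequency_weights`, `weighted_cross_entropy`, and `mix_features` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Assumed scheme: weight = total / (num_classes * class_count).

    Rare classes receive larger weights, so their errors are penalized more.
    Assumes every class appears at least once in `labels`.
    """
    counts = Counter(labels)
    total = len(labels)
    return [total / (num_classes * counts[c]) for c in range(num_classes)]


def weighted_cross_entropy(logits, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_prob = logits[label] - log_sum_exp
    return -weights[label] * log_prob


def mix_features(feat_a, feat_b, lam):
    """Assumed mixing rule: convex combination of two feature vectors."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(feat_a, feat_b)]


# Toy imbalanced label set: class 0 is common, class 2 is rare.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 2]
weights = inverse_frequency_weights(labels, num_classes=3)

# With identical logits, a mistake on the rare class costs more.
loss_common = weighted_cross_entropy([0.0, 0.0, 0.0], 0, weights)
loss_rare = weighted_cross_entropy([0.0, 0.0, 0.0], 2, weights)

# Blend the features of two samples with a mixing coefficient of 0.25.
mixed = mix_features([1.0, 0.0], [0.0, 1.0], lam=0.25)
```

With these assumed weights, the rare class contributes six times the loss of the common class for the same prediction, which is the kind of penalty the abstract describes. In a deep learning framework the same effect is usually obtained by passing a per-class weight vector to the built-in cross-entropy loss.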