This paper describes our proposed methodology for the six basic expression classification track of the Affective Behavior Analysis in-the-wild (ABAW) Competition 2022. In the Learning from Synthetic Data (LSD) task, facial expression recognition (FER) methods aim to learn expression representations from artificially generated data and generalise to real data. Because the synthetic data are ambiguous while facial Action Units (AUs) are objective, we resort to AU information to boost performance, and make the following contributions. First, to adapt the model to synthetic scenarios, we use knowledge from a model pre-trained on large-scale face recognition data. Second, we propose a conceptually new framework, termed AU-Supervised Convolutional Vision Transformers (AU-CVT), which clearly improves FER performance by jointly training on auxiliary datasets with AU or pseudo AU labels. Our AU-CVT achieved an F1 score of $0.6863$ and an accuracy of $0.7433$ on the validation set. The source code of our work is publicly available online: https://github.com/msy1412/ABAW4
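To make the idea of joint training with AU supervision concrete, the following is a minimal sketch of one possible combined objective: a cross-entropy term over the six expression classes plus a weighted binary cross-entropy term over AU outputs. It is an illustration under stated assumptions, not the AU-CVT implementation; the function names, the weight `au_weight`, and the number of AUs are hypothetical and do not come from the paper.

```python
import math

def softmax_cross_entropy(logits, target):
    """Cross-entropy for one sample, given raw class logits and the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def binary_cross_entropy(logits, targets):
    """Mean sigmoid cross-entropy over independent AU logits.

    Targets may be hard 0/1 labels or soft pseudo labels in [0, 1].
    """
    total = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable form of -[t*log(s) + (1-t)*log(1-s)], s = sigmoid(z)
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def joint_loss(expr_logits, expr_label, au_logits, au_labels, au_weight=1.0):
    """Expression loss plus a weighted auxiliary AU loss (au_weight is an assumed hyperparameter)."""
    expr_term = softmax_cross_entropy(expr_logits, expr_label)
    au_term = binary_cross_entropy(au_logits, au_labels)
    return expr_term + au_weight * au_term

# Example: six expression classes and three hypothetical AUs, all logits zero.
loss = joint_loss([0.0] * 6, 2, [0.0, 0.0, 0.0], [1, 0, 1])
```

Setting `au_weight` to zero recovers plain expression classification, which is one simple way to check how much the auxiliary AU term contributes.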