Facial action unit (AU) detection is challenging because manual annotations are scarce. Recent work has applied self-supervised learning to this problem, aiming to learn meaningful AU representations from large amounts of unlabeled data. However, most existing self-supervised AU detection methods use only global facial features and do not fully explore AU-related properties such as locality and relevance. In this paper, we propose a novel self-supervised framework for AU detection with region and relation learning. In particular, an AU-related attention map guides the model to focus on AU-specific regions, enhancing the integrity of local AU features. Meanwhile, an improved Optimal Transport (OT) algorithm is introduced to exploit the correlations among AUs. In addition, a Swin Transformer is used to model the long-distance dependencies within each AU region during feature learning. Evaluation results on BP4D and DISFA demonstrate that the proposed method is comparable or even superior to state-of-the-art self-supervised learning methods and supervised AU detection methods.
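For readers unfamiliar with Optimal Transport, the following is a minimal sketch of the standard entropy-regularized Sinkhorn iteration, which is one common way to compute a transport plan. It is not the improved OT algorithm proposed here, whose details the abstract does not give. The function name `sinkhorn`, the parameters `eps` and `n_iters`, the uniform marginals, and the 3x3 cost matrix are all illustrative assumptions.

```python
import math


def sinkhorn(cost, row_mass, col_mass, eps=0.1, n_iters=200):
    """Entropy-regularized OT via standard Sinkhorn iterations (illustrative only).

    cost: n x m list of lists of non-negative costs.
    row_mass, col_mass: marginal distributions, each summing to 1.
    Returns the n x m transport plan as a list of lists.
    """
    n, m = len(cost), len(cost[0])
    # Gibbs kernel K = exp(-cost / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match row_mass, then columns to match col_mass.
        u = [row_mass[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical cost matrix, e.g. 1 - similarity between three feature vectors
# and three prototypes; the values are made up for illustration.
cost = [
    [0.0, 0.8, 0.9],
    [0.8, 0.0, 0.7],
    [0.9, 0.7, 0.0],
]
uniform = [1.0 / 3.0] * 3
plan = sinkhorn(cost, uniform, uniform)
```

In this sketch each row and each column of `plan` sums to approximately 1/3, and most of the mass sits on the low-cost diagonal entries. A smaller `eps` pushes the plan closer to a hard assignment at the cost of slower, less numerically stable convergence.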