Facial micro-expressions (MEs) are involuntary facial motions that reveal people's real feelings, and they play an important role in early intervention for mental illness, national security, and many human-computer interaction systems. However, existing micro-expression datasets are limited in size and usually pose challenges for training good classifiers. To model subtle facial muscle motions, we propose a robust micro-expression recognition (MER) framework, namely the muscle motion-guided network (MMNet). Specifically, a continuous attention (CA) block is introduced to focus on modeling local subtle muscle motion patterns with little identity information, which differs from most previous methods that directly extract features from complete video frames carrying much identity information. In addition, we design a position calibration (PC) module based on the vision transformer. By adding the facial position embeddings generated by the PC module at the end of the two branches, the PC module helps inject position information into the facial muscle motion pattern features for MER. Extensive experiments on three public micro-expression datasets demonstrate that our approach outperforms state-of-the-art methods by a large margin.
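To make the two-branch design concrete, the following is a minimal PyTorch sketch of the architecture as described in the abstract. It is an illustration only, not the authors' implementation: the class names (`ContinuousAttention`, `PositionCalibration`, `MMNetSketch`), the layer sizes, the 28×28 input resolution, and the onset/apex frame-difference input are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class ContinuousAttention(nn.Module):
    """Hypothetical CA block: spatially re-weights frame-difference features
    to emphasize local muscle-motion patterns (sketch, not the paper's code)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # 1x1 conv + sigmoid produces a single-channel spatial attention map
        self.gate = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, motion_feat: torch.Tensor) -> torch.Tensor:
        attn = self.gate(motion_feat)          # (B, 1, H, W) attention map
        return self.conv(motion_feat) * attn   # re-weighted motion features


class PositionCalibration(nn.Module):
    """Hypothetical PC module: a tiny ViT-style encoder over the onset frame
    that yields facial position embeddings (assumed patch size and depth)."""

    def __init__(self, channels: int, img_size: int = 28, patch: int = 7):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, channels, kernel_size=patch, stride=patch)
        num_patches = (img_size // patch) ** 2
        self.pos = nn.Parameter(torch.zeros(1, num_patches, channels))
        self.encoder = nn.TransformerEncoderLayer(
            d_model=channels, nhead=4, batch_first=True
        )

    def forward(self, onset: torch.Tensor) -> torch.Tensor:
        x = self.patch_embed(onset)            # (B, C, 4, 4) for 28x28 input
        x = x.flatten(2).transpose(1, 2)       # (B, 16, C) patch tokens
        x = self.encoder(x + self.pos)         # add learnable position terms
        b, n, c = x.shape
        s = int(n ** 0.5)
        return x.transpose(1, 2).reshape(b, c, s, s)  # back to a feature map


class MMNetSketch(nn.Module):
    """Two-branch sketch: a motion branch over the onset-apex difference
    (through CA) fused with PC position embeddings before classification."""

    def __init__(self, channels: int = 32, num_classes: int = 3):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        self.ca = ContinuousAttention(channels)
        self.pc = PositionCalibration(channels)
        self.pool = nn.AdaptiveAvgPool2d((4, 4))
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(channels * 16, num_classes))

    def forward(self, onset: torch.Tensor, apex: torch.Tensor) -> torch.Tensor:
        motion = apex - onset                  # frame difference: little identity info
        feat = self.pool(self.ca(self.stem(motion)))
        feat = feat + self.pc(onset)           # add position embeddings at branch end
        return self.head(feat)


if __name__ == "__main__":
    model = MMNetSketch()
    onset = torch.randn(2, 3, 28, 28)
    apex = torch.randn(2, 3, 28, 28)
    print(model(onset, apex).shape)            # logits, one row per sample
```

Subtracting the onset frame from the apex frame is one simple way to suppress identity cues while keeping motion, which matches the abstract's motivation for attending to muscle motion rather than full frames.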