Facial expression recognition (FER) plays a significant role in many computer vision applications. We revisit this problem from a new perspective: whether useful representations that improve FER performance can be acquired during the image generation process. To this end, we propose a novel generative method for the FER task based on the image inversion mechanism, termed Inversion FER (IFER). In particular, we devise a novel Adversarial Style Inversion Transformer (ASIT) for IFER to comprehensively extract features of generated facial images. ASIT is also equipped with an image inversion discriminator that measures the cosine similarity of semantic features between source and generated images, constrained by a distribution alignment loss. Finally, we introduce a feature modulation module that fuses the structural code and latent codes from ASIT for the subsequent FER task. We extensively evaluate ASIT on facial datasets such as FFHQ and CelebA-HQ, showing that our approach achieves state-of-the-art facial inversion performance. IFER also achieves competitive results on facial expression recognition datasets such as RAF-DB, SFEW, and AffectNet. The code and models are available at https://github.com/Talented-Q/IFER-master.
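As a minimal illustration of the similarity measure named above, the following sketch computes the cosine similarity between two feature vectors. It is not the ASIT discriminator: the abstract does not say how the semantic features are extracted or how the distribution alignment loss is formed, so the function name `cosine_similarity` and the example vectors `source_feat` and `generated_feat` are hypothetical placeholders.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical semantic features of a source image and its inversion
# (made-up numbers, for illustration only).
source_feat = [0.2, 0.7, 0.1]
generated_feat = [0.25, 0.65, 0.05]

sim = cosine_similarity(source_feat, generated_feat)
print(f"cosine similarity: {sim:.3f}")
```

A value close to 1 indicates that the two feature vectors point in nearly the same direction, which is the sense in which a generated image would be semantically close to its source.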