In this paper, we first investigate the impact of ImageNet pre-training on fine-grained Facial Emotion Recognition (FER) and show that, when sufficient image augmentations are applied, training from scratch yields better results than fine-tuning an ImageNet pre-trained model. Next, we propose a method to improve fine-grained and in-the-wild FER, called Hybrid Multi-Task Learning (HMTL). HMTL uses Self-Supervised Learning (SSL) as an auxiliary task during classical Supervised Learning (SL), in the form of Multi-Task Learning (MTL). Leveraging SSL during training extracts additional information from the images that benefits the primary fine-grained SL task. We investigate how the proposed HMTL can be used in the FER domain by designing two customized versions of common pretext task techniques, puzzling and in-painting. We achieve state-of-the-art results on the AffectNet benchmark with both types of HMTL, without pre-training on additional data. Experimental results comparing common SSL pre-training with the proposed HMTL demonstrate the difference and superiority of our approach. HMTL, however, is not limited to the FER domain. Experiments on two other fine-grained facial tasks, head pose estimation and gender recognition, reveal the potential of HMTL for improving fine-grained facial representation.
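To make the HMTL idea concrete, the sketch below shows one way an SSL pretext task can be attached as an auxiliary head alongside a supervised emotion classifier and trained jointly with a weighted sum of the two losses. This is a minimal illustration, not the authors' exact architecture: the toy backbone, the head names, the permutation-prediction pretext task, and the weighting factor `aux_weight` are all assumptions made for demonstration.

```python
# Minimal sketch of hybrid multi-task training: a shared backbone feeds a
# supervised emotion head (primary task) and a self-supervised head that
# predicts which fixed jigsaw-style permutation was applied (auxiliary task).
# Backbone, head names, and aux_weight are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HMTLNet(nn.Module):
    def __init__(self, num_emotions=8, num_permutations=24):
        super().__init__()
        # Shared feature extractor (toy backbone for illustration only).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.emotion_head = nn.Linear(64, num_emotions)    # primary SL task
        self.ssl_head = nn.Linear(64, num_permutations)    # auxiliary SSL task

    def forward(self, x):
        feats = self.backbone(x)
        return self.emotion_head(feats), self.ssl_head(feats)

def hmtl_loss(model, images, shuffled_images, emotion_labels, perm_labels,
              aux_weight=0.5):
    """Joint loss: supervised loss on the original images plus an auxiliary
    self-supervised loss on their permuted counterparts."""
    emotion_logits, _ = model(images)
    _, perm_logits = model(shuffled_images)
    sl_loss = F.cross_entropy(emotion_logits, emotion_labels)
    ssl_loss = F.cross_entropy(perm_logits, perm_labels)
    return sl_loss + aux_weight * ssl_loss
```

The key distinction from SSL pre-training is that both heads are optimized simultaneously throughout training, so the auxiliary signal shapes the shared representation while the primary FER objective is being learned rather than before it.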