The high cost of head pose labeling is the main obstacle to improving fine-grained Head Pose Estimation (HPE). Although Self-Supervised Learning (SSL) can compensate for the lack of large amounts of labeled data, its efficacy for fine-grained HPE has not yet been fully explored. This study assesses the use of SSL in fine-grained HPE under two scenarios: (1) using SSL to pre-train the network weights, and (2) leveraging auxiliary SSL losses alongside the HPE objective. We design a Hybrid Multi-Task Learning (HMTL) architecture based on a ResNet50 backbone in which both strategies are applied. Our experimental results reveal that the combination of the two scenarios is best for HPE: together, they reduce the average error by up to 23.1% on the AFLW2000 benchmark and 14.2% on BIWI compared to the baseline. Moreover, we find that some SSL methods are better suited to transfer learning, while others are effective when incorporated into supervised learning as auxiliary tasks. Finally, we show that the proposed HMTL architecture reduces the average error for all three types of initial weights: random, ImageNet-pre-trained, and SSL-pre-trained.
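The auxiliary-loss scenario (2) can be sketched as a weighted sum of the supervised HPE objective and a self-supervised term. This is a minimal NumPy illustration, not the paper's implementation: rotation prediction is used here only as a stand-in for an SSL task, and the `aux_weight` hyperparameter is hypothetical.

```python
import numpy as np

def hpe_loss(pred_angles, true_angles):
    # Supervised head-pose loss: mean squared error over yaw, pitch, roll.
    return float(np.mean((pred_angles - true_angles) ** 2))

def ssl_rotation_loss(logits, rot_label):
    # Illustrative auxiliary SSL loss: cross-entropy for predicting which of
    # four in-plane rotations (0/90/180/270 degrees) was applied to the input.
    z = logits - logits.max()                      # numerically stable softmax
    log_probs = z - np.log(np.exp(z).sum())
    return float(-log_probs[rot_label])

def hybrid_loss(pred_angles, true_angles, rot_logits, rot_label, aux_weight=0.1):
    # Hybrid multi-task objective: HPE term plus a weighted auxiliary SSL term.
    # aux_weight is a hypothetical trade-off hyperparameter, not from the paper.
    return hpe_loss(pred_angles, true_angles) + aux_weight * ssl_rotation_loss(
        rot_logits, rot_label
    )
```

In a full model both terms would be produced by separate heads on a shared ResNet50 backbone, so the auxiliary gradient regularizes the shared features used by the pose head.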