The cost of head viewpoint labels is the main obstacle to improving fine-grained head pose estimation algorithms. One solution to the scarcity of labels is Self-Supervised Learning (SSL), which can extract useful features from unlabeled data for a downstream task. Accordingly, this article tries to answer the question: how can SSL be used for head pose estimation? In general, there are two main ways to use SSL: (1) using it to pre-train the network weights, and (2) leveraging SSL as an auxiliary task alongside Supervised Learning (SL) in a single training session. In this study, we compared the two approaches by designing a Hybrid Multi-Task Learning (HMTL) architecture and evaluating it with two SSL pretext tasks, rotation prediction and jigsaw puzzling. The results showed that the combination of both methods, using rotation for pre-training and puzzling for the auxiliary head, performed best. Together, the error rate was reduced by up to 13% relative to the baseline, which is comparable with current SOTA methods. Finally, we compared the impact of initial weights on HMTL and SL: with HMTL, the error was reduced for all kinds of initial weights: random, ImageNet, and SSL.
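To make the two usage modes concrete, the following is a minimal sketch, not the authors' implementation, assuming a PyTorch setup with a ResNet-18 backbone. The head names, the number of jigsaw permutations P, and the auxiliary loss weight alpha are illustrative assumptions rather than the paper's exact configuration.

```python
# A hedged sketch of the two SSL usage modes described above.
import torch
import torch.nn as nn
import torchvision.models as models

backbone = models.resnet18(weights=None)
backbone.fc = nn.Identity()           # expose 512-d features

# Mode (1): SSL pre-training. Predict which of 4 rotations
# (0/90/180/270 degrees) was applied to an unlabeled face image.
rotation_head = nn.Linear(512, 4)

def rotation_pretrain_step(images, optimizer):
    # Build a 4-way rotation classification batch from unlabeled images.
    rotated = torch.cat([torch.rot90(images, k, dims=(2, 3)) for k in range(4)])
    labels = torch.arange(4).repeat_interleave(images.size(0))
    logits = rotation_head(backbone(rotated))
    loss = nn.functional.cross_entropy(logits, labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss

# Mode (2): HMTL. Supervised pose regression plus an auxiliary jigsaw
# head that classifies which of P fixed patch permutations was applied.
P = 24                                # assumed permutation-set size
pose_head = nn.Linear(512, 3)         # yaw, pitch, roll
jigsaw_head = nn.Linear(512, P)

def hmtl_step(images, pose_targets, shuffled, perm_labels, optimizer, alpha=0.1):
    feats = backbone(images)
    pose_loss = nn.functional.l1_loss(pose_head(feats), pose_targets)
    jigsaw_loss = nn.functional.cross_entropy(
        jigsaw_head(backbone(shuffled)), perm_labels)
    loss = pose_loss + alpha * jigsaw_loss  # alpha weights the auxiliary task
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss
```

In the best-performing combination reported above, the backbone weights learned from the rotation pretext task would serve as the initialization for the HMTL stage, where the jigsaw head acts as the auxiliary task beside supervised pose regression.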