We propose to personalize a human pose estimator given a set of test images of a person, without using any manual annotations. While human pose estimation has advanced significantly, it remains very challenging for a model to generalize to unknown environments and unseen persons. Instead of using a fixed model for every test case, we adapt our pose estimator at test time to exploit person-specific information. We first train our model on diverse data jointly with a supervised and a self-supervised pose estimation objective. We use a Transformer model to model the transformation between the self-supervised keypoints and the supervised keypoints. During test time, we personalize and adapt our model by fine-tuning it with the self-supervised objective. The pose estimate is then improved by transforming the updated self-supervised keypoints. We experiment on multiple datasets and show significant improvements in pose estimation with our self-supervised personalization.
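The test-time procedure described above can be illustrated with a toy sketch. This is not the paper's implementation. It rests on several assumptions: linear maps stand in for the networks, an image reconstruction loss stands in for the self-supervised objective (the abstract does not specify it), and a fixed matrix `T` stands in for the Transformer that relates self-supervised keypoints to supervised keypoints. All names (`W_enc`, `W_dec`, `T`, `personalize`, `predict_pose`) are hypothetical.

```python
import numpy as np

def self_supervised_loss(W_enc, W_dec, images):
    """Mean reconstruction error; an assumed stand-in for the self-supervised objective."""
    recon = images @ W_enc.T @ W_dec.T
    return 0.5 * np.mean(np.sum((recon - images) ** 2, axis=1))

def personalize(W_enc, W_dec, images, steps=200, lr=0.05):
    """Fine-tune only the encoder on one person's unlabeled test images."""
    W_enc = W_enc.copy()  # leave the generic (pre-trained) encoder untouched
    n = len(images)
    for _ in range(steps):
        err = images @ W_enc.T @ W_dec.T - images  # reconstruction error, shape (n, d)
        grad = (err @ W_dec).T @ images / n        # gradient of the loss w.r.t. W_enc, shape (k, d)
        W_enc -= lr * grad                         # plain gradient descent step
    return W_enc

def predict_pose(W_enc, T, images):
    """Map self-supervised keypoints to supervised keypoints with the frozen map T."""
    self_sup_keypoints = images @ W_enc.T
    return self_sup_keypoints @ T.T

# Toy data: 32 flattened "images" of one person, 4 self-supervised and 3 supervised keypoints.
rng = np.random.default_rng(0)
d, k, j = 8, 4, 3
images = rng.normal(size=(32, d))
W_enc = rng.normal(size=(k, d)) * 0.3  # pretend these came from joint training
W_dec = rng.normal(size=(d, k)) * 0.3
T = rng.normal(size=(j, k))            # frozen at test time

W_person = personalize(W_enc, W_dec, images)  # uses no pose annotations
pose = predict_pose(W_person, T, images)
```

Only the encoder is updated during personalization; `T` is reused unchanged, so any change in the predicted pose comes entirely from the updated self-supervised keypoints.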