There is growing interest in developing computer vision methods that can learn from limited supervision. In this paper, we consider the problem of learning to predict camera viewpoints from a limited number of labeled images, a setting in which obtaining ground-truth annotations is expensive and requires special equipment. We propose a semi-supervised viewpoint estimation method that learns to infer viewpoint information from unlabeled image pairs, where the two images in a pair differ by a viewpoint change. In particular, our method learns to synthesize the second image by combining the appearance of the first image with the viewpoint of the second. We demonstrate that our method significantly improves over supervised techniques, especially in the low-label regime, and outperforms state-of-the-art semi-supervised methods.
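The training signal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the squared-error terms, the `weight` hyperparameter, and the toy encoder and decoder are all assumptions made for illustration. It only shows how a supervised viewpoint term on labeled images could be combined with a synthesis term on unlabeled pairs, where the second image is rebuilt from the appearance of the first and the viewpoint of the second.

```python
def squared_error(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def semi_supervised_loss(encoder, decoder, labeled, unlabeled_pairs, weight=1.0):
    """Supervised viewpoint loss plus a weighted cross-view synthesis loss.

    encoder(image) -> (appearance, viewpoint)
    decoder(appearance, viewpoint) -> image
    All names, the squared-error terms, and `weight` are illustrative
    assumptions, not details taken from the paper.
    """
    supervised = 0.0
    for image, true_viewpoint in labeled:
        _, predicted = encoder(image)
        supervised += squared_error(predicted, true_viewpoint)

    synthesis = 0.0
    for first, second in unlabeled_pairs:
        appearance, _ = encoder(first)   # appearance from the first image
        _, viewpoint = encoder(second)   # viewpoint from the second image
        synthesis += squared_error(decoder(appearance, viewpoint), second)

    # Average each term; max(..., 1) avoids division by zero on empty inputs.
    supervised /= max(len(labeled), 1)
    synthesis /= max(len(unlabeled_pairs), 1)
    return supervised + weight * synthesis


# Toy stand-ins (hypothetical): an "image" is [appearance_0, appearance_1, viewpoint].
def toy_encoder(image):
    return image[:2], image[2:]


def toy_decoder(appearance, viewpoint):
    return list(appearance) + list(viewpoint)


labeled = [([0.2, 0.7, 30.0], [30.0])]
pairs = [([0.2, 0.7, 30.0], [0.2, 0.7, 75.0])]
loss = semi_supervised_loss(toy_encoder, toy_decoder, labeled, pairs)
```

With these toy stand-ins the encoder separates appearance and viewpoint perfectly, so `loss` is zero. In a real system both components would be learned networks, and the synthesis term is what allows unlabeled pairs to shape the viewpoint predictor.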