In this paper, we address the problem of camera pose estimation in outdoor and indoor scenarios. In contrast to the currently top-performing methods, which rely on 2D-to-3D matching, we propose a model that directly regresses the camera pose from images with significantly higher accuracy than existing methods of the same class. We first analyse why regression methods still lag behind the state of the art, and then bridge the performance gap with our new approach. Specifically, we propose a way to overcome the bias in the training data through a novel training technique that generates poses, guided by a probability distribution derived from the training set, for synthesising new training views. Lastly, we evaluate our approach on two widely used benchmarks and show that it achieves significantly improved performance compared to prior regression-based methods, retrieval techniques, and 3D pipelines with local feature matching.
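To make the idea of distribution-guided pose generation concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes poses are simplified to a position plus a single yaw angle, and it assumes the probability distribution is a Gaussian kernel density estimate over the training poses. The function name `sample_poses` and the parameters `pos_sigma` and `yaw_sigma_deg` are hypothetical and chosen only for illustration.

```python
import random


def sample_poses(train_poses, n, pos_sigma=0.25, yaw_sigma_deg=10.0, seed=0):
    """Draw n poses from a Gaussian kernel density estimate over train_poses.

    train_poses: list of (x, y, z, yaw_deg) tuples (a simplified pose,
    assumed here for illustration only).
    Sampling from a Gaussian KDE is done by picking one training pose
    uniformly at random and adding Gaussian noise to each component.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Pick the kernel centre: one training pose, chosen uniformly.
        x, y, z, yaw = rng.choice(train_poses)
        samples.append((
            rng.gauss(x, pos_sigma),
            rng.gauss(y, pos_sigma),
            rng.gauss(z, pos_sigma),
            # Wrap the perturbed yaw back into the 0..360 degree range.
            rng.gauss(yaw, yaw_sigma_deg) % 360.0,
        ))
    return samples


# Hypothetical training poses: (x, y, z, yaw in degrees).
train = [(0.0, 0.0, 1.5, 0.0), (1.0, 0.0, 1.5, 90.0), (2.0, 0.5, 1.5, 180.0)]
new_poses = sample_poses(train, n=5)
```

Each sampled pose would then be handed to a view-synthesis step to render a new training image. That step is outside the scope of this sketch.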