Absolute Pose Regression (APR) methods use deep neural networks to directly regress camera poses from RGB images. Despite their advantages in inference speed and simplicity, these methods still fall short of the accuracy achieved by geometry-based techniques. To address this issue, we propose a new model, the Neural Feature Synthesizer (NeFeS). Our approach encodes 3D geometric features during training and renders dense novel-view features at test time to refine camera poses estimated by arbitrary APR methods. Unlike previous APR works that require additional unlabeled training data, our method exploits implicit geometric constraints at test time through a robust feature field. To enhance the robustness of the NeFeS network, we introduce a feature fusion module and a progressive training strategy. Our method improves state-of-the-art single-image APR accuracy by as much as 54.9% on indoor and outdoor benchmark datasets, without time-consuming training on additional unlabeled data.