We present a parallelized optimization method based on fast Neural Radiance Fields (NeRF) for estimating the 6-DoF pose of a camera with respect to an object or scene. Given a single observed RGB image of the target, we predict the translation and rotation of the camera by minimizing the residual between pixels rendered from a fast NeRF model and the corresponding pixels in the observed image. We integrate a momentum-based camera extrinsic optimization procedure into Instant Neural Graphics Primitives, a recent, exceptionally fast NeRF implementation. By introducing parallel Monte Carlo sampling into the pose estimation task, our method escapes local minima and searches a larger pose space more efficiently. We also show that adopting a more robust pixel-based loss function is important for reducing error. Experiments demonstrate that our method achieves improved generalization and robustness on both synthetic and real-world benchmarks.
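The three ingredients named above (parallel Monte Carlo pose hypotheses, momentum-based updates, and a robust residual) can be illustrated with a minimal toy sketch. It is not the paper's implementation: a 2D rigid transform of known points stands in for the NeRF renderer, the Huber loss is one assumed choice of robust loss, gradients come from finite differences, and the keep-the-best-quarter resampling schedule is an assumption. All function names and parameters (`render`, `huber_loss`, `estimate_pose`, `n_hyp`, `sigma`, and so on) are hypothetical.

```python
import numpy as np

def render(poses, pts):
    """Toy stand-in for a NeRF renderer: apply each 2D pose (tx, ty, theta) to pts."""
    tx, ty, th = poses[:, 0:1], poses[:, 1:2], poses[:, 2:3]
    c, s = np.cos(th), np.sin(th)
    x = c * pts[:, 0] - s * pts[:, 1] + tx
    y = s * pts[:, 0] + c * pts[:, 1] + ty
    return np.stack([x, y], axis=-1)  # shape (n_hyp, n_pts, 2)

def huber_loss(poses, pts, observed, delta=0.1):
    """Robust residual (assumed choice): quadratic near zero, linear for outliers."""
    r = render(poses, pts) - observed
    a = np.abs(r)
    per = np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))
    return per.mean(axis=(1, 2))  # one loss value per hypothesis

def grad(poses, pts, observed, eps=1e-5):
    """Central finite differences, evaluated for all hypotheses at once."""
    g = np.zeros_like(poses)
    for k in range(poses.shape[1]):
        d = np.zeros(poses.shape[1])
        d[k] = eps
        g[:, k] = (huber_loss(poses + d, pts, observed)
                   - huber_loss(poses - d, pts, observed)) / (2 * eps)
    return g

def estimate_pose(pts, observed, init, n_hyp=64, rounds=4, steps=100,
                  lr=0.1, beta=0.9, sigma=(0.5, 0.5, 0.8), seed=0):
    rng = np.random.default_rng(seed)
    sigma = np.asarray(sigma, dtype=float)
    init = np.asarray(init, dtype=float)
    # Monte Carlo step: draw many pose hypotheses around the initial guess.
    poses = init + rng.normal(size=(n_hyp, 3)) * sigma
    for r in range(rounds):
        vel = np.zeros_like(poses)
        for _ in range(steps):
            # Heavy-ball momentum update applied to every hypothesis in parallel.
            vel = beta * vel - lr * grad(poses, pts, observed)
            poses = poses + vel
        order = np.argsort(huber_loss(poses, pts, observed))
        poses = poses[order]  # best hypothesis first
        if r < rounds - 1:
            # Keep the best quarter and resample the rest around the survivors
            # with a smaller spread (an assumed schedule).
            keep = poses[: n_hyp // 4]
            sigma = 0.5 * sigma
            idx = rng.integers(0, len(keep), size=n_hyp - len(keep))
            fresh = keep[idx] + rng.normal(size=(len(idx), 3)) * sigma
            poses = np.vstack([keep, fresh])
    best = poses[0].copy()
    best[2] = (best[2] + np.pi) % (2 * np.pi) - np.pi  # wrap angle to [-pi, pi)
    return best

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    pts = rng.uniform(-1, 1, size=(40, 2))
    true_pose = np.array([0.3, -0.2, 0.5])
    observed = render(true_pose[None, :], pts)[0]
    observed[:4] += rng.uniform(-3, 3, size=(4, 2))  # corrupt a few points
    print(estimate_pose(pts, observed, init=np.zeros(3)), true_pose)
```

Because the toy problem uses known point correspondences, its loss landscape is far more benign than a photometric NeRF residual; the sketch shows only the mechanics of running many hypotheses in parallel and pruning them, not the difficulty of the real task.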