Neural Radiance Fields (NeRF) have recently attracted a surge of interest within the computer vision community for their power to synthesize photorealistic novel views of real-world scenes. One limitation of NeRF, however, is that it requires accurate camera poses to learn the scene representation. In this paper, we propose Bundle-Adjusting Neural Radiance Fields (BARF) for training NeRF from imperfect (or even unknown) camera poses, i.e., the joint problem of learning neural 3D representations and registering camera frames. We establish a theoretical connection to classical image alignment and show that coarse-to-fine registration is also applicable to NeRF. Furthermore, we show that naïvely applying positional encoding in NeRF has a negative impact on registration with a synthesis-based objective. Experiments on synthetic and real-world data show that BARF can effectively optimize the neural scene representation and resolve large camera pose misalignment at the same time. This enables view synthesis and localization of video sequences from unknown camera poses, opening up new avenues for visual localization systems (e.g., SLAM) and potential applications for dense 3D mapping and reconstruction.
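To make the idea of coarse-to-fine registration with positional encoding more concrete, the following is a minimal sketch in NumPy. The abstract states only that coarse-to-fine registration applies to NeRF and that naïve positional encoding hurts registration; the specific cosine ramp used here, the function names `coarse_to_fine_weights` and `positional_encoding`, and the parameters `alpha` and `num_freqs` are illustrative assumptions rather than details taken from the text above.

```python
import numpy as np


def coarse_to_fine_weights(alpha, num_freqs):
    """Per-band weights in [0, 1] (assumed schedule: a smooth cosine ramp).

    alpha in [0, num_freqs] controls how many frequency bands are active:
    band k is off for alpha <= k, fully on for alpha >= k + 1.
    """
    k = np.arange(num_freqs)
    t = np.clip(alpha - k, 0.0, 1.0)
    return 0.5 * (1.0 - np.cos(np.pi * t))


def positional_encoding(x, num_freqs, alpha):
    """Encode x of shape (..., D) into shape (..., D + 2 * D * num_freqs).

    High-frequency bands are down-weighted while alpha is small, so early
    optimization sees a smoother signal (hypothetical helper, for illustration).
    """
    x = np.asarray(x, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi        # (L,)
    angles = x[..., None] * freqs                        # (..., D, L)
    w = coarse_to_fine_weights(alpha, num_freqs)         # (L,)
    enc = np.concatenate([np.sin(angles) * w,
                          np.cos(angles) * w], axis=-1)  # (..., D, 2L)
    enc = enc.reshape(*x.shape[:-1], -1)                 # (..., 2 * D * L)
    return np.concatenate([x, enc], axis=-1)
```

In a training loop, `alpha` would typically be increased from 0 to `num_freqs` over the course of optimization, so that camera poses are first registered against a smooth, low-frequency signal before the high-frequency bands are enabled.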