Neural field-based 3D representations have recently been adopted in many areas, including SLAM systems. Current neural SLAM and online mapping systems achieve impressive results on simple captures, but they rely on a world-centric map representation because only a single neural field model is used. Defining such a world-centric representation requires accurate and static prior information about the scene, such as its boundaries and initial camera poses. However, in real-time, on-the-fly scene capture applications, this prior knowledge cannot be assumed to be fixed or static, since it changes dynamically and is subject to significant updates based on run-time observations. In large-scale mapping in particular, significant camera pose drift is inevitable and must be corrected via loop closure. To overcome this limitation, we propose NEWTON, a view-centric mapping method that dynamically constructs neural fields based on run-time observations. In contrast to prior work, our method enables camera pose updates using loop closures and scene boundary updates by representing the scene with multiple neural fields, each defined in the local coordinate system of a selected keyframe. Experimental results demonstrate the superior performance of our method over existing world-centric neural field-based SLAM systems, in particular for large-scale scenes subject to camera pose updates.
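To illustrate why a view-centric representation tolerates pose updates, the following is a minimal sketch and not the NEWTON implementation. It assumes each local map can be stood in for by a set of surface points stored in the coordinate frame of its keyframe; `LocalField`, `make_pose`, and `apply_loop_closure` are hypothetical names introduced only for this example. The point it shows is that a loop-closure correction changes only the keyframe pose, while the locally stored content stays untouched.

```python
import numpy as np


def make_pose(yaw, translation):
    """Build a 4x4 world-from-keyframe rigid transform (rotation about z)."""
    c, s = np.cos(yaw), np.sin(yaw)
    pose = np.eye(4)
    pose[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    pose[:3, 3] = translation
    return pose


class LocalField:
    """Hypothetical stand-in for one neural field anchored to a keyframe.

    Assumption: a real system would hold network parameters here; this
    sketch stores plain 3D points expressed in keyframe coordinates.
    """

    def __init__(self, world_from_keyframe, local_points):
        self.world_from_keyframe = np.asarray(world_from_keyframe, dtype=float)
        self.local_points = np.asarray(local_points, dtype=float)

    def to_world(self):
        """Express the locally stored points in world coordinates."""
        ones = np.ones((len(self.local_points), 1))
        homogeneous = np.hstack([self.local_points, ones])
        return (self.world_from_keyframe @ homogeneous.T).T[:, :3]

    def apply_loop_closure(self, corrected_pose):
        """Replace the keyframe pose; the local content is not modified."""
        self.world_from_keyframe = np.asarray(corrected_pose, dtype=float)


# One local map whose keyframe is initially estimated at x = 1.0.
field = LocalField(make_pose(0.0, [1.0, 0.0, 0.0]), [[0.0, 0.0, 1.0]])
before = field.to_world()

# Loop closure moves the keyframe estimate to x = 1.5.
field.apply_loop_closure(make_pose(0.0, [1.5, 0.0, 0.0]))
after = field.to_world()
```

In a world-centric map with a single field, the same correction would leave previously fused content at stale world coordinates; here the world-frame geometry follows the corrected pose without touching what was stored locally.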