In this work, we present a dense tracking and mapping system named Vox-Fusion, which seamlessly fuses neural implicit representations with traditional volumetric fusion methods. Our approach is inspired by recently developed implicit mapping and positioning systems and further extends the idea so that it can be freely applied to practical scenarios. Specifically, we leverage a voxel-based neural implicit surface representation to encode and optimize the scene inside each voxel. Furthermore, we adopt an octree-based structure to divide the scene and support dynamic expansion, enabling our system to track and map arbitrary scenes without prior knowledge of the environment, as required in previous works. Moreover, we propose a high-performance multi-process framework to speed up the method, thus supporting applications that require real-time performance. The evaluation results show that our method achieves better accuracy and completeness than previous methods. We also show that Vox-Fusion can be used in augmented reality and virtual reality applications. Our source code is publicly available at https://github.com/zju3dv/Vox-Fusion.
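To make the dynamic-expansion idea concrete, the following is a minimal sketch (not the authors' implementation) of an octree-style sparse voxel map: the scene is divided into fixed-size voxels, each holding a learnable embedding, and new voxels are allocated on the fly as observed points fall outside the currently mapped region. Names such as `SparseVoxelMap`, `voxel_size`, and `expand` are illustrative assumptions, not identifiers from the Vox-Fusion code base.

```python
import torch

class SparseVoxelMap:
    """Hypothetical sparse voxel map with per-voxel latent codes."""

    def __init__(self, voxel_size: float = 0.2, embed_dim: int = 16):
        self.voxel_size = voxel_size   # edge length of each voxel (meters); assumed value
        self.embed_dim = embed_dim     # size of the per-voxel embedding; assumed value
        self.embeddings = {}           # maps integer voxel coordinates -> embedding

    def voxel_index(self, points: torch.Tensor) -> torch.Tensor:
        # Quantize world coordinates (N, 3) to integer voxel grid coordinates.
        return torch.floor(points / self.voxel_size).long()

    def expand(self, points: torch.Tensor) -> int:
        # Allocate embeddings for voxels touched by new observations
        # (the "dynamic expansion" step); returns how many voxels were created.
        new_voxels = 0
        for idx in {tuple(i.tolist()) for i in self.voxel_index(points)}:
            if idx not in self.embeddings:
                self.embeddings[idx] = torch.randn(self.embed_dim, requires_grad=True)
                new_voxels += 1
        return new_voxels

# Usage: feed back-projected depth points from each incoming frame.
vox_map = SparseVoxelMap(voxel_size=0.2)
frame_points = torch.rand(1000, 3) * 5.0  # stand-in for back-projected depth points
print(vox_map.expand(frame_points), "voxels allocated")
```

A real system would organize these voxels in an octree for fast ray-voxel intersection and decode the embeddings with a shared MLP, but the allocation logic above captures why no prior knowledge of the scene extent is needed.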