High-fidelity 3D scene reconstruction from monocular videos remains challenging, especially for complete and fine-grained geometry reconstruction. Previous 3D reconstruction approaches based on neural implicit representations have shown promise for complete scene reconstruction, but their results are often over-smoothed and lack fine geometric detail. This paper introduces a novel neural implicit scene representation with volume rendering for high-fidelity online 3D scene reconstruction from monocular videos. For fine-grained reconstruction, our key insight is to incorporate geometric priors into both the neural implicit scene representation and neural volume rendering, yielding an effective geometry learning mechanism driven by volume rendering optimization. Building on this, we present MonoNeuralFusion, which performs online neural 3D reconstruction from monocular videos, efficiently generating and refining the 3D scene geometry during on-the-fly monocular scanning. Extensive comparisons with state-of-the-art approaches show that MonoNeuralFusion consistently produces more complete and fine-grained reconstructions, both quantitatively and qualitatively.
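The abstract does not give implementation details, so the following PyTorch sketch only illustrates the general mechanism it describes: an implicit SDF network optimized through volume rendering, with an additional geometric-prior term on surface normals. The network size, the SDF-to-density conversion, and the specific depth/normal supervision signals are all illustrative assumptions, not the authors' method.

```python
# A minimal sketch (not the authors' code) of SDF-based neural volume
# rendering with a geometric-prior loss. All architectural choices and
# supervision signals below are assumptions for illustration.
import torch
import torch.nn as nn

class ImplicitSDF(nn.Module):
    """Tiny MLP mapping 3D points to signed distance values."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.mlp(x)

def sdf_normals(sdf, points):
    """Analytic normals: normalized SDF gradient at `points`."""
    p = points.detach().requires_grad_(True)
    d = sdf(p)
    (grad,) = torch.autograd.grad(d.sum(), p, create_graph=True)
    return torch.nn.functional.normalize(grad, dim=-1)

def render_depth(sdf, rays_o, rays_d, near=0.1, far=5.0, n_samples=64, beta=0.1):
    """Volume-render expected depth along rays from SDF values.

    Uses a sigmoid SDF-to-density conversion (a common choice, assumed
    here) followed by standard alpha compositing.
    """
    t = torch.linspace(near, far, n_samples)                      # (S,)
    pts = rays_o[:, None, :] + t[None, :, None] * rays_d[:, None, :]
    d = sdf(pts.reshape(-1, 3)).reshape(pts.shape[:2])            # (R, S)
    sigma = (1.0 / beta) * torch.sigmoid(-d / beta)               # density
    delta = (far - near) / n_samples
    alpha = 1.0 - torch.exp(-sigma * delta)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-7], -1), -1
    )[:, :-1]
    w = alpha * trans                                             # ray weights
    return (w * t[None, :]).sum(-1)

# Geometry learning driven by rendering: a depth term plus a hypothetical
# normal prior (e.g. from a monocular normal estimator); both targets are
# placeholders here.
sdf = ImplicitSDF()
rays_o = torch.zeros(8, 3)
rays_d = torch.nn.functional.normalize(torch.randn(8, 3), dim=-1)
depth_gt = torch.full((8,), 2.0)
normal_gt = torch.nn.functional.normalize(torch.randn(8, 3), dim=-1)

depth = render_depth(sdf, rays_o, rays_d)
surf = rays_o + depth[:, None] * rays_d       # expected surface points
n = sdf_normals(sdf, surf)
loss = ((depth - depth_gt) ** 2).mean() + (1 - (n * normal_gt).sum(-1)).mean()
loss.backward()
```

In this reading, the "geometric prior" enters as an extra loss on rendered geometry (here, normal consistency at the expected surface), so the SDF is shaped by both photometric/depth rendering terms and the prior during optimization.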