In this work, we present a new multi-view depth estimation method that combines conventional reconstruction and learning-based priors with the recently proposed neural radiance fields (NeRF). Unlike existing neural-network-based optimization methods that rely on estimated correspondences, our method optimizes directly over implicit volumes, eliminating the challenging step of matching pixels in indoor scenes. The key to our approach is to use learning-based priors to guide the optimization of NeRF. Our system first adapts a monocular depth network to the target scene by finetuning it on the scene's sparse SfM+MVS reconstruction from COLMAP. We then show that the shape-radiance ambiguity of NeRF persists in indoor environments, and we propose to address it by using the adapted depth priors to guide the sampling process of volume rendering. Finally, a per-pixel confidence map, obtained by computing errors on the rendered image, can be used to further improve depth quality. Experiments show that our proposed framework significantly outperforms state-of-the-art methods on indoor scenes, with surprising findings on the effectiveness of correspondence-based optimization and NeRF-based optimization over the adapted depth priors. In addition, we show that the guided optimization scheme does not sacrifice the original synthesis capability of neural radiance fields: it improves rendering quality on both seen and novel views. Code is available at https://github.com/weiyithu/NerfingMVS.
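To make the idea of depth-guided sampling concrete, the following is a minimal sketch in NumPy and not the released implementation. It assumes that each ray comes with an adapted depth prior and a relative error estimate, and that samples are drawn only inside a band around the prior whose width is the clamped error. The function name `guided_sample_depths`, the clamp bounds `err_min` and `err_max`, and the stratified sampling scheme are all illustrative assumptions.

```python
import numpy as np


def guided_sample_depths(depth_prior, rel_error, n_samples=64,
                         err_min=0.05, err_max=0.15, rng=None):
    """Sample depths along each ray inside a band around a depth prior.

    depth_prior: (N,) positive prior depth per ray (assumed input).
    rel_error:   (N,) relative error estimate per ray (assumed input).
    err_min/err_max: hypothetical clamp bounds on the band half-width.
    Returns an (N, n_samples) array of increasing sample depths.
    """
    depth_prior = np.asarray(depth_prior, dtype=np.float64)
    rel_error = np.asarray(rel_error, dtype=np.float64)

    # Clamp the error so the band is never degenerate nor too wide.
    margin = np.clip(rel_error, err_min, err_max)
    near = depth_prior * (1.0 - margin)
    far = depth_prior * (1.0 + margin)

    # Stratified sampling: one sample per equal-width bin of [0, 1].
    bin_starts = np.linspace(0.0, 1.0, n_samples + 1)[:-1]
    bin_width = 1.0 / n_samples
    if rng is None:
        # Deterministic fallback: use bin centres.
        jitter = np.full((depth_prior.shape[0], n_samples), 0.5)
    else:
        jitter = rng.random((depth_prior.shape[0], n_samples))
    u = bin_starts[None, :] + jitter * bin_width

    # Map the unit-interval samples into each ray's [near, far] band.
    return near[:, None] + (far - near)[:, None] * u
```

Under these assumptions, a ray with a reliable prior (small error) is sampled in a narrow band around the prior, while an unreliable one is given a wider search range, so the sampling budget is concentrated near the likely surface.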