Multi-View Stereo (MVS) is a core task in 3D computer vision. With the surge of novel deep learning methods, learned MVS has surpassed the accuracy of classical approaches, but still relies on building a memory-intensive dense cost volume. Novel View Synthesis (NVS) is a parallel line of research that has recently grown in popularity with Neural Radiance Field (NeRF) models, which optimize a per-scene radiance field. However, NeRF methods do not generalize to novel scenes and are slow to train and test. We propose to bridge the gap between these two methodologies with a novel network that recovers 3D scene geometry as a distance function, together with high-resolution color images. Our method uses only a sparse set of images as input and generalizes well to novel scenes. Additionally, we propose a coarse-to-fine sphere tracing approach that significantly increases speed. We show on various datasets that our method reaches accuracy comparable to per-scene optimized methods while generalizing and running significantly faster. We provide the source code at https://github.com/AIS-Bonn/neural_mvs.
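To illustrate the core idea behind the sphere tracing mentioned above, the following is a minimal sketch of plain (single-resolution) sphere tracing against a signed distance function, not the paper's coarse-to-fine variant; the function names and parameters are illustrative assumptions, not the paper's API.

```python
import math

def sphere_trace(sdf, origin, direction, max_steps=64, eps=1e-4):
    """March along a ray, stepping by the queried distance value each
    iteration; the SDF guarantees no surface lies within that radius,
    so each step is safe. `direction` is assumed to be unit length."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t  # hit: distance along the ray to the surface
        t += dist     # safe step: no geometry closer than `dist`
    return None       # no hit within the step budget

# Example SDF: a unit sphere centered at the origin.
unit_sphere = lambda p: math.sqrt(sum(c * c for c in p)) - 1.0

# Camera 3 units away looking at the sphere: the ray hits at t = 2.
t_hit = sphere_trace(unit_sphere, (0.0, 0.0, -3.0), (0.0, 0.0, 1.0))
```

A coarse-to-fine variant would run this march first on a low-resolution distance field to get an approximate hit, then refine `t` against the full-resolution field, amortizing the expensive queries.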