In this paper, we introduce a deep multi-view stereo (MVS) system that jointly predicts depths, surface normals, and per-view confidence maps. The key to our approach is a novel solver that iteratively solves for the per-view depth map and normal map by optimizing an energy potential based on a locally planar assumption. Specifically, the algorithm updates the depth map by propagating depths from neighboring pixels along slanted planes, and updates the normal map with local probabilistic plane fitting. Both steps are monitored by a customized confidence map. The solver is not only effective as a post-processing tool for plane-based depth refinement and completion, but is also differentiable, so it can be efficiently integrated into deep learning pipelines. Our multi-view stereo system applies multiple optimization steps of the solver to the initial prediction of depths and surface normals. The whole system can be trained end-to-end, decoupling the challenging problem of matching pixels within poorly textured regions from the cost-volume-based neural network. Experimental results on ScanNet and RGB-D Scenes V2 demonstrate state-of-the-art performance of the proposed deep MVS system on multi-view depth estimation, with our proposed solver consistently improving depth quality over both conventional and deep-learning-based MVS pipelines. Code is available at https://github.com/thuzhaowang/idn-solver.
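To make the slanted-plane propagation step concrete, the following is a minimal sketch and not the implementation in the released code. It assumes a pinhole camera with intrinsics `K` and a neighbor pixel `q` whose depth and normal are known, and it computes the depth that pixel `p` would take if it lay on the same 3D plane. The function name `propagate_depth` and its arguments are hypothetical, and the confidence weighting described above is omitted.

```python
import numpy as np

def propagate_depth(K, q, depth_q, normal_q, p):
    """Depth at pixel p if p lies on the plane through q's 3D point.

    Hypothetical helper for illustration only.
    K: 3x3 pinhole intrinsics; q, p: (u, v) pixel coordinates;
    depth_q: depth at q; normal_q: plane normal in camera coordinates.
    """
    K_inv = np.linalg.inv(K)
    # Back-projected rays with unit z-component, so X = depth * ray.
    ray_q = K_inv @ np.array([q[0], q[1], 1.0])
    ray_p = K_inv @ np.array([p[0], p[1], 1.0])
    # Plane constraint: n . (d_p * ray_p) = n . (d_q * ray_q).
    denom = normal_q @ ray_p
    if abs(denom) < 1e-8:
        return None  # viewing ray is (nearly) parallel to the plane
    return depth_q * (normal_q @ ray_q) / denom
```

For a fronto-parallel plane, with normal along the optical axis, the propagated depth equals the neighbor's depth. For a slanted plane, the depth changes linearly along the plane, which is what lets a plane-based update carry depth across poorly textured regions.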