In deep multi-view stereo networks, cost regularization is crucial for accurate depth estimation. Since 3D cost volume filtering is usually memory-consuming, recurrent 2D cost map regularization has recently become popular and has shown great potential in reconstructing 3D models of different scales. However, existing recurrent methods model only local dependencies in the depth domain, which greatly limits their ability to capture the global scene context along the depth dimension. To tackle this limitation, we propose a novel non-local recurrent regularization network for multi-view stereo, named NR2-Net. Specifically, we design a depth attention module to capture non-local depth interactions within a sliding depth block. The global scene context between different blocks is then modeled in a gated recurrent manner. In this way, long-range dependencies along the depth dimension are captured to facilitate cost regularization. Moreover, we design a dynamic depth map fusion strategy to improve the robustness of the algorithm. Our method achieves state-of-the-art reconstruction results on both the DTU and the Tanks and Temples datasets.
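The two-stage idea described above (attention across depth planes inside a block, then a gated recurrent state carried from block to block) can be illustrated with a small NumPy sketch. This is not the NR2-Net implementation: the function names, the non-overlapping blocks, the mean-pooled block summary, the GRU-style gate equations, the random weights, and the expectation-based depth readout are all assumptions made for illustration, and pixels are flattened to a single axis in place of 2D convolutions.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def depth_attention(block):
    """Self-attention across the depth planes of one block (hypothetical helper).

    block: (B, N, C) = depth planes in the block, pixels, feature channels.
    Each pixel attends over its own B depth hypotheses.
    """
    x = block.transpose(1, 0, 2)                               # (N, B, C)
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(x.shape[-1])   # (N, B, B)
    weights = softmax(scores, axis=-1)                         # rows sum to 1
    return (weights @ x).transpose(1, 0, 2)                    # (B, N, C)

def regularize(cost, block_size, params):
    """Block-wise attention plus a GRU-style state across blocks (assumed design).

    cost: (D, N, C) matching-cost features. Returns a (D, N) probability volume.
    """
    _, n_pixels, channels = cost.shape
    h = np.zeros((n_pixels, channels))        # recurrent state shared across blocks
    plane_scores = []
    for start in range(0, cost.shape[0], block_size):
        attended = depth_attention(cost[start:start + block_size])
        summary = attended.mean(axis=0)       # (N, C) summary of this block
        z = sigmoid(summary @ params["Wz"] + h @ params["Uz"])         # update gate
        r = sigmoid(summary @ params["Wr"] + h @ params["Ur"])         # reset gate
        cand = np.tanh(summary @ params["Wh"] + (r * h) @ params["Uh"])
        h = (1.0 - z) * h + z * cand          # context carried to later blocks
        fused = attended + h[None]            # inject cross-block context
        plane_scores.append(fused @ params["w_out"])                   # (B, N)
    return softmax(np.concatenate(plane_scores, axis=0), axis=0)

rng = np.random.default_rng(0)
D, N, C = 8, 6, 4                             # toy sizes, chosen arbitrarily
params = {k: rng.normal(scale=0.5, size=(C, C))
          for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
params["w_out"] = rng.normal(size=C)
cost = rng.normal(size=(D, N, C))

prob = regularize(cost, block_size=4, params=params)
depth_values = np.linspace(1.0, 2.0, D)       # hypothetical depth hypotheses
depth_map = (prob * depth_values[:, None]).sum(axis=0)  # expected depth per pixel
```

Only one block of cost features and one hidden state are held in memory at a time, which is the motivation for recurrent regularization over filtering a full 3D cost volume. In this sketch the state flows in one direction only, so earlier blocks do not see context from later ones.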