Existing learning-based multi-view stereo (MVS) methods rely on a given depth range to build the 3D cost volume and may fail when that range is too large or unreliable. To address this problem, we propose DispMVS, a disparity-based MVS method built on the epipolar disparity flow (E-flow), which infers depth from the pixel movement between two views. The core of DispMVS is to construct a 2D cost volume on the image plane along the epipolar line for each pair formed by the reference image and one of several source images, use it for pixel matching, and fuse the depths triangulated from each pair by multi-view geometry to ensure multi-view consistency. To be robust, DispMVS starts from a randomly initialized depth map and iteratively refines it with a coarse-to-fine strategy. Experiments on the DTUMVS and Tanks and Temples datasets show that DispMVS is not sensitive to the depth range and achieves state-of-the-art results with lower GPU memory.
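To make the relation between depth and pixel movement concrete, the following is a minimal NumPy sketch of the two-view geometry the abstract refers to: projecting a reference pixel with a known depth into a source view gives its flow along the epipolar line, and triangulating from that flow recovers the depth. It is an illustration under stated assumptions, not the DispMVS implementation. The function names (`depth_to_flow`, `flow_to_depth`), the pinhole intrinsics `K_ref` and `K_src`, and the relative pose `R`, `t` (reference-to-source) are all hypothetical, and the final averaging over pairs is only a placeholder for the fusion step, whose details the abstract does not give.

```python
import numpy as np


def depth_to_flow(p_ref, depth, K_ref, K_src, R, t):
    """Flow (in pixels) of a reference pixel into the source view for a given depth.

    Assumes a pinhole model and a reference-to-source pose X_src = R @ X_ref + t.
    """
    ray = np.linalg.inv(K_ref) @ np.array([p_ref[0], p_ref[1], 1.0])
    X_src = R @ (depth * ray) + t          # 3D point in the source camera frame
    p_src = K_src @ X_src                  # homogeneous source pixel
    return p_src[:2] / p_src[2] - np.asarray(p_ref, dtype=float)


def flow_to_depth(p_ref, flow, K_ref, K_src, R, t):
    """Triangulate the reference-view depth from a flow vector.

    Solves d_ref * (R @ ray_ref) - d_src * ray_src = -t in the least-squares sense.
    """
    p_src = np.asarray(p_ref, dtype=float) + np.asarray(flow, dtype=float)
    ray_ref = R @ np.linalg.inv(K_ref) @ np.array([p_ref[0], p_ref[1], 1.0])
    ray_src = np.linalg.inv(K_src) @ np.array([p_src[0], p_src[1], 1.0])
    A = np.stack([ray_ref, -ray_src], axis=1)  # 3x2 system in (d_ref, d_src)
    sol, *_ = np.linalg.lstsq(A, -np.asarray(t, dtype=float), rcond=None)
    return float(sol[0])


if __name__ == "__main__":
    # Hypothetical cameras: shared intrinsics, two source views with pure translation.
    K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
    poses = [(np.eye(3), np.array([-0.2, 0.0, 0.0])),
             (np.eye(3), np.array([0.0, -0.1, 0.0]))]
    p_ref, true_depth = (300.0, 200.0), 2.0

    depths = []
    for R, t in poses:
        flow = depth_to_flow(p_ref, true_depth, K, K, R, t)
        depths.append(flow_to_depth(p_ref, flow, K, K, R, t))

    # Placeholder fusion (an assumption): a plain average over the pairs.
    print(np.mean(depths))
```

Note that neither function needs a depth range: the depth is read off from the matched pixel position, which is the property the abstract attributes to the disparity-based formulation.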