Depth estimation from light field (LF) images is a fundamental step for some applications. Recently, learning-based methods have achieved higher accuracy and efficiency than traditional methods. However, obtaining sufficient depth labels for supervised training is costly. In this paper, we propose an unsupervised framework to estimate depth from LF images. First, we design a disparity estimation network (DispNet) with a coarse-to-fine structure that predicts disparity maps from different view combinations, performing multi-view feature matching to learn correspondences more effectively. Because occlusions can violate photo-consistency, we design an occlusion prediction network (OccNet) to predict occlusion maps, which are used as element-wise weights on the photometric loss to handle occlusions and assist disparity learning. Given the disparity maps estimated from multiple input combinations, we propose a disparity fusion strategy, based on the estimated errors and with effective occlusion handling, to obtain the final disparity map. Experimental results demonstrate that our method achieves superior performance on both dense and sparse LF images, and also generalizes better to real-world LF images.
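To make the two loss-related ideas concrete, the following is a minimal sketch, not the paper's implementation. It assumes an L1 photometric error normalized by the sum of the occlusion weights, and a winner-take-all fusion that keeps, per pixel, the candidate disparity with the lowest estimated error; the abstract does not specify either choice. The function names `weighted_photometric_loss` and `fuse_disparities` are hypothetical, and images are represented as flat lists of pixel values for brevity.

```python
from typing import List, Sequence


def weighted_photometric_loss(
    warped: Sequence[float],
    reference: Sequence[float],
    occlusion_weight: Sequence[float],
    eps: float = 1e-8,
) -> float:
    """Occlusion-weighted mean absolute photometric error (assumed form).

    occlusion_weight is close to 0 where a pixel is predicted occluded
    (so its error is ignored) and close to 1 where it is visible.
    """
    if not (len(warped) == len(reference) == len(occlusion_weight)):
        raise ValueError("all inputs must have the same number of pixels")
    weighted_error = sum(
        w * abs(a - b) for a, b, w in zip(warped, reference, occlusion_weight)
    )
    # Normalize by the total weight so heavily occluded views are not favored.
    return weighted_error / (sum(occlusion_weight) + eps)


def fuse_disparities(
    candidates: Sequence[Sequence[float]],
    errors: Sequence[Sequence[float]],
) -> List[float]:
    """Per pixel, keep the candidate disparity with the lowest estimated error.

    candidates[k][i] is the disparity of pixel i from input combination k,
    and errors[k][i] is its estimated error. Winner-take-all selection is an
    assumption; a soft, error-weighted average would be another option.
    """
    fused = []
    for pixel_disps, pixel_errs in zip(zip(*candidates), zip(*errors)):
        best = min(range(len(pixel_errs)), key=pixel_errs.__getitem__)
        fused.append(pixel_disps[best])
    return fused


if __name__ == "__main__":
    # The third pixel is marked occluded, so its large error does not count.
    loss = weighted_photometric_loss(
        warped=[1.0, 2.0, 5.0],
        reference=[1.0, 2.5, 0.0],
        occlusion_weight=[1.0, 1.0, 0.0],
    )
    print(f"loss = {loss:.4f}")

    fused = fuse_disparities(
        candidates=[[1.0, 2.0, 3.0], [1.5, 2.5, 3.5]],
        errors=[[0.1, 0.9, 0.2], [0.3, 0.2, 0.2]],
    )
    print("fused =", fused)
```

In this toy example the occluded pixel contributes nothing to the loss, and the fusion step picks each pixel's disparity from whichever input combination reports the smaller error.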