Semi-Supervised Learning (SSL) has recently achieved strong results in fields such as image classification, object detection, and semantic segmentation, where constructing ground truth typically requires a great deal of labour. In depth estimation especially, annotating training data is costly and time-consuming, which makes the SSL regime an attractive solution. This paper introduces, for the first time, a framework for semi-supervised learning of monocular depth estimation networks that uses consistency regularization to reduce the reliance on large amounts of ground-truth depth data. We propose a novel data augmentation approach, called K-way disjoint masking, which lets the network learn to reconstruct invisible regions, so that the model not only becomes robust to perturbations but also generates globally consistent output depth maps. Experiments on the KITTI and NYU-Depth-v2 datasets demonstrate the effectiveness of each component in our pipeline, robustness as the number of annotated images decreases, and superior results compared to other state-of-the-art semi-supervised methods for monocular depth estimation. Our code is available at https://github.com/KU-CVLAB/MaskingDepth.
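As a rough illustration of the idea behind K-way disjoint masking, the sketch below partitions a set of image tokens (for example, patches) into K non-overlapping visible subsets, so that each masked view hides what the others show. The abstract does not specify these details: the token-level partition, the function names `k_way_disjoint_masks` and `consistency_loss`, and the L1 form of the consistency term are all assumptions made for illustration, not the authors' implementation, which is available in the repository linked above.

```python
import numpy as np


def k_way_disjoint_masks(num_tokens, k, rng):
    """Return a (k, num_tokens) boolean array of keep-masks.

    Assumption: each token is visible in exactly one of the k views,
    so the views are pairwise disjoint and jointly cover every token.
    """
    perm = rng.permutation(num_tokens)           # random token order
    masks = np.zeros((k, num_tokens), dtype=bool)
    for i, subset in enumerate(np.array_split(perm, k)):
        masks[i, subset] = True                  # tokens visible in view i
    return masks


def consistency_loss(pred_a, pred_b):
    """Hypothetical consistency term: mean absolute difference between
    two depth predictions of the same image under different views."""
    return float(np.mean(np.abs(pred_a - pred_b)))


rng = np.random.default_rng(0)
masks = k_way_disjoint_masks(num_tokens=16, k=4, rng=rng)

# Stand-in depth predictions; a real pipeline would obtain these from
# the network applied to the full image and to a masked view.
depth_full = rng.random(16)
depth_masked = depth_full + 0.1
loss = consistency_loss(depth_full, depth_masked)
```

In a semi-supervised setting, a term like this could be computed on unlabeled images and added to a supervised depth loss on the labeled ones; the weighting between the two is left open here.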