Despite recent improvements in supervised monocular depth estimation, the lack of high-quality pixel-wise ground-truth annotations has become a major hurdle to further progress. In this work, we propose a new unsupervised depth estimation method based on a pseudo-supervision mechanism, in which a teacher-student network is trained with knowledge distillation. The method integrates the advantages of supervised and unsupervised monocular depth estimation with those of unsupervised binocular depth estimation. Specifically, the teacher network exploits the effectiveness of binocular depth estimation to produce accurate disparity maps, which then serve as pseudo ground truth for training the student network for monocular depth estimation. This effectively converts the unsupervised learning problem into a supervised one. Extensive experimental results demonstrate that the proposed method outperforms the state of the art on the KITTI benchmark.
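The pseudo-supervision idea can be illustrated with a deliberately small sketch. It is not the architecture or loss used in this work; every name and design choice below is an assumption made for illustration. The "teacher" is a toy scanline matcher that sees both views, the "student" is a two-parameter linear model that sees only the left view, and the toy scenes tie brightness to disparity so that a monocular cue exists. The student is fit with a squared-error loss against the teacher's output, and no ground-truth disparity is used during training.

```python
import random

MAX_DISP = 8  # largest disparity in the toy scenes (assumption)


def make_stereo_pair(disparity, rng, width=32):
    """Synthesize a toy scanline pair for a flat scene at one disparity.

    Brightness grows with disparity so that a monocular cue exists
    (a toy assumption, not a property of real images).
    """
    scene = [0.2 * disparity + rng.random() for _ in range(width + disparity)]
    left = scene[:width]
    right = scene[disparity:disparity + width]  # right[x] == left[x + disparity]
    return left, right


def teacher_disparity(left, right, max_disp=MAX_DISP):
    """Toy stand-in for the binocular teacher: pick the shift with the
    lowest mean absolute difference between the two views."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        diffs = [abs(left[x] - right[x - d]) for x in range(d, len(left))]
        cost = sum(diffs) / len(diffs)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def student_disparity(left, w, b):
    """Toy monocular student: linear in the mean brightness of the left view."""
    return w * (sum(left) / len(left)) + b


def train_student(pairs, steps=500, lr=0.3):
    """Fit the student to the teacher's pseudo ground truth by gradient descent."""
    feats = [sum(left) / len(left) for left, _ in pairs]
    pseudo = [teacher_disparity(left, right) for left, right in pairs]
    w = b = 0.0
    n = len(pairs)
    for _ in range(steps):
        errs = [w * m + b - t for m, t in zip(feats, pseudo)]
        w -= lr * sum(e * m for e, m in zip(errs, feats)) / n
        b -= lr * sum(errs) / n
    return w, b


rng = random.Random(0)
train_pairs = [make_stereo_pair(rng.randint(0, MAX_DISP), rng) for _ in range(200)]
w, b = train_student(train_pairs)
```

After training, `student_disparity(left, w, b)` needs only the left view at inference time, which mirrors the monocular setting described above. In a real system both teacher and student would be deep networks producing dense disparity maps.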