This paper presents a novel method for depth completion that improves monitored distillation with multi-view information to generate more precise depth maps. Our approach builds upon the state-of-the-art ensemble distillation method, to which we add a stereo-based model as a teacher in order to improve the accuracy of the student model for depth completion. By selecting, during ensemble distillation, the teacher predictions that minimize the reconstruction error for a given image, we avoid learning the inherent error modes of completion-based teachers. To provide self-supervised information, we also employ multi-view depth consistency and multi-scale minimum reprojection. These techniques exploit existing structural constraints to yield supervisory signals for training the student model, without requiring costly ground-truth depth. Extensive experiments show that the proposed method is significantly more accurate than the baseline monitored distillation method.
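The selection step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names `select_teacher_depth` and `minimum_reprojection` are hypothetical, the per-pixel reconstruction errors are assumed to have been computed elsewhere (for example, by warping neighboring views with each predicted depth), and the abstract does not say how the losses are weighted or combined.

```python
import numpy as np


def select_teacher_depth(teacher_depths, reconstruction_errors):
    """Pick, per pixel, the teacher depth with the lowest reconstruction error.

    Both arguments are assumed to have shape (K, H, W): K teachers, one
    depth map and one error map each. Returns the distilled depth map and
    the matching per-pixel error, both of shape (H, W).
    """
    depths = np.asarray(teacher_depths, dtype=np.float64)
    errors = np.asarray(reconstruction_errors, dtype=np.float64)
    if depths.shape != errors.shape or depths.ndim != 3:
        raise ValueError("expected two arrays of identical shape (K, H, W)")

    # Index of the best teacher at every pixel, shape (H, W).
    best = np.argmin(errors, axis=0)

    # Gather the winning depth and error along the teacher axis.
    distilled = np.take_along_axis(depths, best[None], axis=0)[0]
    best_error = np.take_along_axis(errors, best[None], axis=0)[0]
    return distilled, best_error


def minimum_reprojection(per_view_errors):
    """Per-pixel minimum over source views of an (N, H, W) error stack.

    Assumption: taking the minimum rather than the mean, so that a pixel
    occluded in one view is scored by the view where it is visible.
    """
    return np.min(np.asarray(per_view_errors, dtype=np.float64), axis=0)
```

In a training loop, the distilled depth would typically serve as the target for the student, and `best_error` could be used to down-weight pixels where no teacher reconstructs the image well; both uses are assumptions here, not details given in the abstract.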