The advanced performance of depth estimation is achieved by employing large and complex neural networks. While accuracy continues to improve, we argue that depth estimation must also be efficient, since efficiency is a prerequisite for real-world applications. However, fast depth estimation tends to lower accuracy due to the trade-off between a model's capacity and its performance. In this paper, we aim to achieve accurate depth estimation with a lightweight network. To this end, we first introduce a highly compact network that can estimate a depth map in real time. We then develop a knowledge distillation paradigm to further improve its performance. We observe that many real-world scenarios share the same scene scale and therefore yield similar depth histograms, making them potentially valuable for developing a better learning strategy. We therefore propose to employ auxiliary unlabeled/labeled data to improve knowledge distillation. Through extensive and rigorous experiments, we show that our method achieves performance comparable to state-of-the-art methods with only 1% of their parameters, and outperforms previous lightweight methods in inference accuracy, computational efficiency, and generalizability.
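The distillation paradigm with auxiliary unlabeled/labeled data can be sketched as a combined loss: a supervised term on labeled depth maps plus a mimicry term pulling the student toward the teacher's predictions, with only the mimicry term applied on unlabeled auxiliary images. This is a minimal illustrative sketch; the L1 choice, the function name, and the balancing weight `alpha` are assumptions, not the paper's exact formulation.

```python
import numpy as np

def distillation_loss(student_pred, teacher_pred, gt_depth=None, alpha=0.5):
    """Hypothetical combined loss for depth distillation.

    student_pred, teacher_pred: predicted depth maps (same shape).
    gt_depth: ground-truth depth map, or None for unlabeled images.
    alpha: assumed weight balancing supervised vs. mimicry terms.
    """
    # Mimicry term: student imitates the teacher's depth map (L1, assumed).
    mimic = np.abs(student_pred - teacher_pred).mean()
    if gt_depth is None:
        # Unlabeled auxiliary image: only the teacher supervises.
        return mimic
    # Labeled image: combine ground-truth supervision with mimicry.
    supervised = np.abs(student_pred - gt_depth).mean()
    return alpha * supervised + (1 - alpha) * mimic
```

On unlabeled auxiliary images the loss reduces to the teacher-mimicry term alone, which is what lets extra data with similar depth histograms contribute to training without annotations.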