We study data-free knowledge distillation (KD) for monocular depth estimation (MDE), which learns a lightweight network for real-world depth perception by compressing a trained expert model under the teacher-student framework, without access to training data in the target domain. Owing to the essential difference between dense regression and image recognition, previous data-free KD methods are not applicable to MDE. To strengthen applicability in the real world, in this paper we seek to apply KD with out-of-distribution simulated images. The major challenges are i) the lack of prior information about the object distribution of the original training data; and ii) the domain shift between the real world and the simulation. To cope with the first difficulty, we apply object-wise image mixing to generate new training samples that maximally cover the distribution patterns of objects in the target domain. To tackle the second difficulty, we propose a transformation network that efficiently learns to fit the simulated data to the feature distribution of the teacher model. We evaluate the proposed approach on various depth estimation models and two different datasets. As a result, our method outperforms the baseline KD by a good margin and even achieves slightly better performance with as few as $1/6$ of the images, demonstrating a clear superiority.
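The abstract does not specify how the object-wise image mixing is implemented. As a rough illustration only, the following is a minimal CutMix-style sketch, under the assumption that per-object binary masks are available for the simulated images; the function name and interface are hypothetical, not the paper's actual method.

```python
import numpy as np


def object_wise_mix(img_a: np.ndarray, img_b: np.ndarray, mask_b: np.ndarray) -> np.ndarray:
    """Paste the object region of img_b (given by a binary mask) onto img_a.

    img_a, img_b: HxWx3 uint8 images; mask_b: HxW boolean array marking
    the object's pixels in img_b. Returns a new mixed image in which the
    object from img_b overwrites the corresponding pixels of img_a.
    """
    mixed = img_a.copy()
    mixed[mask_b] = img_b[mask_b]
    return mixed


# Toy example: paste a 2x2 "object" from a white image onto a black image.
a = np.zeros((4, 4, 3), dtype=np.uint8)        # background image
b = np.full((4, 4, 3), 255, dtype=np.uint8)    # image containing the object
m = np.zeros((4, 4), dtype=bool)
m[1:3, 1:3] = True                             # the object's mask
out = object_wise_mix(a, b, m)
```

Repeating such mixing across many object/background pairs is one way to enlarge the variety of object layouts seen during distillation; the paper's actual sampling strategy may differ.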