Cost-effective depth and infrared sensors are now a practical alternative to conventional RGB sensors, and offer advantages over RGB in domains such as autonomous navigation and remote sensing. Building computer vision and deep learning systems for depth and infrared data is therefore crucial. However, large labeled datasets for these modalities are still lacking. In such cases, transferring knowledge from a neural network trained on a large, well-labeled dataset in the source modality (RGB) to a neural network that operates on a target modality (depth, infrared, etc.) is of great value. For reasons such as memory and privacy, it may not be possible to access the source data, and knowledge transfer must then rely on the source models alone. We describe an effective solution, SOCKET: SOurce-free Cross-modal KnowledgE Transfer, for this challenging task of transferring knowledge from one source modality to a different target modality without access to task-relevant source data. The framework reduces the modality gap by using paired task-irrelevant data and by matching the mean and variance of the target features to the batch-norm statistics stored in the source models. We show through extensive experiments that our method significantly outperforms existing source-free methods for classification, which do not account for the modality gap.
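The batch-norm statistics matching idea can be illustrated with a minimal sketch (not the paper's implementation): each batch-norm layer of a trained source model stores a running mean and variance per channel, and a loss can penalize the distance between those stored statistics and the batch statistics of the target-modality features. The function name and the squared-distance form below are illustrative assumptions.

```python
import numpy as np

def bn_stats_matching_loss(target_feats, source_mean, source_var):
    """Illustrative loss: distance between the batch statistics of
    target-modality features and the running batch-norm statistics
    stored in the source model (the only information available when
    source data cannot be accessed).

    target_feats: (batch, channels) activations from the target network.
    source_mean, source_var: (channels,) running stats read from a
    batch-norm layer of the source model.
    """
    mu = target_feats.mean(axis=0)
    var = target_feats.var(axis=0)
    # Squared L2 distance between the two sets of statistics.
    return float(np.sum((mu - source_mean) ** 2) + np.sum((var - source_var) ** 2))
```

Minimizing such a loss over the target encoder's parameters pulls the distribution of target features toward the feature distribution the source model saw during training, without ever touching the source data.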