Contrastive learning has led to substantial improvements in the quality of learned embedding representations for tasks such as image classification. However, a key drawback of existing contrastive augmentation methods is that they may modify image content in ways that alter its semantics, which can degrade performance on downstream tasks. Hence, in this paper, we ask whether image data can be augmented for contrastive learning in a way that preserves the task-relevant semantic content of an image. For this purpose, we propose to leverage saliency-based explanation methods to create content-preserving masked augmentations for contrastive learning. Our novel explanation-driven supervised contrastive learning (ExCon) methodology serves the dual goals of encouraging nearby image embeddings to have similar content and similar explanations. To quantify the impact of ExCon, we conduct experiments on the CIFAR-100 and Tiny ImageNet datasets. We demonstrate that ExCon outperforms vanilla supervised contrastive learning in terms of classification accuracy, explanation quality, adversarial robustness, and calibration of the model's probabilistic predictions under distributional shift.
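The abstract describes the mechanism only at a high level. Below is a minimal sketch of the core idea, not the authors' implementation: it assumes plain gradient saliency as the explanation method, a per-image top-k pixel mask as the content-preserving augmentation, and a standard supervised contrastive (SupCon-style) loss. The helper names `saliency_mask` and `supcon_loss`, the `keep_ratio` parameter, and the toy backbone are all illustrative assumptions.

```python
# Sketch: saliency-masked augmentations + supervised contrastive loss.
# All names and hyperparameters here are illustrative, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def saliency_mask(model, x, y, keep_ratio=0.5):
    """Keep the top-`keep_ratio` most salient pixels (gradient saliency), zero the rest."""
    x_in = x.clone().requires_grad_(True)
    score = model(x_in).gather(1, y.view(-1, 1)).sum()       # class scores of true labels
    grad, = torch.autograd.grad(score, x_in)                 # d(score)/d(input)
    sal = grad.abs().amax(dim=1, keepdim=True)               # (B, 1, H, W) saliency map
    k = max(1, int(keep_ratio * sal[0].numel()))
    thresh = sal.flatten(1).topk(k, dim=1).values[:, -1]     # per-image cutoff value
    mask = (sal >= thresh.view(-1, 1, 1, 1)).float()
    return x * mask                                          # content-preserving masked view

def supcon_loss(z, labels, temperature=0.1):
    """Supervised contrastive loss over L2-normalized embeddings."""
    z = F.normalize(z, dim=1)
    logits = (z @ z.T) / temperature
    eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    pos = labels.view(-1, 1).eq(labels.view(1, -1)) & ~eye   # same-class pairs
    logits = logits.masked_fill(eye, float('-inf'))          # exclude self-similarity
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_log_prob = log_prob.masked_fill(~pos, 0.0).sum(1)
    return -(pos_log_prob / pos.sum(1).clamp(min=1)).mean()

# Toy usage on random data (stand-ins for the paper's backbone and datasets):
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
x_masked = saliency_mask(model, x, y)                        # explanation-driven view
embed = model[:-1]                                           # reuse trunk as embedding net
z = embed(torch.cat([x, x_masked]))                          # embed both views
loss = supcon_loss(z, torch.cat([y, y]))                     # pull same-class views together
loss.backward()
```

Treating each image and its saliency-masked view as same-class positives is what makes nearby embeddings share both content and explanation: the masked view retains only the pixels the explanation method deems task-relevant.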