Convolutional neural networks (CNNs) have achieved superhuman performance on multiple vision tasks, especially image classification. However, unlike humans, CNNs rely on spurious features, such as background information, to make decisions. This tendency leads to problems such as reduced robustness and weak generalization. In this work, we introduce a contrastive learning-based approach (CLAD) to mitigate background bias in CNNs. CLAD encourages semantic focus on object foregrounds and penalizes learning features from irrelevant backgrounds. Our method also introduces an efficient way of sampling negative examples. We achieve state-of-the-art results on the Background Challenge dataset, outperforming the previous benchmark by a margin of 4.1\%. Our paper shows how CLAD serves as a proof of concept for debiasing spurious features such as background and texture (in the supplementary material).
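The abstract does not spell out the CLAD objective itself, but the general idea of pulling an image embedding toward its foreground and pushing it away from background negatives can be sketched with a standard InfoNCE-style contrastive loss. The sketch below is an illustrative assumption, not the paper's exact formulation: `anchor`, `positive`, and `negatives` stand for hypothetical embeddings of the full image, its foreground crop, and sampled background crops.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (illustrative, not the paper's exact CLAD loss):
    small when the anchor is close to the foreground positive and far
    from the background negatives; large otherwise."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Toy 2-D embeddings: an anchor aligned with its foreground yields a
# lower loss than one aligned with a background negative.
aligned = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

Here `aligned < misaligned`, so gradient descent on this loss would drive the network's image embedding toward foreground features and away from background features, which is the qualitative behavior the abstract describes.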