Unsupervised learning has recently made exceptional progress owing to more effective contrastive learning methods. However, CNNs tend to rely on low-level features that humans deem non-semantic. This reliance has been conjectured to cause a lack of robustness to image perturbations and domain shift. In this paper, we show that with carefully designed negative samples, contrastive learning can learn more robust representations that depend less on such features. Contrastive learning uses positive pairs that preserve semantic information while perturbing superficial features of the training images. Conversely, we propose generating negative samples in the reverse way, preserving only the superfluous features rather than the semantic ones. We develop two methods, texture-based and patch-based augmentations, to generate these negative samples. Models trained with these samples generalize better, especially in out-of-domain settings. We also analyze our method and the generated texture-based samples, showing that texture features are indispensable for classifying particular ImageNet classes, especially fine-grained ones. We further show that different test settings favor texture and shape bias to different degrees. Our code, trained models, and ImageNet-Texture dataset can be found at https://github.com/SongweiGe/Contrastive-Learning-with-Non-Semantic-Negatives.
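As a rough illustration of the patch-based idea described in the abstract, the sketch below builds a negative sample by cutting an image into non-overlapping patches and permuting them, which keeps local texture statistics while disrupting global shape. This is only one plausible reading of "patch-based augmentation", not the authors' implementation (see the linked repository for that); the function name `patch_shuffle_negative` and the parameter `patch_size` are hypothetical.

```python
import numpy as np


def patch_shuffle_negative(image, patch_size, rng=None):
    """Return a copy of `image` (H, W, C) with its patches randomly permuted.

    Hypothetical helper: an assumed, simplified stand-in for a patch-based
    negative sample, not the method from the paper's code release.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")
    gh, gw = h // patch_size, w // patch_size

    # Split into a (gh * gw) stack of patches, each (patch_size, patch_size, C).
    patches = (
        image.reshape(gh, patch_size, gw, patch_size, c)
        .swapaxes(1, 2)
        .reshape(gh * gw, patch_size, patch_size, c)
    )

    # Permute patch positions: local texture survives, global layout does not.
    patches = patches[rng.permutation(gh * gw)]

    # Reassemble the shuffled patches into an image of the original shape.
    return (
        patches.reshape(gh, gw, patch_size, patch_size, c)
        .swapaxes(1, 2)
        .reshape(h, w, c)
    )
```

In a contrastive setup, such an image would be added to the set of negatives for the anchor it was derived from, so the encoder is discouraged from matching two views on texture alone. How the negatives are weighted in the loss is not specified in the abstract and is left out here.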