Contrastive learning (CL) has recently emerged as an effective approach to learning representations for a range of downstream tasks. Central to this approach is the selection of positive (similar) and negative (dissimilar) sets, which give the model the opportunity to `contrast' between data and class representations in the latent space. In this paper, we investigate CL for improving model robustness using adversarial samples. We first design and perform a comprehensive study to understand how adversarial vulnerability behaves in the latent space. Based on this empirical evidence, we propose an effective and efficient supervised contrastive learning approach to achieve model robustness against adversarial attacks. Moreover, we propose a new sample selection strategy that optimizes the positive/negative sets by removing redundancy and improving correlation with the anchor. Extensive experiments show that our Adversarial Supervised Contrastive Learning (ASCL) approach achieves performance comparable to state-of-the-art defenses while significantly outperforming other CL-based defense methods, using only $42.8\%$ positives and $6.3\%$ negatives.
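To make the role of the positive and negative sets concrete, the following is a minimal NumPy sketch of a generic supervised contrastive loss, in which the positives for an anchor are the other samples sharing its label and the negatives are all remaining samples. It is not the ASCL objective: it uses no adversarial samples and does not implement the sample selection strategy described above. The function name `supcon_loss`, the `temperature` value, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def supcon_loss(z, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative sketch, not ASCL).

    z:      (n, d) array of latent embeddings.
    labels: (n,) array of class labels.
    Positives for an anchor are the other samples with the same label;
    every other sample acts as a negative.
    """
    z = np.asarray(z, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)

    # Project embeddings onto the unit sphere so similarity is cosine similarity.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature

    not_self = ~np.eye(n, dtype=bool)
    pos_mask = (labels[:, None] == labels[None, :]) & not_self

    pos_count = pos_mask.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        raise ValueError("no anchor has a positive sample")

    # Log-softmax of each anchor's similarities over all other samples.
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    exp_sim = np.exp(sim) * not_self            # exclude the anchor itself
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))

    # Average log-probability over each anchor's positive set,
    # skipping anchors that have no positives.
    mean_log_prob_pos = (log_prob * pos_mask).sum(axis=1)[has_pos] / pos_count[has_pos]
    return float(-mean_log_prob_pos.mean())


if __name__ == "__main__":
    # Toy embeddings: two tight clusters in a 2-D latent space (assumed data).
    z = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
    aligned = supcon_loss(z, np.array([0, 0, 1, 1]))  # labels match the clusters
    mixed = supcon_loss(z, np.array([0, 1, 0, 1]))    # labels cut across clusters
    print(aligned, mixed)
```

In this sketch the loss is small when same-class embeddings sit close together and large when they are spread across clusters, which is the `contrast' the abstract refers to. A selection strategy of the kind proposed above would act on `pos_mask` and the implicit negative set by keeping only a subset of their entries.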