Recently, unsupervised adversarial training (AT) has been extensively studied as a way to obtain robust models trained on unlabeled data. To this end, previous studies have applied existing supervised adversarial training techniques to self-supervised learning (SSL) frameworks. However, all of them have resorted to untargeted adversarial learning, because it is unclear how to obtain targeted adversarial examples in the SSL setting, which lacks label information. In this paper, we propose a novel targeted adversarial training method for SSL frameworks. Specifically, we propose a target selection algorithm for adversarial SSL frameworks: it selects the most confusing sample for each given instance based on similarity and entropy, and perturbs the given instance toward the selected target sample. Our method significantly enhances the robustness of an SSL model without requiring large batches of images or additional models, unlike existing works aimed at the same goal. Moreover, our method is readily applicable to general SSL frameworks that use only positive pairs. We validate our method on benchmark datasets, on which it obtains superior robust accuracy, outperforming existing unsupervised adversarial training methods.
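To make the two steps concrete (select a confusing target, then perturb toward it), the following is a minimal NumPy sketch under stated assumptions, not the method's actual implementation. The scoring rule (cosine similarity plus a weighted entropy term), the names `select_targets` and `targeted_perturbation`, the parameters `temperature` and `entropy_weight`, and the linear encoder `W` are all hypothetical stand-ins chosen for illustration; the abstract only states that selection is based on similarity and entropy.

```python
import numpy as np


def select_targets(embeddings, temperature=0.5, entropy_weight=1.0):
    """Pick one target index per instance (hypothetical scoring rule).

    Assumes at least two instances with non-zero embeddings.
    """
    # Cosine similarity between all pairs of instances in the batch.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T

    # Softmax over each instance's similarities to the *other* instances.
    logits = sim / temperature
    np.fill_diagonal(logits, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p = p / p.sum(axis=1, keepdims=True)

    # Entropy of each instance's similarity distribution (assumption:
    # a high-entropy candidate is treated as more "confusing").
    entropy = -(p * np.log(np.clip(p, 1e-12, None))).sum(axis=1)

    # Score candidate j for instance i; an instance never selects itself.
    score = sim + entropy_weight * entropy[None, :]
    np.fill_diagonal(score, -np.inf)
    return score.argmax(axis=1)


def targeted_perturbation(x, x_target, W, eps=0.5, alpha=0.1, steps=5):
    """Signed-gradient steps that pull the embedding of x toward x_target.

    W is a toy linear encoder (assumption); the perturbation is kept
    inside an L-infinity ball of radius eps around x.
    """
    z_target = W @ x_target
    delta = np.zeros_like(x)
    for _ in range(steps):
        # Gradient of 0.5 * ||W (x + delta) - z_target||^2 w.r.t. delta.
        grad = W.T @ (W @ (x + delta) - z_target)
        delta = np.clip(delta - alpha * np.sign(grad), -eps, eps)
    return x + delta
```

In this sketch a batch of embeddings is passed to `select_targets`, and each instance is then perturbed toward the input at its returned index with `targeted_perturbation`. In a real SSL pipeline the linear map would be replaced by the encoder network and its gradient.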