Deep networks are well known to be vulnerable to adversarial attacks, and adversarial training is one of the most popular methods for training a robust model. To take advantage of unlabeled data, recent works have applied adversarial training to contrastive learning (Adversarial Contrastive Learning, or ACL for short) and obtained promising robust performance. However, the theory of ACL is not well understood. To fill this gap, we leverage Rademacher complexity to analyze the generalization performance of ACL, with a particular focus on linear models and multi-layer neural networks under $\ell_p$ attacks ($p \ge 1$). Our theory shows that the average adversarial risk of the downstream tasks can be upper bounded by the adversarial unsupervised risk of the upstream task. Experimental results validate our theory.
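Schematically, a bound of this type relates the two risks as follows; the notation here is illustrative rather than the paper's exact statement: $\widetilde{R}^{\mathrm{adv}}_{\mathrm{un}}$ denotes the adversarial unsupervised (upstream) risk, $R^{\mathrm{adv}}_{\mathcal{T}_t}$ the adversarial risk on downstream task $\mathcal{T}_t$, $\mathfrak{R}_S(\widetilde{\mathcal{F}})$ the empirical Rademacher complexity of the adversarially perturbed function class, and $n$ the number of unlabeled samples.
\[
\frac{1}{T}\sum_{t=1}^{T} R^{\mathrm{adv}}_{\mathcal{T}_t}(f)
\;\le\;
\widetilde{R}^{\mathrm{adv}}_{\mathrm{un}}(f)
\;+\;
O\!\left(\mathfrak{R}_S\big(\widetilde{\mathcal{F}}\big)\right)
\;+\;
O\!\left(\sqrt{\frac{\log(1/\nu)}{n}}\right)
\quad \text{with probability at least } 1-\nu.
\]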