Adversarial Robustness Distillation (ARD) is a novel method for boosting the robustness of small models. Unlike general adversarial training, its transfer of robust knowledge is less easily restricted by model capacity. However, the teacher model that provides the robust knowledge does not always make correct predictions, which interferes with the student's robust performance. Moreover, in previous ARD methods the robustness comes entirely from one-to-one imitation, ignoring the relationships between examples. To address this, we propose a novel structured ARD method called Contrastive Relationship DeNoise Distillation (CRDND). We design an adaptive compensation module to model the instability of the teacher, and we utilize contrastive relationships to explore the implicit robustness knowledge shared among multiple examples. Experimental results on multiple attack benchmarks show that CRDND transfers robust knowledge efficiently and achieves state-of-the-art performance.
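The abstract does not give CRDND's exact losses, but the two ideas it names (down-weighting an unreliable teacher, and transferring relational rather than purely instance-wise knowledge) can be illustrated with a generic sketch. The function below is a hypothetical stand-in, not the paper's method: the confidence weighting is a simple proxy for the adaptive compensation module, and the pairwise-similarity matching is one common way to encode relationships between examples.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def pairwise_cosine(x):
    # (batch, dim) -> (batch, batch) cosine-similarity matrix
    n = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)
    return n @ n.T

def relational_distillation_loss(student_logits, teacher_logits,
                                 student_emb, teacher_emb, alpha=0.5):
    """Illustrative sketch: instance-level imitation plus a relation term.

    - The imitation term is a per-example KL divergence, down-weighted
      when the teacher is unsure (a crude proxy for compensating an
      unstable teacher; the paper's actual module may differ).
    - The relation term matches the pairwise similarity structure of
      student and teacher embeddings, so knowledge flows between
      examples rather than only one-to-one.
    """
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    # teacher confidence per example, normalized to mean 1
    w = p_t.max(axis=1)
    w = w / w.mean()
    # weighted one-to-one imitation term: KL(teacher || student)
    per_ex_kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=1)
    imitation = np.mean(w * per_ex_kl)
    # relation term: align pairwise similarity matrices
    relation = np.mean((pairwise_cosine(student_emb)
                        - pairwise_cosine(teacher_emb)) ** 2)
    return (1 - alpha) * imitation + alpha * relation
```

When the student exactly matches the teacher, both terms vanish and the loss is zero; any mismatch in either predictions or pairwise structure increases it.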