Adversarial robustness has become a central goal in deep learning, in both theory and practice. However, successful methods for improving adversarial robustness (such as adversarial training) greatly hurt generalization performance on unperturbed data. This could have a major impact on how adversarial robustness affects real-world systems (i.e., many may opt to forgo robustness if doing so improves accuracy on unperturbed data). We propose Interpolated Adversarial Training, which employs recently proposed interpolation-based training methods in the framework of adversarial training. On CIFAR-10, adversarial training increases the standard test error (when there is no adversary) from 4.43% to 12.32%, whereas with our Interpolated Adversarial Training we retain adversarial robustness while achieving a standard test error of only 6.45%. With our technique, the relative increase in the standard error for the robust model is reduced from 178.1% to just 45.5%. Moreover, we provide a mathematical analysis of Interpolated Adversarial Training to confirm its efficiencies and demonstrate its advantages in terms of robustness and generalization.
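The relative-increase figures quoted above follow from the reported test errors. The short sketch below recomputes them; the function and variable names are illustrative and not taken from the paper, and the inputs are the rounded percentages given in the abstract.

```python
def relative_increase(baseline_error, new_error):
    """Return the relative increase of new_error over baseline_error, in percent."""
    return 100.0 * (new_error - baseline_error) / baseline_error


# Standard (no adversary) test errors on CIFAR-10, in percent, as reported above.
standard_error = 4.43               # baseline model
adversarial_training_error = 12.32  # adversarial training
interpolated_at_error = 6.45        # Interpolated Adversarial Training

adv_increase = relative_increase(standard_error, adversarial_training_error)
iat_increase = relative_increase(standard_error, interpolated_at_error)

print(f"Adversarial training: +{adv_increase:.1f}% relative standard error")
print(f"Interpolated Adversarial Training: +{iat_increase:.1f}% relative standard error")
```

With these rounded inputs the first value reproduces the reported 178.1%, while the second comes out at about 45.6% rather than 45.5%. That small gap is most likely due to the abstract's figures being computed from unrounded errors, though this is an assumption.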