Recently, Wong et al. showed that adversarial training with single-step FGSM leads to a characteristic failure mode named catastrophic overfitting (CO), in which a model becomes suddenly vulnerable to multi-step attacks. They showed that adding a random perturbation prior to FGSM (RS-FGSM) seemed to be sufficient to prevent CO. However, Andriushchenko and Flammarion observed that RS-FGSM still leads to CO for larger perturbations, and proposed an expensive regularizer (GradAlign) to avoid CO. In this work, we methodically revisit the role of noise and clipping in single-step adversarial training. Contrary to previous intuitions, we find that using a stronger noise around the clean sample combined with not clipping is highly effective in avoiding CO for large perturbation radii. Based on these observations, we then propose Noise-FGSM (N-FGSM) that, while providing the benefits of single-step adversarial training, does not suffer from CO. Empirical analyses on a large suite of experiments show that N-FGSM is able to match or surpass the performance of previous single-step methods while achieving a 3$\times$ speed-up. Code can be found at https://github.com/pdejorge/N-FGSM.
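The single-step update described above (stronger noise around the clean sample, no projection of the perturbation back onto the $\epsilon$-ball) can be sketched as follows. This is a minimal illustrative sketch assuming a PyTorch setup, not the authors' reference implementation; the names `model`, `epsilon`, `noise_mag`, and the clamping to the valid pixel range are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def n_fgsm_example(model, x, y, epsilon, noise_mag):
    """Sketch of an N-FGSM-style single-step adversarial example.

    Compared with RS-FGSM, the initial noise is drawn from a larger
    interval (noise_mag >= epsilon, e.g. 2 * epsilon here as an
    illustrative choice), and the resulting perturbation is NOT clipped
    back to the epsilon-ball around the clean sample x.
    """
    # 1) Strong uniform noise around the clean sample.
    eta = torch.empty_like(x).uniform_(-noise_mag, noise_mag)
    x_noisy = (x + eta).requires_grad_(True)

    # 2) Single FGSM step taken from the noisy point.
    loss = F.cross_entropy(model(x_noisy), y)
    grad = torch.autograd.grad(loss, x_noisy)[0]
    x_adv = x_noisy + epsilon * grad.sign()

    # 3) No projection onto the epsilon-ball; only keep the image in a
    #    valid pixel range (assumed here to be [0, 1]).
    return x_adv.clamp(0.0, 1.0).detach()
```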