In recent years, research on making deep neural networks more robust to adversarial examples has grown rapidly, and adversarial training has emerged as one of the most successful approaches. To balance robustness against adversarial examples with accuracy on clean examples, many works have proposed enhanced adversarial training methods that achieve various trade-offs between the two. Building on studies showing that smoothed weight updates during training may help find flat minima and improve generalization, we propose to reconcile the robustness-accuracy trade-off from another perspective: adding random noise to the deterministic weights. The randomized weights allow us to design a novel adversarial training method via a Taylor expansion in a small Gaussian noise, and we show that the new method flattens the loss landscape and finds flat minima. Extensive experiments with PGD, CW, and AutoAttack demonstrate that our method enhances state-of-the-art adversarial training methods, boosting both robustness and clean accuracy. The code is available at https://github.com/Alexkael/Randomized-Adversarial-Training.
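The link between small Gaussian weight noise, Taylor expansion, and flat minima can be illustrated with a toy sketch. This is not the authors' training method: the single-example logistic loss, the function names (`loss`, `hessian_trace`, `taylor_expected_loss`, `monte_carlo_expected_loss`), and all numeric values are assumptions chosen for illustration. For noise u ~ N(0, σ²I), a second-order expansion gives E[L(w + u)] ≈ L(w) + (σ²/2) tr(H), where H is the Hessian of L at w. The sketch compares this approximation with a Monte Carlo estimate.

```python
import math
import random


def loss(w, x, y):
    """Logistic loss for one example, with label y in {-1, +1}."""
    z = y * sum(wi * xi for wi, xi in zip(w, x))
    return math.log1p(math.exp(-z))


def hessian_trace(w, x, y):
    """Trace of the Hessian of the logistic loss with respect to w.

    The Hessian is s * (1 - s) * x x^T with s = sigmoid(-z),
    so its trace is s * (1 - s) * ||x||^2.
    """
    z = y * sum(wi * xi for wi, xi in zip(w, x))
    s = 1.0 / (1.0 + math.exp(z))
    return s * (1.0 - s) * sum(xi * xi for xi in x)


def taylor_expected_loss(w, x, y, sigma):
    """Second-order Taylor approximation of E[L(w + u)], u ~ N(0, sigma^2 I).

    The first-order term vanishes because the noise has zero mean.
    """
    return loss(w, x, y) + 0.5 * sigma ** 2 * hessian_trace(w, x, y)


def monte_carlo_expected_loss(w, x, y, sigma, n_samples=20000, seed=0):
    """Estimate E[L(w + u)] by sampling Gaussian noise on the weights."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        noisy_w = [wi + rng.gauss(0.0, sigma) for wi in w]
        total += loss(noisy_w, x, y)
    return total / n_samples


if __name__ == "__main__":
    # Hypothetical weights, input, label, and noise scale.
    w, x, y, sigma = [0.5, -0.3], [1.0, 2.0], 1, 0.1
    print("deterministic loss:", loss(w, x, y))
    print("Taylor estimate:   ", taylor_expected_loss(w, x, y, sigma))
    print("Monte Carlo:       ", monte_carlo_expected_loss(w, x, y, sigma))
```

In this toy setting the extra term (σ²/2) tr(H) grows with the curvature of the loss, so minimizing the expected loss under weight noise favors low-curvature (flatter) regions. This is consistent with the abstract's claim about flat minima, though the paper's actual objective may differ.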