Neural networks are vulnerable to adversarial attacks: adding well-crafted, imperceptible perturbations to their input can change their output. Adversarial training is one of the most effective approaches to training models that are robust against such attacks. However, it is much slower than vanilla training of neural networks because it must construct adversarial examples for the entire training data at every iteration, which has hampered its effectiveness. Fast Adversarial Training was recently proposed as a way to obtain robust models efficiently. However, the reasons behind its success are not fully understood and, more importantly, it can only train robust models for $\ell_\infty$-bounded attacks because it uses FGSM during training. In this paper, by leveraging the theory of coreset selection, we show how selecting a small subset of the training data provides a more principled approach to reducing the time complexity of robust training. Unlike existing methods, our approach can be adapted to a wide variety of training objectives, including TRADES, $\ell_p$-PGD, and Perceptual Adversarial Training. Our experimental results indicate that our approach speeds up adversarial training by a factor of 2 to 3, at the cost of a small reduction in clean and robust accuracy.
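To make the idea of selecting a small, weighted subset of the training data concrete, the following is a minimal sketch of greedy facility-location selection. It is not the authors' algorithm: the abstract does not specify the selection procedure, and the function name `greedy_coreset`, the `features` array (assumed here to hold one vector per training example, for instance a gradient proxy), and the distance-based similarity are all illustrative assumptions.

```python
import numpy as np


def greedy_coreset(features, k):
    """Pick k rows of `features` by greedy facility location (illustrative only).

    Returns the selected indices and, for each one, the number of points
    whose most similar selected element it is (usable as a per-sample weight).
    """
    n = features.shape[0]
    # Pairwise Euclidean distances, turned into non-negative similarities.
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    sim = dists.max() - dists

    best = np.zeros(n)                 # best similarity of each point to the chosen set
    chosen = np.zeros(n, dtype=bool)   # mask of already selected candidates
    selected = []
    for _ in range(k):
        # Marginal gain of candidate j: sum_i max(sim[i, j] - best[i], 0).
        gains = np.maximum(sim - best[:, None], 0.0).sum(axis=0)
        gains[chosen] = -np.inf
        j = int(np.argmax(gains))
        selected.append(j)
        chosen[j] = True
        best = np.maximum(best, sim[:, j])

    # Assign every point to its most similar selected element and count.
    assign = np.argmax(sim[:, selected], axis=1)
    weights = np.bincount(assign, minlength=k)
    return np.array(selected), weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-in for per-example features of 200 training points.
    feats = rng.normal(size=(200, 8))
    idx, w = greedy_coreset(feats, k=20)
    print(idx.shape, int(w.sum()))
```

In a training loop, adversarial examples would then be constructed only for the selected indices, with each sample's loss scaled by its weight. How often to re-select the subset is a design choice this sketch leaves open.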