Deep neural networks continue to awe the world with their remarkable performance. Their predictions, however, can be corrupted by adversarial examples that are imperceptible to humans. Current efforts to improve the robustness of neural networks against adversarial examples focus on developing robust training methods, which update the weights of a neural network in a more robust direction. In this work, we take a step beyond training of the weight parameters and consider the problem of designing an adversarially robust neural architecture with high intrinsic robustness. We propose AdvRush, a novel adversarial-robustness-aware neural architecture search algorithm, based on the finding that, independent of the training method, the intrinsic robustness of a neural network can be characterized by the smoothness of its input loss landscape. Through a regularizer that favors candidate architectures with smoother input loss landscapes, AdvRush successfully discovers an adversarially robust neural architecture. Along with a comprehensive theoretical motivation for AdvRush, we conduct extensive experiments to demonstrate its efficacy on various benchmark datasets. Notably, on CIFAR-10, AdvRush achieves 55.91% robust accuracy under FGSM attack after standard training and 50.04% robust accuracy under AutoAttack after 7-step PGD adversarial training.
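To give intuition for the smoothness notion the abstract refers to, the sketch below estimates how sharply a loss surface rises around an input using small random perturbations; a flatter (smoother) landscape yields a smaller value. This is only an illustrative finite-difference proxy on a toy quadratic loss, not AdvRush's actual regularizer; the function names and the toy loss are invented for this example.

```python
import numpy as np

def smoothness_penalty(loss_fn, x, eps=0.1, n_samples=16, seed=0):
    """Finite-difference proxy for input-loss-landscape smoothness.

    Averages the absolute change in loss under random perturbations of
    norm `eps` around the (flattened) input x. A smoother landscape
    around x gives a smaller penalty. Illustrative only.
    """
    rng = np.random.default_rng(seed)
    base = loss_fn(x)
    deltas = rng.normal(size=(n_samples, x.size))
    # Rescale each perturbation to have norm exactly eps.
    deltas = eps * deltas / np.linalg.norm(deltas, axis=1, keepdims=True)
    return float(np.mean([abs(loss_fn(x + d) - base) for d in deltas]))

# Toy loss surfaces standing in for a network's loss as a function of its input:
flat_bowl  = lambda v: float(np.sum(v ** 2))        # gentle curvature
sharp_bowl = lambda v: float(10.0 * np.sum(v ** 2)) # 10x sharper curvature

x0 = np.zeros(4)
print(smoothness_penalty(flat_bowl, x0))   # small penalty (smooth landscape)
print(smoothness_penalty(sharp_bowl, x0))  # 10x larger penalty (sharp landscape)
```

In a search setting, a regularizer of this flavor would be added to the architecture-selection objective so that candidates with flatter input loss landscapes are preferred.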