It is commonly believed that networks cannot be both accurate and robust, and that gaining robustness means losing accuracy. It is also generally believed that, short of making networks larger, architectural elements otherwise matter little for improving adversarial robustness. Here we present evidence to challenge these common beliefs through a careful study of adversarial training. Our key observation is that the widely used ReLU activation function significantly weakens adversarial training due to its non-smooth nature. We therefore propose smooth adversarial training (SAT), which replaces ReLU with its smooth approximations to strengthen adversarial training. Smooth activation functions allow SAT to find harder adversarial examples and to compute better gradient updates during adversarial training. Compared to standard adversarial training, SAT improves adversarial robustness for "free": no drop in accuracy and no increase in computational cost. For example, without introducing additional computation, SAT significantly enhances ResNet-50's robustness from 33.0% to 42.3%, while also improving its accuracy by 0.9% on ImageNet. SAT also works well with larger networks: it helps EfficientNet-L1 achieve 82.2% accuracy and 58.6% robustness on ImageNet, outperforming the previous state-of-the-art defense by 9.5% in accuracy and 11.6% in robustness. Models are available at https://github.com/cihangxie/SmoothAdversarialTraining.
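To make the recipe concrete, below is a minimal PyTorch sketch of the core idea: swap every ReLU in a standard network for a smooth approximation, then run ordinary PGD adversarial training on top. This is an illustration, not the authors' released implementation; the choice of nn.SiLU as the smooth activation, the ResNet-50 backbone, and the PGD hyperparameters (eps, alpha, steps) are assumptions picked for demonstration.

```python
# Sketch of smooth adversarial training (SAT); hyperparameters are illustrative.
import torch
import torch.nn as nn
import torchvision

def replace_relu_with_silu(module: nn.Module) -> None:
    """Recursively swap every nn.ReLU for nn.SiLU, one smooth ReLU approximation."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.SiLU())
        else:
            replace_relu_with_silu(child)

def pgd_attack(model, x, y, eps=4/255, alpha=1/255, steps=3):
    """Standard L-inf PGD; smooth activations yield better gradients here."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            # Ascend the loss, then project back into the eps-ball around x.
            x_adv = (x_adv + alpha * grad.sign()).clamp(x - eps, x + eps).clamp(0, 1)
    return x_adv.detach()

model = torchvision.models.resnet50()
replace_relu_with_silu(model)  # the only architectural change in this sketch

# One adversarial-training step: generate adversarial examples, train on them.
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
x = torch.rand(2, 3, 224, 224)       # toy batch standing in for ImageNet data
y = torch.randint(0, 1000, (2,))
x_adv = pgd_attack(model, x, y)
loss = nn.functional.cross_entropy(model(x_adv), y)
opt.zero_grad()
loss.backward()
opt.step()
```

Note that the activation swap adds no parameters and essentially no compute, which is consistent with the abstract's claim that SAT's robustness gain comes for "free".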