Adversarial training methods are the state-of-the-art (SOTA) empirical defenses against adversarial examples. Many regularization methods have proven effective when combined with adversarial training. However, these regularization methods operate in the time domain. Because adversarial vulnerability can be regarded as a high-frequency phenomenon, it is essential to regularize adversarially trained neural networks in the frequency domain. To address this gap, we theoretically analyze the regularization property of wavelets, which can enhance adversarial training. We propose a wavelet regularization method based on the Haar wavelet decomposition, named Wavelet Average Pooling. Integrating this module into a wide residual network yields a new model, WideWaveletResNet. On CIFAR-10 and CIFAR-100, our proposed Adversarial Wavelet Training method achieves considerable robustness under different types of attacks. These results support the hypothesis that our wavelet regularization method enhances adversarial robustness, especially in deep, wide neural networks. Visualization experiments on the Frequency Principle (F-Principle) and on interpretability demonstrate the effectiveness of our method. We also present a detailed comparison across different wavelet basis functions. The code is available at \url{https://github.com/momo1986/AdversarialWaveletTraining}.
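To make the link between Haar decomposition and average pooling concrete, the following is a minimal NumPy sketch, not the authors' implementation. It relies on a standard fact: the low-low (LL) subband of a one-level orthonormal 2D Haar transform is the sum of each 2x2 block divided by 2, which is twice the 2x2 block mean. The function names `haar_dwt2`, `wavelet_average_pool`, and `avg_pool_2x2` are hypothetical, and the assumption that Wavelet Average Pooling keeps only a rescaled LL subband is ours; the linked repository is the reference for the actual module.

```python
import numpy as np


def haar_dwt2(x):
    """One-level orthonormal 2D Haar transform of an (H, W) array, H and W even.

    Returns the four subbands (LL, LH, HL, HH), each of shape (H/2, W/2).
    """
    a = x[0::2, 0::2]  # top-left sample of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def wavelet_average_pool(x):
    """Hypothetical pooling: keep only the LL subband, drop the detail subbands.

    Dividing LL by 2 rescales it to the plain 2x2 block mean (an assumption
    made here for illustration, not taken from the paper).
    """
    ll, _, _, _ = haar_dwt2(x)
    return ll / 2.0


def avg_pool_2x2(x):
    """Reference 2x2 average pooling with stride 2."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


x = np.arange(16, dtype=np.float64).reshape(4, 4)
pooled = wavelet_average_pool(x)
subbands = haar_dwt2(x)
```

In this sketch, discarding the three detail subbands removes the high-frequency content of each 2x2 block, which is the sense in which such a pooling layer acts as a frequency-domain regularizer. Other wavelet bases would use different filters and would not reduce to a plain block mean.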