Adversarial attacks remain a significant challenge for neural networks. Recent work has shown that adversarial perturbations typically contain high-frequency features, but the root cause of this phenomenon remains unknown. Inspired by theoretical work on linear full-width convolutional models, we hypothesize that the local (i.e., bounded-width) convolutional operations commonly used in current neural networks are implicitly biased to learn high-frequency features, and that this is one of the root causes of high-frequency adversarial examples. To test this hypothesis, we analyze the impact of different choices of linear and nonlinear architectures on the implicit bias of the learned features and the adversarial perturbations, in both the spatial and frequency domains. We find that high-frequency adversarial perturbations depend critically on the convolution operation, because the spatially limited nature of local convolutions induces an implicit bias towards high-frequency features. The explanation for this bias involves the Fourier Uncertainty Principle: a spatially limited filter (local in the space domain) cannot also be frequency-limited (local in the frequency domain). Furthermore, using larger convolution kernel sizes or avoiding convolutions altogether (e.g., by using Vision Transformer architectures) significantly reduces this high-frequency bias, but not the overall susceptibility to attacks. Looking forward, our work strongly suggests that understanding and controlling the implicit bias of architectures will be essential for achieving adversarial robustness.
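The uncertainty-principle argument can be illustrated numerically. The following is a minimal sketch, not the experimental setup of this work: it assumes 1D Gaussian-shaped kernels truncated to a given width (with the hypothetical choice sigma = width / 6), and the helper names `gaussian_kernel` and `high_freq_energy_fraction`, the 256-sample grid, and the 0.25 cycles-per-sample cutoff are all illustrative assumptions. It measures how much of each kernel's spectral energy lies above the cutoff.

```python
import numpy as np


def gaussian_kernel(width: int) -> np.ndarray:
    """Odd-width 1D Gaussian truncated to `width` taps.

    Assumption: sigma = width / 6 is an illustrative choice, not taken
    from the paper.
    """
    x = np.arange(width) - width // 2
    taps = np.exp(-0.5 * (x / (width / 6.0)) ** 2)
    return taps / taps.sum()


def high_freq_energy_fraction(kernel: np.ndarray, n: int = 256,
                              cutoff: float = 0.25) -> float:
    """Fraction of spectral energy above `cutoff` (in cycles per sample)."""
    padded = np.zeros(n)
    padded[:kernel.size] = kernel  # zero-pad the kernel to length n
    power = np.abs(np.fft.fft(padded)) ** 2
    freqs = np.abs(np.fft.fftfreq(n))  # magnitudes in [0, 0.5]
    return float(power[freqs > cutoff].sum() / power.sum())


# Compare a narrow (spatially local) kernel with progressively wider ones.
fractions = {w: high_freq_energy_fraction(gaussian_kernel(w))
             for w in (3, 7, 15, 31)}
for width, frac in fractions.items():
    print(f"kernel width {width:2d}: high-frequency energy fraction = {frac:.4f}")
```

Under these assumptions, the 3-tap kernel keeps a substantial share of its energy above the cutoff, while the 31-tap kernel keeps almost none. This matches the claim that a filter confined to a small spatial support cannot also be confined to a narrow frequency band.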