Adversarial robustness is an open challenge in deep learning, most often tackled using adversarial training. Adversarial training is computationally costly, involving alternating optimization and a trade-off between standard generalization and adversarial robustness. We explore training robust models without adversarial training by revisiting a known result linking maximally robust classifiers to minimum-norm solutions, and combining it with recent results on the implicit bias of optimizers. First, we show that, under certain conditions, it is possible to achieve both perfect standard accuracy and a certain degree of robustness, without a trade-off, simply by training an overparameterized model using the implicit bias of the optimization. In that regime, there is a direct relationship between the choice of optimizer and the attack to which the model is robust. Second, we investigate the role of the architecture in designing robust models. In particular, we characterize the robustness of linear convolutional models, showing that they resist attacks subject to a constraint on the Fourier-$\ell_\infty$ norm. This result explains why $\ell_p$-bounded adversarial perturbations tend to be concentrated in the Fourier domain. It also leads us to a novel attack in the Fourier domain, inspired by the well-known frequency-dependent sensitivity of human perception. We evaluate the Fourier-$\ell_\infty$ robustness of recent robustly trained CIFAR-10 models and visualize their adversarial perturbations.
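As a minimal sketch of the quantity the abstract refers to, the Fourier-$\ell_\infty$ norm of a perturbation can be computed as the largest magnitude among its 2D DFT coefficients. The function name and the NumPy implementation below are illustrative assumptions, not the paper's code:

```python
import numpy as np

def fourier_linf_norm(delta):
    # Fourier-l_inf norm of a perturbation: take the 2D DFT over the
    # spatial axes and return the largest coefficient magnitude across
    # all frequencies and channels.
    return np.abs(np.fft.fft2(delta, axes=(-2, -1))).max()

# Example: a random CIFAR-10-shaped perturbation (3 channels, 32x32).
rng = np.random.default_rng(0)
delta = rng.normal(scale=1e-3, size=(3, 32, 32))
print(fourier_linf_norm(delta))
```

An attack bounded in this norm constrains the energy a perturbation may place at any single frequency, which is why $\ell_p$-bounded perturbations concentrated in the Fourier domain have large Fourier-$\ell_\infty$ norm despite a small spatial budget.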