It is by now well-known that small adversarial perturbations can induce classification errors in deep neural networks. In this paper, we take a bottom-up signal processing perspective on this problem and show that systematic exploitation of sparsity in natural data is a promising tool for defense. For linear classifiers, we show that a sparsifying front end is provably effective against $\ell_{\infty}$-bounded attacks, reducing output distortion due to the attack by a factor of roughly $K/N$, where $N$ is the data dimension and $K$ is the sparsity level. We then extend this concept to deep networks, showing that a "locally linear" model can be used to develop a theoretical foundation for crafting attacks and defenses. We also devise attacks based on the locally linear model that outperform the well-known FGSM attack. We supplement our theoretical results with experiments on the MNIST and CIFAR-10 datasets, showing the efficacy of the proposed sparsity-based defense schemes.
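The following is a minimal NumPy sketch (not the paper's implementation) of the linear-classifier setting described above: for weights $w$, the worst-case $\ell_\infty$-bounded perturbation shifts the output by $\epsilon\|w\|_1$, while a front end that retains only the $K$ dominant coefficients limits the shift to roughly $\epsilon\|w_S\|_1 \approx (K/N)\,\epsilon\|w\|_1$, where $S$ is the retained support. For simplicity the signal is assumed sparse in the canonical basis rather than a wavelet basis, and all names, dimensions, and parameters are illustrative.

```python
# Illustrative sketch: output distortion of an l_inf-bounded attack on a
# linear classifier, with and without a top-K sparsifying front end.
import numpy as np

rng = np.random.default_rng(0)

N, K, eps = 1024, 64, 0.1           # data dimension, sparsity level, attack budget
w = rng.standard_normal(N)          # linear classifier: output = w @ x

# K-sparse clean signal (sparse in the canonical basis for this sketch)
x = np.zeros(N)
support = rng.choice(N, K, replace=False)
x[support] = rng.standard_normal(K)

# Worst-case l_inf perturbation against the undefended classifier:
# e = eps * sign(w), which produces output distortion eps * ||w||_1.
e = eps * np.sign(w)

def sparsify(v, k):
    """Front end: keep the k largest-magnitude coefficients, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

# Output distortion from the same perturbation, without and with the front end
no_defense = abs(w @ (x + e) - w @ x)                 # = eps * ||w||_1
with_defense = abs(w @ sparsify(x + e, K) - w @ x)    # ~ eps * ||w_S||_1

print(f"distortion, no front end  : {no_defense:.3f}")
print(f"distortion, with front end: {with_defense:.3f}")
print(f"ratio (roughly K/N = {K/N:.3f}): {with_defense / no_defense:.3f}")
```

Running the sketch shows the distortion ratio concentrating near $K/N$ for generic $w$, since the front end confines the perturbation's effect to the $K$ retained coordinates.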