Robustness of deep neural networks against adversarial perturbations is a pressing concern, motivated by recent findings showing the pervasive nature of such vulnerabilities. One way of characterizing the robustness of a neural network model is through its Lipschitz constant, which forms a robustness certificate. A natural question to ask is: for a fixed model class (such as neural networks) and a dataset of size $n$, what is the smallest achievable Lipschitz constant among all models that fit the dataset? Recently, Bubeck et al. (2020) conjectured that when using two-layer networks with $k$ neurons to fit a generic dataset, the smallest Lipschitz constant is $\Omega(\sqrt{\frac{n}{k}})$. This implies that one would require one neuron per data point to robustly fit the data. In this work we derive a lower bound on the Lipschitz constant for an arbitrary model class with bounded Rademacher complexity. Our result coincides with the bound conjectured by Bubeck et al. (2020) for two-layer networks under the assumption of bounded weights. Moreover, due to its generality, our result also yields bounds for multi-layer neural networks, showing that $\log n$ constant-sized layers are required to robustly fit the data. Thus, our work establishes a law of robustness for weight-bounded neural networks and provides formal evidence for the necessity of over-parametrization in deep learning.
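To make the conjectured scaling concrete, the following is a schematic restatement of the two-layer bound and its immediate consequence; the notation $\mathrm{Lip}(f)$ and the absolute constant $c$ are illustrative and not taken verbatim from the paper, and all distributional assumptions on the data are omitted:
$$
\mathrm{Lip}(f) \;\ge\; c\,\sqrt{\frac{n}{k}}
\quad\Longrightarrow\quad
\mathrm{Lip}(f) = O(1) \ \text{ is possible only if } \ k = \Omega(n),
$$
that is, an $O(1)$-Lipschitz fit of $n$ generic data points by a two-layer network with $k$ neurons requires on the order of one neuron per data point, which is exactly the over-parametrization statement made in the abstract above.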