Concentration of measure has been argued to be the fundamental cause of adversarial vulnerability. Mahloujifar et al. presented an empirical way to measure the concentration of a data distribution using samples, and employed it to find lower bounds on intrinsic robustness for several benchmark datasets. However, it remains unclear whether these lower bounds are tight enough to provide a useful approximation for the intrinsic robustness of a dataset. To gain a deeper understanding of the concentration of measure phenomenon, we first extend the Gaussian Isoperimetric Inequality to non-spherical Gaussian measures and arbitrary $\ell_p$-norms ($p \geq 2$). We leverage these theoretical insights to design a method that uses half-spaces to estimate the concentration of any empirical dataset under $\ell_p$-norm distance metrics. Our proposed algorithm is more efficient than Mahloujifar et al.'s, and our experiments on synthetic datasets and image benchmarks demonstrate that it is able to find much tighter intrinsic robustness bounds. These tighter estimates provide further evidence that rules out intrinsic dataset concentration as a possible explanation for the adversarial vulnerability of state-of-the-art classifiers.
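To make the half-space idea concrete, here is a minimal sketch (not the authors' algorithm). It fixes one direction `w`, picks the offset so the half-space $\{x : w^\top x \le b\}$ holds at least an $\alpha$ fraction of the samples, and reports the empirical mass of its $\epsilon$-expansion. It relies on the fact that the $\ell_p$ distance from a point to a hyperplane is $|w^\top x - b| / \|w\|_q$ with $1/p + 1/q = 1$, so the expansion is again a half-space with offset $b + \epsilon \|w\|_q$. The function names, the fixed direction, and the sample sizes are illustrative assumptions. A method along the lines described above would also have to search over directions. For a standard Gaussian under $\ell_2$, the Gaussian Isoperimetric Inequality gives the closed form $\Phi(\Phi^{-1}(\alpha) + \epsilon)$, which the sketch uses as a sanity check.

```python
import math
import random
from statistics import NormalDist


def dual_norm(w, p):
    """l_q norm of w, where 1/p + 1/q = 1 (p = inf gives q = 1, p = 1 gives q = inf)."""
    if p == float("inf"):
        return sum(abs(v) for v in w)
    if p == 1:
        return max(abs(v) for v in w)
    q = p / (p - 1.0)
    return sum(abs(v) ** q for v in w) ** (1.0 / q)


def halfspace_expansion_measure(samples, w, alpha, eps, p=2.0):
    """Empirical mass of the eps-expansion (in l_p) of the half-space {x : w.x <= b}.

    b is the smallest offset for which the half-space contains at least an
    alpha fraction of the samples. The direction w is fixed here (assumption);
    it is not optimized.
    """
    scores = sorted(sum(wi * xi for wi, xi in zip(w, x)) for x in samples)
    n = len(scores)
    k = max(1, math.ceil(alpha * n))   # number of samples the half-space must hold
    b = scores[k - 1]
    threshold = b + eps * dual_norm(w, p)
    return sum(s <= threshold for s in scores) / n


def gaussian_halfspace_bound(alpha, eps):
    """Phi(Phi^{-1}(alpha) + eps): expansion mass of a half-space of mass alpha
    under a standard Gaussian and l_2 distance."""
    nd = NormalDist()
    return nd.cdf(nd.inv_cdf(alpha) + eps)


if __name__ == "__main__":
    random.seed(0)
    dim, n = 5, 20000
    data = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    w = [1.0] + [0.0] * (dim - 1)
    print("empirical :", halfspace_expansion_measure(data, w, alpha=0.05, eps=1.0))
    print("closed form:", gaussian_halfspace_bound(alpha=0.05, eps=1.0))
```

On standard Gaussian samples the two printed values should agree up to sampling error. On real data no closed form is available, and the empirical quantity is the one of interest.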