Randomized smoothing is a recently proposed defense against adversarial attacks that has achieved state-of-the-art provable robustness against $\ell_2$ perturbations. A number of publications have extended the guarantees to other metrics, such as $\ell_1$ or $\ell_\infty$, by using different smoothing measures. Although the current framework has been shown to yield near-optimal $\ell_p$ radii, the total safety region it certifies can be arbitrarily small compared to the optimal one. In this work, we propose a framework to improve the certified safety region for these smoothed classifiers without changing the underlying smoothing scheme. The theoretical contributions are as follows: 1) We generalize the certification for randomized smoothing by reformulating the certified radius calculation as a nested optimization problem over a class of functions. 2) We provide a method to calculate the certified safety region using zeroth-order and first-order information for Gaussian-smoothed classifiers, together with a framework that generalizes the calculation to higher-order information. 3) We design efficient, high-confidence estimators for the relevant statistics of the first-order information. Combining contributions 2) and 3) allows us to certify safety regions that are significantly larger than the ones provided by current methods. On the CIFAR10 and ImageNet datasets, the new regions certified by our approach achieve significant improvements in the general $\ell_1$ certified radii and in the $\ell_2$ certified radii for color-space attacks ($\ell_2$ restricted to one channel), while also achieving smaller improvements in the general $\ell_2$ certified radii. Our framework also provides a way to circumvent the current impossibility results on achieving larger certified radii without requiring data-dependent smoothing techniques.
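For context, the zeroth-order baseline that this work aims to improve on can be sketched as follows. This is not the method proposed here; it is a minimal illustration of the widely used $\ell_2$ certificate for Gaussian smoothing, $R = \sigma\,\Phi^{-1}(\underline{p_A})$, where $\underline{p_A}$ is a high-confidence lower bound on the probability that the base classifier returns the top class under noise. The function names, the default `alpha`, and the choice of a Hoeffding bound (rather than a tighter exact binomial bound) are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def hoeffding_lower_bound(successes: int, n: int, alpha: float) -> float:
    """One-sided (1 - alpha) lower confidence bound on a Bernoulli mean.

    Assumption: Hoeffding's inequality is used for simplicity; it is valid
    but looser than an exact binomial bound.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    p_hat = successes / n
    return max(0.0, p_hat - math.sqrt(math.log(1.0 / alpha) / (2.0 * n)))


def l2_certified_radius(successes: int, n: int, sigma: float,
                        alpha: float = 0.001) -> float:
    """Zeroth-order l2 radius for a Gaussian-smoothed classifier.

    successes: number of the n noisy samples classified as the top class.
    sigma:     standard deviation of the Gaussian smoothing noise.
    Returns 0.0 (no certificate) when the lower bound does not exceed 1/2.
    """
    p_lower = hoeffding_lower_bound(successes, n, alpha)
    if p_lower <= 0.5:
        return 0.0
    # R = sigma * Phi^{-1}(p_lower), with Phi the standard normal CDF.
    return sigma * NormalDist().inv_cdf(p_lower)


if __name__ == "__main__":
    # Hypothetical counts: 990 of 1000 noisy samples agree with the top class.
    print(l2_certified_radius(990, 1000, sigma=0.5))
```

This certificate depends only on the single number $\underline{p_A}$, so it always describes an $\ell_2$ ball; the abstract's point is that additional first-order information about the smoothed classifier can certify regions beyond that ball.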