Randomized smoothing is the dominant standard for provable defenses against adversarial examples. Nevertheless, this method has recently been proven to suffer from important information-theoretic limitations. In this paper, we argue that these limitations are not intrinsic, but merely a byproduct of current certification methods. We first show that these certificates use too little information about the classifier and are, in particular, blind to the local curvature of the decision boundary. This leads to severely sub-optimal robustness guarantees as the dimension of the problem increases. We then show that it is theoretically possible to bypass this issue by collecting more information about the classifier. More precisely, we show that the optimal certificate can be approximated with arbitrary precision by probing the decision boundary with several noise distributions. Since this process is executed at certification time rather than at test time, it entails no loss in natural accuracy while enhancing the quality of the certificates. This result fosters further research on classifier-specific certification and demonstrates that randomized smoothing is still worth investigating. Although classifier-specific certification may induce more computational cost, we also provide some theoretical insight on how to mitigate it.
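For context, the sketch below illustrates the widely used Gaussian randomized smoothing certificate, which returns the L2 radius sigma * Phi^{-1}(p_A). It depends on the classifier only through a single probability estimate p_A, which is the kind of limited, curvature-blind information the abstract refers to. The function name and example values are illustrative and not taken from the paper; this is not the classifier-specific certificate the paper proposes.

```python
from scipy.stats import norm


def standard_gaussian_certificate(p_a: float, sigma: float) -> float:
    """Certified L2 radius of standard Gaussian randomized smoothing.

    p_a:   lower confidence bound on the probability that the base
           classifier predicts the top class under noise N(x, sigma^2 I).
    sigma: standard deviation of the smoothing noise.

    Note that the bound uses the classifier only through the single
    scalar p_a: the local shape of the decision boundary around x
    never enters the computation.
    """
    if p_a <= 0.5:
        return 0.0  # no robustness can be certified in this case
    return sigma * norm.ppf(p_a)


# Example (illustrative values): with sigma = 0.5 and p_a = 0.99,
# the certified radius is 0.5 * Phi^{-1}(0.99) ~= 1.16, regardless
# of which classifier produced that probability.
print(standard_gaussian_certificate(0.99, 0.5))
```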