Current state-of-the-art defenses against adversarial examples typically focus on improving either empirical or certified robustness. Adversarially trained (AT) models achieve state-of-the-art empirical robustness against adversarial examples, but provide no robustness guarantees for large classifiers or higher-dimensional inputs. In contrast, existing randomized smoothing based classifiers achieve state-of-the-art certified robustness while significantly degrading empirical robustness against adversarial examples. In this paper, we propose a novel method, called \emph{Certification through Adaptation}, that transforms an AT model into a randomized smoothing classifier during inference to provide certified robustness for the $\ell_2$ norm without affecting its empirical robustness against adversarial attacks. We also propose an \emph{Auto-Noise} technique that efficiently approximates the appropriate noise level for flexibly certifying test examples with randomized smoothing. Our proposed \emph{Certification through Adaptation} combined with the \emph{Auto-Noise} technique achieves \textit{average certified radius (ACR) scores} of up to $1.102$ and $1.148$ on CIFAR-10 and ImageNet, respectively, using AT models without affecting their empirical robustness or benign accuracy. Our paper is therefore a step towards bridging the gap between empirical and certified robustness against adversarial examples by achieving both with the same classifier.
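For context, the average certified radius reported above builds on the standard Gaussian randomized smoothing certificate; as a hedged sketch (the notation $\sigma$ and $\underline{p_A}$ is not defined in this abstract and follows the usual randomized smoothing convention), the per-example certified $\ell_2$ radius takes the form

\[
R \;=\; \sigma \, \Phi^{-1}\!\left(\underline{p_A}\right),
\]

where $\sigma$ is the standard deviation of the Gaussian noise used for smoothing, $\Phi^{-1}$ is the inverse standard Gaussian CDF, and $\underline{p_A}$ is a lower confidence bound on the probability that the base classifier predicts the top class under noise; the ACR is then the average of $R$ over correctly certified test examples.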