Patch adversarial attacks on images, in which the attacker can distort pixels within a region of bounded size, are an important threat model because they provide a quantitative model for physical adversarial attacks. In this paper, we introduce a certifiable defense against patch attacks which guarantees that, for a given image and patch attack size, no patch adversarial examples exist. Our method is related to the broad class of randomized smoothing robustness schemes, which provide high-confidence probabilistic robustness certificates. By exploiting the fact that patch attacks are more constrained than general sparse attacks, we derive meaningfully large robustness certificates against them. Additionally, in contrast to smoothing-based defenses against L_p and sparse attacks, our defense against patch attacks is de-randomized, yielding improved, deterministic certificates. Compared to the existing patch certification method proposed by Chiang et al. (2020), which relies on interval bound propagation, our method can be trained significantly faster, achieves high clean and certified robust accuracy on CIFAR-10, and provides certificates at ImageNet scale. For example, for a 5-by-5 patch attack on CIFAR-10, our method achieves up to around 57.6% certified accuracy (with a classifier with around 83.8% clean accuracy), compared to at most 30.3% certified accuracy for the existing method (with a classifier with around 47.8% clean accuracy). Our results establish a new state of the art for certifiable defense against patch attacks on CIFAR-10 and ImageNet. Code is available at https://github.com/alevine0/patchSmoothing.
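To make the idea of a deterministic, vote-based patch certificate concrete, the following is a minimal sketch and not the released implementation. It assumes a column-band scheme that the abstract does not spell out: a base classifier sees only one band of `band_size` adjacent columns at a time (wrapping around the image edge), every band position casts one vote, and an m-by-m patch can overlap at most m + band_size - 1 band positions. The names `band_votes`, `certify`, and `classify_band` are hypothetical, and the margin rule used here is a simplified sufficient condition chosen for illustration.

```python
import numpy as np


def band_votes(image, classify_band, band_size, num_classes):
    """Count per-class votes over every column-band position.

    image:         array of shape (height, width, channels)
    classify_band: hypothetical callable (image, column_mask) -> class index;
                   it should look only at the columns where column_mask is True
    """
    width = image.shape[1]
    votes = np.zeros(num_classes, dtype=int)
    for start in range(width):
        # Band of `band_size` columns starting at `start`, wrapping around.
        cols = (start + np.arange(band_size)) % width
        mask = np.zeros(width, dtype=bool)
        mask[cols] = True
        votes[classify_band(image, mask)] += 1
    return votes


def certify(votes, patch_size, band_size):
    """Return (predicted_class, certified) for a square patch of side patch_size.

    Assumption: a patch can change the vote of at most
    patch_size + band_size - 1 band positions (capped at the number of
    positions). Each changed vote can shrink the gap between the top class
    and any other class by at most 2, so a gap strictly larger than twice
    that count cannot be closed.
    """
    affected = min(patch_size + band_size - 1, int(votes.sum()))
    order = np.argsort(-votes, kind="stable")  # classes by descending votes
    top, runner_up = order[0], order[1]
    margin = votes[top] - votes[runner_up]
    return int(top), bool(margin > 2 * affected)
```

Because every band position is enumerated rather than sampled, the resulting certificate is exact for the stated vote counts, with no failure probability. In practice `classify_band` would be a trained network evaluated on the ablated image.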