Black-box adversarial attacks generate adversarial samples via iterative optimization driven by repeated queries. Defending deep neural networks against such attacks has been challenging. In this paper, we propose an efficient Boundary Defense (BD) method that mitigates black-box attacks by exploiting the fact that adversarial optimization typically relies on samples near the classification boundary. Our method detects boundary samples as those with low classification confidence and adds white Gaussian noise to their logits. The method's impact on the deep network's classification accuracy is analyzed theoretically. Extensive experiments are conducted, and the results show that the BD method can reliably defend against both soft-label and hard-label black-box attacks, outperforming a range of existing defense methods. For IMAGENET models, adding zero-mean white Gaussian noise with standard deviation 0.1 to the logits whenever the classification confidence is below 0.3 reduces the attack success rate to almost 0 while limiting the classification accuracy degradation to around 1 percent.
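As a rough illustration of the defense described above, the following sketch (in NumPy; the function name boundary_defense and the parameter names tau and sigma are illustrative assumptions, not from the paper) shows how noise injection could be gated on the top-1 softmax confidence:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def boundary_defense(logits, tau=0.3, sigma=0.1, rng=None):
    """Hypothetical sketch of the BD idea: if the top-1 softmax confidence
    of a sample is below tau, add zero-mean white Gaussian noise with
    standard deviation sigma to its logits; otherwise leave them unchanged."""
    rng = np.random.default_rng() if rng is None else rng
    confidence = softmax(logits).max(axis=-1, keepdims=True)
    noise = rng.normal(0.0, sigma, size=logits.shape)
    # Perturb only the low-confidence (boundary) samples.
    return np.where(confidence < tau, logits + noise, logits)
```

The defended logits would then be returned to the querying client in place of the model's raw outputs, so that queries near the decision boundary receive noisy responses while high-confidence queries are unaffected.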