White-box Adversarial Example (AE) attacks on Deep Neural Networks (DNNs) are more destructive than black-box AE attacks. However, almost all white-box approaches lack interpretability from the perspective of the DNN itself: adversaries rarely design attacks around interpretable features, and few approaches consider what features the DNN actually learns. In this paper, we propose an interpretable white-box AE attack, DI-AA, which applies deep Taylor decomposition, an interpretability technique, to select the most contributing features, and adopts a Lagrangian relaxation that jointly optimizes the logit output and the L_p norm to further reduce the perturbation. We compare DI-AA with six baseline attacks (including the state-of-the-art attack AutoAttack) on three datasets. Experimental results reveal that our proposed approach can 1) attack non-robust models with comparatively low perturbation, at or below the level of AutoAttack; 2) break TRADES adversarial training models with the highest success rate; and 3) generate AEs that, in black-box transfer attacks, reduce the robust accuracy of robust black-box models by 16% to 31%.
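To make the two ingredients of the abstract concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it selects a fraction of the most contributing input features and then runs a Lagrangian-relaxed optimization over the logit margin and the L_2 norm of the perturbation (a C&W-style relaxation). The saliency step uses plain gradient-times-input as a stand-in for full deep Taylor decomposition, and the names and hyperparameters (`di_aa_sketch`, `c`, `kappa`, `topk_frac`) are illustrative assumptions.

```python
import torch


def di_aa_sketch(model, x, y, c=1.0, kappa=0.0, lr=0.01, steps=100, topk_frac=0.1):
    """Illustrative sketch (single image, batch size 1): perturb only the
    most contributing features, minimizing an L2 penalty plus a
    Lagrangian-relaxed logit-margin loss. Gradient*input saliency is a
    stand-in for deep Taylor decomposition; hyperparameters are assumptions."""
    x = x.clone().detach()

    # --- Feature selection: gradient*input saliency as a DTD stand-in ---
    x.requires_grad_(True)
    logits = model(x)
    logits[0, y].backward()
    saliency = (x.grad * x).abs().flatten()
    k = max(1, int(topk_frac * saliency.numel()))
    mask = torch.zeros_like(saliency)
    mask[saliency.topk(k).indices] = 1.0  # keep only top-k contributing features
    mask = mask.view_as(x)
    x = x.detach()

    # --- Lagrangian relaxation: L2 norm + c * logit margin, masked perturbation ---
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        adv = (x + delta * mask).clamp(0, 1)
        z = model(adv)[0]
        other = z.clone()
        other[y] = -float("inf")
        # margin > -kappa means the true class still (nearly) wins
        margin = torch.clamp(z[y] - other.max(), min=-kappa)
        loss = (delta * mask).pow(2).sum() + c * margin
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (x + delta.detach() * mask).clamp(0, 1)
```

In this relaxation, the constant `c` trades off perturbation size against misclassification confidence; larger `c` yields more reliable attacks at the cost of a larger L_p norm, which is why the paper's formulation tunes it to keep perturbations low.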