HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios

We have witnessed the continuing arms race between backdoor attacks and the corresponding defense strategies on Deep Neural Networks (DNNs). Most state-of-the-art defenses rely on the statistical sanitization of the "inputs" or "latent DNN representations" to capture trojan behaviour. In this paper, we first challenge the robustness of such recently reported defenses by introducing a novel variant of targeted backdoor attack, called "low-confidence backdoor attack". We also propose a novel defense technique, called "HaS-Nets". "Low-confidence backdoor attack" exploits the confidence labels assigned to poisoned training samples by giving low values to hide their presence from the defender, both during training and inference. We evaluate the attack against four state-of-the-art defense methods, viz., STRIP, Gradient-Shaping, Februus and ULP-defense, and achieve Attack Success Rate (ASR) of 99%, 63.73%, 91.2% and 80%, respectively. We next present "HaS-Nets" to resist backdoor insertion in the network during training, using a reasonably small healing dataset, approximately 2% to 15% of full training data, to heal the network at each iteration. We evaluate it for different datasets - Fashion-MNIST, CIFAR-10, Consumer Complaint and Urban Sound - and network architectures - MLPs, 2D-CNNs, 1D-CNNs. Our experiments show that "HaS-Nets" can decrease ASRs from over 90% to less than 15%, independent of the dataset, attack configuration and network architecture.
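The abstract describes the attack only at a high level, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes that a "confidence label" is a soft label vector over classes, and that a low-confidence poisoned sample gives the target class a small probability that is still the largest entry, with the remainder spread evenly over the other classes. The function names `low_confidence_label` and `attack_success_rate` are hypothetical. ASR is taken here in its usual sense: the fraction of trigger-stamped inputs that the model assigns to the attacker's target class.

```python
from typing import List


def low_confidence_label(target: int, num_classes: int, confidence: float) -> List[float]:
    """Build a soft label with `confidence` on the target class.

    Assumption (not from the paper): the remaining probability mass is
    spread evenly over the non-target classes.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if not 0 <= target < num_classes:
        raise ValueError("target must be a valid class index")
    if not (1.0 / num_classes < confidence <= 1.0):
        # Below 1/num_classes the target would no longer be the top class.
        raise ValueError("confidence must be in (1/num_classes, 1]")
    rest = (1.0 - confidence) / (num_classes - 1)
    label = [rest] * num_classes
    label[target] = confidence
    return label


def attack_success_rate(predictions: List[int], target: int) -> float:
    """Fraction of trigger-stamped inputs predicted as the target class."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    return sum(p == target for p in predictions) / len(predictions)


if __name__ == "__main__":
    # A 10-class poisoned label whose target class (3) gets only 0.2.
    print(low_confidence_label(target=3, num_classes=10, confidence=0.2))
    # Model outputs on four triggered inputs, three of which hit the target.
    print(attack_success_rate([3, 3, 1, 3], target=3))
```

With ten classes and a confidence of 0.2, the target class remains the top entry of the label while looking far less suspicious than a one-hot label, which is the intuition the abstract gives for evading defenses that inspect inputs or latent representations.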

