Although deep neural networks (DNNs) have made rapid progress in recent years, they are vulnerable in adversarial environments. A malicious backdoor can be embedded in a model by poisoning the training dataset, with the intention of making the infected model give wrong predictions during inference whenever a specific trigger appears. To mitigate the potential threats of backdoor attacks, various backdoor detection and defense methods have been proposed. However, existing techniques usually require the poisoned training data or white-box access to the model, which is commonly unavailable in practice. In this paper, we propose a black-box backdoor detection (B3D) method that identifies backdoor attacks with only query access to the model. We introduce a gradient-free optimization algorithm to reverse-engineer the potential trigger for each class, which helps to reveal the existence of backdoor attacks. In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models. Extensive experiments on hundreds of DNN models trained on several datasets corroborate the effectiveness of our method under the black-box setting against various backdoor attacks.
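The gradient-free trigger reverse-engineering idea can be illustrated with a toy sketch. Everything below is an illustrative assumption rather than the paper's actual B3D algorithm: a simulated query-only classifier is "backdoored" on a 2x2 patch of a 4x4 input, and natural evolution strategies (one common gradient-free estimator) recover a small trigger pattern that forces the target class using model queries alone.

```python
import numpy as np

# Hypothetical stand-in for a backdoored black-box model (not the paper's
# setup): it outputs class probabilities for a 4x4 "image" and strongly
# prefers TARGET whenever the top-left 2x2 patch is bright.
TARGET, NUM_CLASSES = 7, 10

def model(x):
    """Query-only access: input -> probability vector."""
    logits = np.full(NUM_CLASSES, 0.1)
    logits[TARGET] = 8.0 * x[:2, :2].mean()  # trigger region drives TARGET
    e = np.exp(logits - logits.max())
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 0.2, (4, 4))  # a benign background image

def loss(trigger):
    # Stamp the candidate trigger, then score it: force TARGET while keeping
    # the trigger small (L1 penalty), as in trigger reverse-engineering.
    x = np.clip(clean + trigger, 0.0, 1.0)
    return -np.log(model(x)[TARGET] + 1e-9) + 0.05 * trigger.sum()

# Natural evolution strategies: estimate the loss gradient from queries
# only, subtracting the mean loss as a baseline to reduce variance.
theta = np.zeros((4, 4))  # trigger parameters; trigger = sigmoid(theta)
sigma, lr, pop = 0.1, 1.0, 30
for _ in range(300):
    eps = rng.standard_normal((pop, 4, 4))
    losses = np.array([loss(sigmoid(theta + sigma * e)) for e in eps])
    grad = ((losses - losses.mean())[:, None, None] * eps).sum(0) / (pop * sigma)
    theta -= lr * grad

trigger = sigmoid(theta)
stamped = np.clip(clean + trigger, 0.0, 1.0)
print(int(np.argmax(model(stamped))))                  # class forced by the trigger
print(trigger[:2, :2].mean() > trigger[2:, :].mean())  # mass concentrates on the patch
```

In a detection pipeline, a trigger like this would be reverse-engineered for every class; a class whose recovered trigger is anomalously small and effective signals a likely backdoor.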