Generating Adversarial Inputs Using A Black-box Differential Technique

Neural Networks (NNs) are known to be vulnerable to adversarial attacks. A malicious agent initiates these attacks by perturbing an input into another one such that the two inputs are classified differently by the NN. In this paper, we consider a special class of adversarial examples, which can exhibit not only the weakness of NN models, as typical adversarial examples do, but also the different behavior between two NN models. We call them difference-inducing adversarial examples, or DIAEs. Specifically, we propose DAEGEN, the first black-box differential technique for adversarial input generation. DAEGEN takes as input two NN models for the same classification problem and outputs an adversarial example. The obtained adversarial example is a DIAE, so it represents a point-wise difference in the input space between the two NN models. Algorithmically, DAEGEN uses a local search-based optimization algorithm to find DIAEs by iteratively perturbing an input to maximize the difference between the two models' predictions on that input. We conduct experiments on a spectrum of benchmark datasets (e.g., MNIST, ImageNet, and Driving) and NN models (e.g., LeNet, ResNet, Dave, and VGG). Experimental results are promising. First, we compare DAEGEN with two existing white-box differential techniques (DeepXplore and DLFuzz) and find that under the same setting, DAEGEN is 1) effective, i.e., it is the only technique that succeeds in generating attacks in all cases, 2) precise, i.e., the adversarial attacks are very likely to fool machines and humans, and 3) efficient, i.e., it requires a reasonable number of classification queries. Second, we compare DAEGEN with state-of-the-art black-box adversarial attack methods (SimBA and TREMBA) by adapting them to work in a differential setting. The experimental results show that DAEGEN performs better than both of them.
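The idea of iteratively perturbing an input to maximize the disagreement between two models can be illustrated with a minimal sketch. This is not DAEGEN's actual algorithm, which the abstract does not spell out; it is a plain greedy coordinate search under assumed settings. The two toy classifiers `model_a` and `model_b`, the function `find_diae`, and the parameters `eps`, `step`, and `max_iters` are all hypothetical names introduced here for illustration. Only the models' output probabilities are used, which is what makes the search black-box.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Two hypothetical toy classifiers over 2-D inputs with slightly
# different decision boundaries. Only their outputs are observed.
def model_a(x):
    return softmax([10.0 * (x[0] - x[1]), 0.0])

def model_b(x):
    return softmax([10.0 * (x[0] - x[1] - 0.2), 0.0])

def label(probs):
    return max(range(len(probs)), key=probs.__getitem__)

def find_diae(f, g, seed, eps=0.15, step=0.04, max_iters=50):
    """Greedy coordinate local search (an assumed stand-in, not DAEGEN).

    Looks for an input within eps (L-infinity) of `seed` on which the
    two black-box models f and g predict different labels.
    Returns (input or None, number of model queries used).
    """
    pa, pb = f(seed), g(seed)
    queries = 2
    y = label(pa)
    if label(pb) != y:
        return list(seed), queries  # the seed already separates the models
    x = list(seed)
    best = pa[y] - pb[y]  # objective: confidence gap on the original label
    for _ in range(max_iters):
        best_cand, best_val = None, best
        for i in range(len(x)):
            lo = max(0.0, seed[i] - eps)
            hi = min(1.0, seed[i] + eps)
            for delta in (step, -step):
                cand = list(x)
                cand[i] = min(hi, max(lo, cand[i] + delta))
                pa, pb = f(cand), g(cand)
                queries += 2
                if label(pa) != label(pb):
                    return cand, queries  # difference-inducing input found
                val = pa[y] - pb[y]
                if val > best_val:
                    best_cand, best_val = cand, val
        if best_cand is None:
            return None, queries  # local optimum without disagreement
        x, best = best_cand, best_val
    return None, queries

seed = [0.6, 0.3]
adv, used = find_diae(model_a, model_b, seed)
print(adv, used)
```

In this toy setting both models agree on the seed, and the search moves the input toward the band between the two decision boundaries, where the predicted labels diverge. The query counter reflects the efficiency criterion mentioned in the abstract: in a black-box setting, cost is measured in classification queries.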

Related content

Black box

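In science, computing, and engineering, a black box is a device, system, or object that can be viewed in terms of its inputs and outputs (or transfer characteristics) without any knowledge of its internal workings. Its implementation is "opaque" (black). Almost anything can be referred to as a black box: a transistor, an engine, an algorithm, the human brain, an institution, or a government. To analyze something modeled as an open system with the typical "black-box approach", only its stimulus/response behavior is considered, and the (unknown) contents of the box are inferred from that behavior. The usual representation of such a black-box system is a data flow diagram centered on the box. The opposite of a black box is a system whose internal components or logic are available for inspection, commonly called a white box (sometimes also a "clear box" or "glass box").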
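As a small illustration of inferring an unknown box purely from stimulus/response behavior, the sketch below wraps a hidden function so that callers can only query it, then recovers its internal threshold by bisection. The `BlackBox` class, the `infer_threshold` helper, and the hidden threshold value are hypothetical, chosen only for this example; the bisection also assumes the hidden function is a monotone step that is false at `lo` and true at `hi`.

```python
class BlackBox:
    """Exposes only input/output behavior and counts queries."""

    def __init__(self, fn):
        self._fn = fn       # internals are not meant to be inspected
        self.queries = 0

    def __call__(self, x):
        self.queries += 1
        return self._fn(x)

def infer_threshold(box, lo=0.0, hi=1.0, tol=1e-3):
    # Assumes box(lo) is False, box(hi) is True, and the response
    # switches exactly once in between (a monotone step function).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if box(mid):
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Hypothetical hidden behavior: responds True at or above 0.37.
box = BlackBox(lambda x: x >= 0.37)
estimate = infer_threshold(box)
print(estimate, box.queries)
```

A white-box analysis would read the threshold directly from the implementation; the black-box approach pays for the same information with queries.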