It has been observed that Deep Neural Networks (DNNs) are vulnerable to transfer attacks in the query-free black-box setting. However, all previous studies on transfer attacks assume that the white-box surrogate models possessed by the attacker and the black-box victim models are trained on the same dataset, which means the attacker implicitly knows the label set and the input size of the victim model. This assumption is usually unrealistic: the attacker may not know the dataset used by the victim model and, further, must attack any randomly encountered images, which may not come from that dataset. Therefore, in this paper we define a new Generalized Transferable Attack (GTA) problem, in which the attacker possesses a set of surrogate models trained on different datasets (with different label sets and image sizes), none of which equals the dataset used by the victim model. We then propose a novel method called Image Classification Eraser (ICE) to erase the classification information of any encountered image from an arbitrary dataset. Extensive experiments on Cifar-10, Cifar-100, and TieredImageNet demonstrate the effectiveness of ICE on the GTA problem. Furthermore, we show that existing transfer attack methods can be modified to tackle the GTA problem, but they perform significantly worse than ICE.
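The abstract does not detail how ICE works, so the following is only a minimal sketch of the GTA setting itself: a baseline ensemble transfer attack (iterative FGSM over heterogeneous surrogates), adapted under the stated assumptions that each surrogate may expect a different input size and output a different label set. All names here (`gta_ensemble_attack`, the `(model, size)` pairs) are hypothetical illustrations, not the paper's method; PyTorch is assumed.

```python
import torch
import torch.nn.functional as F

def gta_ensemble_attack(image, surrogates, eps=8/255, alpha=2/255, steps=10):
    """Baseline untargeted transfer attack for the GTA setting.

    image:      (N, C, H, W) tensor in [0, 1]
    surrogates: list of (model, input_size) pairs; each model may be
                trained on a different dataset with its own label set.
    """
    x_adv = image.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = 0.0
        for model, size in surrogates:
            # Resize per surrogate, since input sizes differ across datasets.
            x_in = F.interpolate(x_adv, size=size, mode='bilinear',
                                 align_corners=False)
            logits = model(x_in)
            # No ground-truth label exists for an arbitrary encountered
            # image, so push the input away from each surrogate's own
            # current prediction (a pseudo-label).
            pseudo = logits.detach().argmax(dim=1)
            loss = loss + F.cross_entropy(logits, pseudo)
        grad, = torch.autograd.grad(loss, x_adv)
        # Ascend the summed loss, then project back into the L-inf ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = image + (x_adv - image).clamp(-eps, eps)
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```

Per the paper's claim, such modified existing attacks handle GTA only poorly compared with ICE, which instead erases the classification information of the image.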