Towards Imperceptible Adversarial Image Patches Based on Network Explanations

The vulnerability of deep neural networks (DNNs) to adversarial examples has attracted increasing attention. Many algorithms have been proposed to craft powerful adversarial examples. However, these algorithms modify global or local regions of pixels without taking network explanations into account. Hence, the perturbations are redundant and easily detected by the human eye. In this paper, we propose a novel method to generate local-region perturbations. The main idea is to find the contributing feature regions (CFRs) of images based on network explanations and to perturb those regions. Because the CFRs are identified from network explanations, perturbations added to the CFRs are more effective than those added to other regions. In our method, a soft mask matrix is designed to represent the CFRs, finely characterizing the contribution of each pixel. Based on this soft mask, we develop a new objective function with inverse temperature to search for optimal perturbations in the CFRs. Extensive experiments on CIFAR-10 and ILSVRC2012 demonstrate the effectiveness of the method in terms of attack success rate, imperceptibility, and transferability.
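
As a rough illustration of the soft-mask idea described in the abstract, the following minimal sketch rescales a per-pixel contribution map to [0, 1] and uses it to weight a perturbation. It is not the authors' implementation: the function names `soft_mask` and `apply_masked_perturbation`, the min-max normalization, the random perturbation, and the toy contribution map are all assumptions made for illustration. The abstract does not say how the contribution map is computed or how the inverse-temperature objective is defined, so both are left out here.

```python
import numpy as np


def soft_mask(contribution_map):
    """Rescale a per-pixel contribution map to [0, 1] (assumed min-max scaling)."""
    c = np.asarray(contribution_map, dtype=np.float64)
    span = c.max() - c.min()
    if span == 0:
        # A flat map carries no region information; return an all-zero mask.
        return np.zeros_like(c)
    return (c - c.min()) / span


def apply_masked_perturbation(image, delta, mask):
    """Add the perturbation weighted by the mask, then clip to the [0, 1] pixel range."""
    # mask has shape (H, W); broadcast it over the channel axis of (H, W, C).
    return np.clip(image + mask[..., None] * delta, 0.0, 1.0)


# Toy data: a 4x4 RGB image and a hypothetical contribution map
# whose nonzero values sit in the central 2x2 block.
rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))
contribution = np.zeros((4, 4))
contribution[1:3, 1:3] = [[1.0, 2.0], [3.0, 4.0]]

mask = soft_mask(contribution)
delta = rng.uniform(-0.1, 0.1, size=image.shape)
adv = apply_masked_perturbation(image, delta, mask)
```

In this sketch, pixels with a mask value of zero are left untouched, while pixels with higher contribution receive a proportionally larger share of the perturbation. In an actual attack, `delta` would be optimized against the target network rather than sampled at random.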

