Spatially Correlated Patterns in Adversarial Images

Adversarial attacks have proved to be a major impediment to research on reliable machine learning. Carefully crafted perturbations, imperceptible to human vision, can be added to an image to force misclassification by an otherwise high-performing neural network. To better understand the key contributors to such structured attacks, we searched for and studied spatially co-located patterns in the distribution of pixels in the input space. In this paper, we propose a framework for segregating and isolating regions within an input image that are particularly critical to classification (during inference), to adversarial vulnerability, or to both. We assert that during inference the trained model looks at a specific region of the image, which we call the Region of Importance (RoI), and that the attacker looks at a region to alter, which we call the Region of Attack (RoA). As our observations illustrate, this segregation can also be used to design a post-hoc adversarial defence: block out (we call this neutralizing) the part of the image that is highly vulnerable to adversarial attacks but not important for classification. We establish the theoretical setup for formalising segregation, isolation and neutralization, and substantiate it through empirical analysis on standard benchmark datasets. The findings strongly indicate that mapping features into the input space preserves the significant patterns typically observed in the feature space while adding considerable interpretability, and therefore simplifies potential defensive mechanisms.
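
The neutralization step described above can be illustrated with a minimal sketch. It assumes the RoI and RoA are already available as boolean pixel masks; the abstract does not say how they are computed, so the function name `neutralize`, the mask arguments, and the constant `fill_value` are all hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np

def neutralize(image, roi_mask, roa_mask, fill_value=0.0):
    """Blank out pixels that lie in the RoA but outside the RoI.

    image:     array of shape (H, W) or (H, W, C)
    roi_mask:  boolean (H, W) mask, True where the model looks (assumed given)
    roa_mask:  boolean (H, W) mask, True where the attacker perturbs (assumed given)
    fill_value: constant written into neutralized pixels (an assumption)
    """
    image = np.asarray(image, dtype=float)
    roi_mask = np.asarray(roi_mask, dtype=bool)
    roa_mask = np.asarray(roa_mask, dtype=bool)
    if roi_mask.shape != image.shape[:2] or roa_mask.shape != image.shape[:2]:
        raise ValueError("masks must match the image's spatial shape (H, W)")

    # Vulnerable but unimportant pixels: inside RoA, outside RoI.
    target = roa_mask & ~roi_mask

    out = image.copy()          # leave the caller's array untouched
    out[target] = fill_value    # the mask indexes the first two axes
    return out, target


# Toy example: a 4x4 image whose RoI and RoA overlap in one pixel.
image = np.ones((4, 4))
roi = np.zeros((4, 4), dtype=bool)
roi[1:3, 1:3] = True
roa = np.zeros((4, 4), dtype=bool)
roa[0:2, 0:2] = True

cleaned, neutralized = neutralize(image, roi, roa)
print(int(neutralized.sum()))  # number of pixels blanked out
```

Pixels in the overlap of RoI and RoA are left untouched here, since blanking them would remove information the classifier relies on; whether that matches the paper's treatment of the overlap is not stated in the abstract.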
