Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples (AEs), which are maliciously designed to cause dramatic model output errors. In this work, we reveal that normal examples (NEs) are insensitive to fluctuations occurring in the highly curved regions of the decision boundary, while AEs, which are typically designed over a single domain (mostly the spatial domain), exhibit excessive sensitivity to such fluctuations. This phenomenon motivates us to design another classifier (called the dual classifier) with a transformed decision boundary, which can be used jointly with the original classifier (called the primal classifier) to detect AEs by virtue of this sensitivity inconsistency. Compared with state-of-the-art algorithms based on Local Intrinsic Dimensionality (LID), Mahalanobis Distance (MD), and Feature Squeezing (FS), our proposed Sensitivity Inconsistency Detector (SID) achieves improved AE detection performance and superior generalization capabilities, especially in the challenging cases where the adversarial perturbation levels are small. Extensive experimental results on ResNet and VGG validate the superiority of the proposed SID.
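The detection idea described in the abstract, comparing how the primal and dual classifiers respond to the same input, can be illustrated with a minimal sketch. The abstract does not specify how the dual classifier is constructed or how the two outputs are combined, so everything here is an assumption made for illustration: the function names (`softmax`, `inconsistency_score`, `flag_adversarial`), the L1 distance between softmax outputs as the inconsistency measure, and the fixed threshold of 0.5 are hypothetical stand-ins, not the authors' method. The logits in the usage lines are invented for illustration and are not experimental results.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inconsistency_score(primal_logits, dual_logits):
    """L1 distance between the two classifiers' softmax outputs.

    Assumption: L1 distance is a simple stand-in for the sensitivity
    inconsistency measure; the score lies in [0, 2].
    """
    p = softmax(primal_logits)
    q = softmax(dual_logits)
    return sum(abs(a - b) for a, b in zip(p, q))


def flag_adversarial(primal_logits, dual_logits, threshold=0.5):
    """Flag the input as adversarial when the two classifiers disagree strongly.

    Assumption: the threshold is hypothetical and would need to be tuned
    on held-out data in practice.
    """
    return inconsistency_score(primal_logits, dual_logits) > threshold


# Illustrative (made-up) logits: the two classifiers agree on a normal example
ne_flag = flag_adversarial([4.0, 0.5, 0.1], [3.8, 0.6, 0.2])

# Illustrative (made-up) logits: the two classifiers disagree on an adversarial example
ae_flag = flag_adversarial([0.2, 3.5, 0.1], [3.0, 0.8, 0.2])
```

Under these assumptions, an input whose primal and dual predictions nearly coincide receives a score close to 0, while an input that flips class between the two classifiers receives a score approaching 2. A learned detector operating on both outputs would be a natural replacement for the fixed threshold.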