Machine learning with deep neural networks (DNNs) has become a foundational technique in many safety-critical systems, such as autonomous vehicles and medical diagnosis systems. DNN-based systems, however, are known to be vulnerable to adversarial examples (AEs), which are maliciously perturbed variants of legitimate inputs. Although a vast body of research addresses defenses against AE attacks, the performance of existing defense techniques remains far from satisfactory, especially against adaptive attacks, in which attackers know the defense mechanisms and craft AEs accordingly. In this work, we propose MixDefense, a multilayer defense-in-depth framework for AE detection. The first layer targets AEs with large perturbations: we leverage "noise" features extracted from the inputs to expose the statistical difference between natural images and tampered ones. The second layer targets AEs with small perturbations, for which the inference result deviates substantially from the semantic information of the input: we propose a novel learning-based solution that models this contradiction for AE detection. Both layers are resilient to adaptive attacks because no gradient propagation path exists for AE generation. Experimental results with various AE attack methods on image classification datasets show that MixDefense outperforms existing AE detection techniques by a considerable margin.