In this paper, we propose a new framework for detecting adversarial examples, motivated by two observations: random components can improve the smoothness of predictors, and they make it easier to simulate the output distribution of a deep neural network. Building on these observations, we propose a novel Bayesian adversarial example detector, BATer for short, to improve the performance of adversarial example detection. Specifically, we study the distributional difference of hidden layer outputs between natural and adversarial examples, and propose to use the randomness of a Bayesian neural network (BNN) to simulate the hidden layer output distribution and to leverage the distribution dispersion to detect adversarial examples. The advantage of a BNN is that its output is stochastic, whereas neural networks without random components lack this property. Empirical results on several benchmark datasets against popular attacks show that the proposed BATer outperforms state-of-the-art detectors in adversarial example detection.
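To make the idea of dispersion-based detection concrete, the following is a minimal sketch and not the BATer implementation. It assumes a toy single-layer network whose weights are resampled with Gaussian noise on every forward pass, standing in for a BNN. It also assumes that dispersion is measured as the mean per-unit standard deviation of the sampled hidden outputs, and that an input is flagged when this score exceeds a threshold. The names `stochastic_hidden`, `dispersion_score`, and `is_flagged`, the noise model, and the threshold rule are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def stochastic_hidden(x, weight_means, weight_std, rng):
    """One forward pass through a toy layer with freshly sampled weights.

    Assumption: each weight is drawn as mean + Gaussian noise, a stand-in
    for sampling from a BNN weight posterior.
    """
    hidden = []
    for row in weight_means:
        pre_activation = sum(
            (mean + rng.gauss(0.0, weight_std)) * xi for mean, xi in zip(row, x)
        )
        hidden.append(math.tanh(pre_activation))
    return hidden


def dispersion_score(x, weight_means, weight_std, n_samples=64, seed=0):
    """Mean per-unit standard deviation of hidden outputs over repeated passes.

    Assumption: this is one simple dispersion measure chosen for the sketch;
    the abstract does not specify the measure.
    """
    rng = random.Random(seed)
    samples = [
        stochastic_hidden(x, weight_means, weight_std, rng)
        for _ in range(n_samples)
    ]
    # zip(*samples) groups the sampled values of each hidden unit together.
    per_unit_std = [statistics.pstdev(unit_values) for unit_values in zip(*samples)]
    return sum(per_unit_std) / len(per_unit_std)


def is_flagged(score, threshold):
    """Hypothetical decision rule: flag inputs whose dispersion exceeds a threshold."""
    return score > threshold


if __name__ == "__main__":
    weights = [[0.5, -0.3], [0.1, 0.8]]  # illustrative weight means
    x = [1.0, 2.0]                       # illustrative input
    score = dispersion_score(x, weights, weight_std=0.2)
    print("dispersion score:", score)
    print("flagged:", is_flagged(score, threshold=0.5))
```

In practice the threshold would be calibrated on held-out natural examples. A deterministic network corresponds to `weight_std=0.0`, in which case every pass returns the same hidden output and the score is zero, which illustrates why a network without random components provides no dispersion signal.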