Adversarial examples are often cited by neuroscientists and machine learning researchers as evidence of how computational models diverge from biological sensory systems. Recent work has proposed adding biologically inspired components to visual neural networks as a way to improve their adversarial robustness. One surprisingly effective component for reducing adversarial vulnerability is response stochasticity, like that exhibited by biological neurons. Here, using recently developed geometrical techniques from computational neuroscience, we investigate how adversarial perturbations influence the internal representations of standard, adversarially trained, and biologically inspired stochastic networks. We find distinct geometric signatures for each type of network, revealing different mechanisms for achieving robust representations. Next, we generalize these results to the auditory domain, showing that neural stochasticity also makes auditory models more robust to adversarial perturbations. Geometric analysis of the stochastic networks reveals overlap between representations of clean and adversarially perturbed stimuli, and quantitatively demonstrates that competing geometric effects of stochasticity mediate a tradeoff between adversarial and clean performance. Our results shed light on the strategies of robust perception used by adversarially trained and stochastic networks, and help explain how stochasticity may be beneficial to machine and biological computation.