Modern deep neural network models suffer from adversarial examples, i.e. confidently misclassified points in the input space. Bayesian neural networks have been shown to be a promising approach for detecting adversarial points, but careful analysis is problematic due to the complexity of these models. Recently, Gilmer et al. (2018) introduced adversarial spheres, a toy set-up that simplifies both practical and theoretical analysis of the problem. In this work, we use the adversarial sphere set-up to understand the properties of approximate Bayesian inference methods for a linear model in a noiseless setting. We compare the predictions of Bayesian and non-Bayesian methods, showcasing the advantages of the former while also revealing open challenges for deep learning applications.
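As a minimal sketch of the kind of set-up the abstract describes: below, points are sampled uniformly from two concentric spheres (the adversarial-spheres toy problem of Gilmer et al., 2018) and a conjugate Bayesian linear regression is fit in closed form. The squared-coordinate feature map, the prior/noise precisions `alpha` and `beta` (a large `beta` approximating the noiseless limit), and the specific radii are illustrative assumptions, not details from the paper; the sketch only illustrates that the Bayesian predictive variance grows for inputs far from the data manifold.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 200

def sample_sphere(n_points, dim, radius, rng):
    """Sample points uniformly from the (dim-1)-sphere of the given radius."""
    x = rng.standard_normal((n_points, dim))
    return radius * x / np.linalg.norm(x, axis=1, keepdims=True)

# Two concentric spheres: inner class -1, outer class +1 (radii are illustrative).
X = np.vstack([sample_sphere(n // 2, d, 1.0, rng),
               sample_sphere(n // 2, d, 1.3, rng)])
y = np.concatenate([-np.ones(n // 2), np.ones(n // 2)])

def phi(x):
    """Squared-coordinate features: makes the two spheres linearly separable,
    since sum(x_i^2) equals the squared radius. (An assumed feature choice.)"""
    return x ** 2

# Conjugate Bayesian linear regression: prior w ~ N(0, alpha^-1 I),
# observation noise precision beta (large beta ~ noiseless setting).
Z = phi(X)
alpha, beta = 1.0, 1e4
S = np.linalg.inv(alpha * np.eye(d) + beta * Z.T @ Z)  # posterior covariance
m = beta * S @ Z.T @ y                                 # posterior mean

def predictive_var(x):
    """Predictive variance of the linear model at input x."""
    z = phi(x)
    return z @ S @ z + 1.0 / beta

x_on = X[0]          # a training point on the data manifold
x_off = 10.0 * X[0]  # a point far from both spheres
```

Off-manifold points receive much larger predictive variance than training points, which is the mechanism by which a Bayesian model can flag suspicious inputs; a non-Bayesian point estimate of `m` alone would classify both points with equal confidence.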