It is desirable, and often a necessity, for machine learning models to be robust against adversarial attacks. This is particularly true for Bayesian models, as they are well-suited for safety-critical applications, in which adversarial attacks can have catastrophic outcomes. In this work, we take a deeper look at the adversarial robustness of Bayesian Neural Networks (BNNs). In particular, we consider whether the adversarial robustness of a BNN can be increased through modeling choices, specifically the Lipschitz continuity induced by the prior. Conducting an in-depth analysis of the case of i.i.d., zero-mean Gaussian priors and posteriors approximated via mean-field variational inference, we find evidence that adversarial robustness is indeed sensitive to the prior variance.
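To make the setup described above concrete, the following is a minimal sketch (not the authors' implementation) of a Bayesian linear layer with an i.i.d., zero-mean Gaussian prior N(0, sigma_p^2) on every weight and a factorized Gaussian (mean-field) variational posterior, written in PyTorch. The `prior_sigma` argument plays the role of the prior variance whose effect on adversarial robustness is studied; the class name, parameterization, and all other identifiers here are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MeanFieldLinear(nn.Module):
    """Linear layer with mean-field Gaussian posterior q(w) = N(mu, softplus(rho)^2)
    and an i.i.d., zero-mean Gaussian prior p(w) = N(0, prior_sigma^2)."""

    def __init__(self, in_features, out_features, prior_sigma=1.0):
        super().__init__()
        self.prior_sigma = prior_sigma
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))

    def forward(self, x):
        # Reparameterization trick: draw a weight sample from the variational posterior.
        w_sigma = F.softplus(self.w_rho)
        b_sigma = F.softplus(self.b_rho)
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, w, b)

    def kl(self):
        # KL( N(mu, sigma^2) || N(0, prior_sigma^2) ), summed over weights and biases;
        # this is the term through which the prior variance enters the variational objective.
        def kl_term(mu, sigma):
            return (torch.log(self.prior_sigma / sigma)
                    + (sigma ** 2 + mu ** 2) / (2 * self.prior_sigma ** 2)
                    - 0.5).sum()

        return (kl_term(self.w_mu, F.softplus(self.w_rho))
                + kl_term(self.b_mu, F.softplus(self.b_rho)))
```

Under this sketch, the experiment in the abstract amounts to training otherwise identical BNNs built from such layers while sweeping `prior_sigma` (e.g., 0.1, 1.0, 10.0) and comparing their accuracy under a fixed adversarial attack; the attack and sweep values are hypothetical choices, not ones taken from the paper.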