Deep neural networks have become an integral part of our software infrastructure and are being deployed in many widely used and safety-critical applications. However, their integration into these systems also exposes them to test-time attacks in the form of Universal Adversarial Perturbations (UAPs). UAPs are a class of perturbations that, when applied to any input, cause the model to misclassify it. Although there is an ongoing effort to defend models against these adversarial attacks, it is often difficult to reconcile the trade-off between model accuracy and robustness to adversarial attacks. Jacobian regularization has been shown to improve the robustness of models against UAPs, whilst model ensembles have been widely adopted to improve both predictive performance and model robustness. In this work, we propose a novel approach, Jacobian Ensembles, a combination of Jacobian regularization and model ensembles, to significantly increase robustness against UAPs whilst maintaining or improving model accuracy. Our results show that Jacobian Ensembles achieve previously unseen levels of accuracy and robustness, greatly improving over previous methods, which tend to skew towards either accuracy or robustness alone.
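To make the two ingredients named in the abstract concrete, here is a minimal sketch. It is not the authors' implementation, and it uses only the Python standard library. It assumes that Jacobian regularization adds a penalty on the squared Frobenius norm of the input-to-logits Jacobian to the cross-entropy loss, and that the ensemble combines members by averaging their softmax outputs. The function names, the finite-difference Jacobian, the weight `lam`, and the toy linear members are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def linear_logits(W, b, x):
    """Toy stand-in for a network: logits = W x + b."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def jacobian_frobenius_sq(f, x, eps=1e-5):
    """Squared Frobenius norm of d f(x) / d x, via central finite differences.

    A real implementation would use automatic differentiation; finite
    differences keep this sketch dependency-free.
    """
    total = 0.0
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        total += sum(((a - c) / (2 * eps)) ** 2 for a, c in zip(fp, fm))
    return total


def regularized_loss(f, x, label, lam):
    """Cross-entropy plus (lam / 2) * ||J||_F^2 (assumed form of the penalty)."""
    ce = -math.log(softmax(f(x))[label])
    return ce + 0.5 * lam * jacobian_frobenius_sq(f, x)


def ensemble_predict(models, x):
    """Average the members' class probabilities (assumed combination rule)."""
    probs = [softmax(f(x)) for f in models]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


# Two hypothetical ensemble members with different weights.
member_a = lambda x: linear_logits([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0], x)
member_b = lambda x: linear_logits([[0.5, -1.0], [1.5, 0.0]], [0.1, -0.1], x)

x = [0.5, -0.5]
print("regularized loss:", regularized_loss(member_a, x, label=0, lam=0.1))
print("ensemble probabilities:", ensemble_predict([member_a, member_b], x))
```

In this sketch each member would be trained separately with `regularized_loss`, and only the predictions are combined. Whether the original method trains members jointly or weights them differently is not stated in the abstract.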