Adversarial training is the de facto most promising defense against adversarial examples. Yet, its passive nature inevitably prevents it from being immune to unknown attackers. To achieve a proactive defense, we need a more fundamental understanding of adversarial examples, beyond the popular bounded threat model. In this paper, we provide a causal viewpoint of adversarial vulnerability: the cause is the confounding effect, the spurious correlation that ubiquitously exists in learning, and attackers precisely exploit this effect. Therefore, a fundamental solution for adversarial robustness is causal intervention. As these visual confounders are generally imperceptible, we propose to use an instrumental variable, which achieves causal intervention without requiring observation of the confounder. We term our robust training method Causal intervention by instrumental Variable (CiiV). It is a causal regularization that 1) augments the image with multiple retinotopic centers and 2) encourages the model to learn causal features, rather than local confounding patterns, by favoring features that respond linearly to spatial interpolations. Extensive experiments across a wide spectrum of attackers and settings on CIFAR-10, CIFAR-100, and mini-ImageNet demonstrate that CiiV is robust to adaptive attacks, including the recent AutoAttack. Moreover, as a general causal regularization, it can easily be plugged into other methods to further boost robustness.
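To make the two ingredients of the regularizer concrete, below is a minimal PyTorch sketch of a CiiV-style loss. It is an illustration under stated assumptions, not the paper's reference implementation: `retinotopic_sample` is a hypothetical stand-in for the retinotopic augmentation (here, a crude foveation that keeps acuity near a sampled center and blurs the periphery), and the linearity penalty simply matches the model's features on a mixed view against the same mix of per-view features.

```python
# Hypothetical sketch of a CiiV-style causal regularizer (not the paper's code).
import torch
import torch.nn.functional as F

def retinotopic_sample(x, center):
    """Crude retinotopic view (assumption): keep full resolution near
    `center` (normalized (cy, cx) coordinates) and blur the periphery."""
    blurred = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
    _, _, h, w = x.shape
    ys = torch.arange(h, device=x.device).float().view(1, 1, h, 1) / h
    xs = torch.arange(w, device=x.device).float().view(1, 1, 1, w) / w
    cy, cx = center
    dist = ((ys - cy) ** 2 + (xs - cx) ** 2).sqrt()
    acuity = (1.0 - dist).clamp(0.0, 1.0)  # falls off with eccentricity
    return acuity * x + (1.0 - acuity) * blurred

def ciiv_regularizer(model, x, n_centers=3):
    """Penalize non-linear feature responses to spatial interpolation:
    the feature of an interpolated view should equal the interpolation
    of the per-view features. `model` maps images to feature vectors."""
    lambdas = torch.softmax(torch.rand(n_centers, device=x.device), dim=0)
    views = [retinotopic_sample(x, (torch.rand(1).item(), torch.rand(1).item()))
             for _ in range(n_centers)]
    feats = [model(v) for v in views]
    mixed_view = sum(l * v for l, v in zip(lambdas, views))
    mixed_feat = sum(l * f for l, f in zip(lambdas, feats))
    return F.mse_loss(model(mixed_view), mixed_feat)
```

In training, this term would be added to the standard task loss with a weight hyperparameter; the specific sampling scheme and penalty in the actual method may differ.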