Recent advances in adversarial machine learning have shown that defenses considered to be robust are actually susceptible to adversarial attacks which are specifically tailored to target their weaknesses. These defenses include Barrage of Random Transforms (BaRT), Friendly Adversarial Training (FAT), Trash is Treasure (TiT) and ensemble models made up of Vision Transformers (ViTs), Big Transfer models and Spiking Neural Networks (SNNs). A natural question arises: how can one best leverage a combination of adversarial defenses to thwart such attacks? In this paper, we provide a game-theoretic framework for ensemble adversarial attacks and defenses which answers this question. In addition to our framework we produce the first adversarial defense transferability study to further motivate a need for combinational defenses utilizing a diverse set of defense architectures. Our framework is called Game theoretic Mixed Experts (GaME) and is designed to find the Mixed-Nash strategy for a defender when facing an attacker employing compositional adversarial attacks. We show that this framework creates an ensemble of defenses with greater robustness than multiple state-of-the-art, single-model defenses in addition to combinational defenses with uniform probability distributions. Overall, our framework and analyses advance the field of adversarial machine learning by yielding new insights into compositional attack and defense formulations.
翻译:对抗性机器学习的最新进展表明,被视为强健的防御实际上很容易受到针对其弱点而专门设计的对抗性攻击。这些防御包括随机变换(BART)、友好反反向训练(FAT)、TRash是宝藏(TiT)和由愿景变换(ViTs)、大转移模型和震荡神经网络组成的混合模型(SNNs)组成的混合模型。自然产生的一个问题是:如何最佳地利用对抗性防御组合来挫败这种攻击?在本文中,我们为联合对抗性攻击和应对这一问题的防御提供了一个游戏理论框架。除了我们的框架外,我们还进行了第一次对抗性防御可转移性研究,以进一步激发需要利用多种防御结构的组合防御。我们的框架称为游戏理论混合专家(GaME),目的是在面对攻击性攻击者时如何找到混合-纳什战略来挫败这种攻击性攻击?我们展示了这个框架,在防御联合攻击性对抗性攻击性攻击性攻击性攻击性和防御攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性攻击性