Game Theoretic Mixed Experts for Combinational Adversarial Machine Learning

Recent advances in adversarial machine learning have shown that defenses considered robust are in fact susceptible to adversarial attacks customized to target their weaknesses. These defenses include Barrage of Random Transforms (BaRT), Friendly Adversarial Training (FAT), Trash is Treasure (TiT), and ensemble models made up of Vision Transformers (ViTs), Big Transfer models, and Spiking Neural Networks (SNNs). We first conduct a transferability analysis to demonstrate that adversarial examples generated by customized attacks on one defense are often not misclassified by another defense. This finding leads to two important questions. First, how can the low transferability between defenses be exploited in a game-theoretic framework to improve robustness? Second, how can an adversary within this framework develop effective multi-model attacks? In this paper, we provide a game-theoretic framework for ensemble adversarial attacks and defenses. Our framework, called Game theoretic Mixed Experts (GaME), is designed to find the mixed Nash strategy for both a detector-based defender and a standard defender when facing an attacker employing compositional adversarial attacks. We further propose three new attack algorithms, specifically designed to target defenses with randomized transformations, multi-model voting schemes, and adversarial detector architectures. These attacks serve both to strengthen defenses generated by the GaME framework and to verify their robustness against unforeseen attacks. Overall, our framework and analyses advance the field of adversarial machine learning by yielding new insights into compositional attack and defense formulations.
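To make the idea of a mixed Nash strategy over defenses concrete, here is a minimal sketch of a two-player zero-sum game between a defender (rows) and an attacker (columns). It is not the GaME algorithm: the abstract does not say how the equilibrium is computed, so fictitious play is used here as a simple stand-in. The payoff matrix and the names `fictitious_play` and `robust_accuracy` are hypothetical, and the numbers are placeholders rather than results from the paper.

```python
def fictitious_play(payoff, iterations=50000):
    """Approximate a mixed Nash equilibrium of a zero-sum matrix game.

    payoff[i][j] is the defender's payoff (here: robust accuracy) when the
    defender uses defense i and the attacker uses attack j. The defender
    maximizes this value and the attacker minimizes it.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows      # how often each defense was played
    col_counts = [0] * n_cols      # how often each attack was played
    row_totals = [0.0] * n_rows    # payoff of each defense vs. attack history
    col_totals = [0.0] * n_cols    # payoff of each attack vs. defense history
    row, col = 0, 0                # arbitrary initial pure strategies
    for _ in range(iterations):
        row_counts[row] += 1
        col_counts[col] += 1
        for i in range(n_rows):
            row_totals[i] += payoff[i][col]
        for j in range(n_cols):
            col_totals[j] += payoff[row][j]
        # Each player best-responds to the opponent's empirical history.
        row = max(range(n_rows), key=lambda i: row_totals[i])
        col = min(range(n_cols), key=lambda j: col_totals[j])
    defender_mix = [c / iterations for c in row_counts]
    attacker_mix = [c / iterations for c in col_counts]
    # The empirical mixes bound the game value from below and above.
    lower = min(col_totals) / iterations
    upper = max(row_totals) / iterations
    return defender_mix, attacker_mix, lower, upper


# Hypothetical placeholder values: each defense resists the attack tuned
# for the other defense but is weak against its own customized attack.
robust_accuracy = [
    [0.8, 0.2],  # defense A vs. (attack on B, attack on A)
    [0.3, 0.7],  # defense B vs. (attack on B, attack on A)
]

defender_mix, attacker_mix, lower, upper = fictitious_play(robust_accuracy)
print("defender mix:", defender_mix)
print("attacker mix:", attacker_mix)
print("game value between", lower, "and", upper)
```

For this toy matrix the exact equilibrium can be worked out by hand: the defender plays defense A with probability 0.4 and defense B with probability 0.6, the attacker mixes its two attacks equally, and the game value is 0.5. The best pure defense guarantees only 0.3, which illustrates why low transferability between defenses makes a randomized mixture attractive. Fictitious play only approximates this equilibrium, and an exact solution would normally come from a linear program.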
