Is there a classifier that ensures optimal robustness against all adversarial attacks? This paper answers this question by adopting a game-theoretic point of view. We show that adversarial attacks and defenses form an infinite zero-sum game where classical results (e.g., Sion's minimax theorem) do not apply. We demonstrate the non-existence of a Nash equilibrium in our game when the classifier and the adversary are both deterministic, hence giving a negative answer to the above question in the deterministic regime. Nonetheless, the question remains open in the randomized regime. We tackle this problem by showing that, under mild conditions on the dataset distribution, any deterministic classifier can be outperformed by a randomized one. This gives arguments for using randomization, and leads us to a new algorithm for building randomized classifiers that are robust to strong adversarial attacks. Empirical results validate our theoretical analysis, and show that our defense method considerably outperforms Adversarial Training against state-of-the-art attacks.