Adversarial risk quantifies the performance of classifiers on adversarially perturbed data. Numerous definitions of adversarial risk -- not all mathematically rigorous and differing subtly in the details -- have appeared in the literature. In this paper, we revisit these definitions, make them rigorous, and critically examine their similarities and differences. Our technical tools derive from optimal transport, robust statistics, functional analysis, and game theory. Our contributions include the following: generalizing Strassen's theorem to the unbalanced optimal transport setting with applications to adversarial classification with unequal priors; showing an equivalence between adversarial robustness and robust hypothesis testing with $\infty$-Wasserstein uncertainty sets; proving the existence of a pure Nash equilibrium in the two-player game between the adversary and the algorithm; and characterizing adversarial risk by the minimum Bayes error between a pair of distributions belonging to the $\infty$-Wasserstein uncertainty sets. Our results generalize and deepen recently discovered connections between optimal transport and adversarial robustness and reveal new connections to Choquet capacities and game theory.