The worst-case training principle that minimizes the maximal adversarial loss, also known as adversarial training (AT), has been shown to be a state-of-the-art approach for enhancing adversarial robustness. Nevertheless, min-max optimization beyond the purpose of AT has not been rigorously explored in the adversarial context. In this paper, we show how a general framework of min-max optimization over multiple domains can be leveraged to advance the design of different types of adversarial attacks. In particular, given a set of risk sources (domains), minimizing the worst-case attack loss can be reformulated as a min-max problem by introducing domain weights that are maximized over the probability simplex of the domain set. We showcase this unified framework in three attack generation problems: attacking model ensembles, devising universal perturbations under multiple inputs, and crafting attacks resilient to data transformations. Extensive experiments demonstrate that our approach leads to substantial attack improvement over existing heuristic strategies, as well as robustness improvement over state-of-the-art defense methods trained to be robust against multiple perturbation types. Furthermore, we find that the self-adjusted domain weights learned from our min-max framework provide a holistic tool for explaining the difficulty level of attack across domains. Code is available at https://github.com/wangjksjtu/minmax-adv.
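
The reformulation described above can be written out as a short identity. This is a sketch in assumed notation, none of which appears in the abstract: $K$ is the number of domains, $F_i$ is the attack loss on domain $i$, $\delta$ is the perturbation restricted to a feasible set $\mathcal{X}$, and $w$ is the vector of domain weights on the probability simplex $\mathcal{P}$.

```latex
% Assumed notation (hypothetical, chosen for illustration):
%   K           number of risk sources (domains)
%   F_i         attack loss on domain i
%   \delta      perturbation, constrained to a feasible set \mathcal{X}
%   w           domain weights on the probability simplex \mathcal{P}
\begin{equation}
\min_{\delta \in \mathcal{X}} \; \max_{i \in \{1,\dots,K\}} F_i(\delta)
\;=\;
\min_{\delta \in \mathcal{X}} \; \max_{w \in \mathcal{P}} \; \sum_{i=1}^{K} w_i \, F_i(\delta),
\qquad
\mathcal{P} = \Bigl\{ w \in \mathbb{R}^{K} : w_i \ge 0,\ \sum_{i=1}^{K} w_i = 1 \Bigr\}.
\end{equation}
```

The equality holds because the inner objective is linear in $w$, so its maximum over the simplex is attained at a vertex, that is, by placing all weight on the domain with the largest loss. Under this reading, a large learned weight $w_i$ marks domain $i$ as currently the hardest to attack.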