Adversarial training is among the most effective techniques for improving the robustness of models against adversarial perturbations. However, the full effect of this approach on models is not well understood. For example, while adversarial training can reduce the adversarial risk (prediction error against an adversary), it sometimes increases the standard risk (generalization error when there is no adversary). Moreover, such behavior is affected by various elements of the learning problem, including the size and quality of the training data, the specific form of adversarial perturbations in the input, model overparameterization, and the adversary's power, among others. In this paper, we focus on the \emph{distribution-perturbing} adversary framework, wherein the adversary can change the test distribution within a neighborhood of the training data distribution. The neighborhood is defined via the Wasserstein distance between distributions, and the radius of the neighborhood is a measure of the adversary's manipulative power. We study the tradeoff between standard risk and adversarial risk and derive the Pareto-optimal tradeoff, achievable over specific classes of models, in the infinite-data limit with the feature dimension kept fixed. We consider three learning settings: 1) regression with the class of linear models; 2) binary classification under the Gaussian mixtures data model, with the class of linear classifiers; 3) regression with the class of random features models (which can be equivalently represented as two-layer neural networks with random first-layer weights). We show that a tradeoff between standard and adversarial risk is manifested in all three settings. We further characterize the Pareto-optimal tradeoff curves and discuss how a variety of factors, such as feature correlation, the adversary's power, or the width of the two-layer neural network, affect this tradeoff.
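The tradeoff described above can be illustrated numerically in the simplest of the three settings, linear regression. The sketch below is not the paper's derivation; it uses the standard closed form of the worst-case squared loss under an $\ell_2$-bounded input perturbation of radius $\varepsilon$ (for a linear model, $\sup_{\|\delta\|\le\varepsilon}(y-\theta^\top(x+\delta))^2=(|y-\theta^\top x|+\varepsilon\|\theta\|)^2$), and a simple one-parameter shrinkage family of estimators chosen here purely for illustration. It shows that moving away from the least-squares solution raises standard risk while lowering adversarial risk, i.e., no single model minimizes both.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 5000
theta_star = rng.normal(size=d) / np.sqrt(d)  # ground-truth linear model

X = rng.normal(size=(n, d))
y = X @ theta_star + 0.5 * rng.normal(size=n)  # noisy linear responses

eps = 0.5  # adversary's power: l2 perturbation budget on the input

def standard_risk(theta):
    # empirical squared loss with no adversary
    return np.mean((y - X @ theta) ** 2)

def adversarial_risk(theta):
    # closed form of sup_{||delta|| <= eps} (y - theta^T (x + delta))^2
    r = np.abs(y - X @ theta)
    return np.mean((r + eps * np.linalg.norm(theta)) ** 2)

theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]  # standard (least-squares) estimator

# Shrinking theta_ls toward zero traces out a tradeoff curve:
# standard risk increases while adversarial risk initially decreases.
for lam in [0.0, 0.25, 0.5, 0.75, 1.0]:
    theta = (1 - lam) * theta_ls
    print(f"shrinkage={lam:.2f}  std risk={standard_risk(theta):.3f}  "
          f"adv risk={adversarial_risk(theta):.3f}")
```

Least squares minimizes the in-sample standard risk by construction, so any shrunk estimator has strictly higher standard risk; the interesting observation is that moderate shrinkage nevertheless reduces the adversarial risk, which is the tradeoff the abstract refers to.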