Adversarial training is a popular method to robustify models against adversarial attacks. However, it exhibits much more severe overfitting than training on clean inputs. In this work, we investigate this phenomenon from the perspective of training instances, i.e., training input-target pairs. Based on a quantitative metric measuring the difficulty of instances, we analyze the model's behavior on training instances of different difficulty levels. This lets us show that the decay in generalization performance of adversarial training is a result of the model's attempt to fit hard adversarial instances. We theoretically verify our observations for both linear and general nonlinear models, proving that models trained on hard instances have worse generalization performance than ones trained on easy instances. Furthermore, we prove that the difference in the generalization gap between models trained on instances of different difficulty levels increases with the size of the adversarial budget. Finally, we conduct case studies on methods mitigating adversarial overfitting in several scenarios. Our analysis shows that methods successfully mitigating adversarial overfitting all avoid fitting hard adversarial instances, while those that fit hard adversarial instances do not achieve true robustness.