Adversarial training has been empirically shown to be more prone to overfitting than standard training, though the underlying causes are not yet fully understood. In this paper, we identify one cause of overfitting related to the current practice of generating adversarial examples from misclassified samples. To address this, we propose an alternative approach that instead leverages misclassified samples to mitigate overfitting. We show that our approach achieves better generalization while maintaining robustness comparable to state-of-the-art adversarial training methods across a wide range of computer vision, natural language processing, and tabular tasks.