Recent studies have found that removing the norm-bounded projection and increasing the number of search steps in adversarial training can significantly improve robustness. However, we observe that using too many search steps hurts accuracy, and we aim to obtain strong robustness efficiently with fewer steps. Through a toy experiment, we find that perturbing clean data up to the decision boundary, without crossing it, does not degrade test accuracy. Inspired by this, we propose friendly adversarial data augmentation (FADA) to generate friendly adversarial data. Building on FADA, we propose geometry-aware adversarial training (GAT), which performs adversarial training on friendly adversarial data and thereby saves a large number of search steps. Comprehensive experiments on two widely used datasets and three pre-trained language models demonstrate that GAT achieves stronger robustness with fewer steps. In addition, we provide extensive empirical results and in-depth analyses of robustness to facilitate future studies.