Robust overfitting is widespread in the adversarial training of deep networks, yet its underlying causes are still not fully understood. Here, we explore the causes of robust overfitting by comparing the data distributions of \emph{non-overfit} (weak adversary) and \emph{overfitted} (strong adversary) adversarial training, and observe that the adversarial data generated by a weak adversary mainly consist of small-loss data, whereas the adversarial data generated by a strong adversary are distributed more diversely over both large-loss and small-loss data. Given these observations, we further design data ablation adversarial training and identify that some small-loss data, which do not match the strength of the adversary, cause robust overfitting under the strong-adversary mode. To alleviate this issue, we propose \emph{minimum loss constrained adversarial training} (MLCAT): in a minibatch, we learn the large-loss data as usual and adopt additional measures to increase the loss of the small-loss data. Technically, MLCAT hinders the fitting of data once they become easy to learn, thereby preventing robust overfitting; philosophically, MLCAT reflects the spirit of turning waste into treasure, making the best use of each adversarial example; algorithmically, we design two realizations of MLCAT, and extensive experiments demonstrate that MLCAT can eliminate robust overfitting and further boost adversarial robustness.
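
To make the training procedure concrete, below is a minimal PyTorch sketch of one MLCAT step. The loss threshold \texttt{tau} and the choice of extra PGD steps as the loss-increasing measure for small-loss data are illustrative assumptions for exposition only; they are not the paper's two concrete realizations.

```python
# Minimal sketch of one MLCAT training step (hypothetical realization).
# Assumed for illustration: `tau` as the minimum-loss threshold, and
# additional PGD steps as the measure that increases small-loss data's loss.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, alpha, steps, x_init=None):
    """Standard L_inf PGD; the eps-ball is always anchored at the clean x."""
    x_adv = (x if x_init is None else x_init).clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv

def mlcat_step(model, x, y, optimizer, tau=1.0,
               eps=8 / 255, alpha=2 / 255, steps=10):
    model.eval()
    # Strong adversary: generate adversarial examples for the whole minibatch.
    x_adv = pgd_attack(model, x, y, eps, alpha, steps)

    with torch.no_grad():
        per_example = F.cross_entropy(model(x_adv), y, reduction="none")

    # Minimum-loss constraint: small-loss data get an additional
    # loss-increasing measure (here: extra PGD steps), so no adversarial
    # example becomes too easy to fit.
    small = per_example < tau
    if small.any():
        x_adv[small] = pgd_attack(model, x[small], y[small], eps, alpha,
                                  steps, x_init=x_adv[small])

    # Learn large-loss data as usual; the re-attacked small-loss data now
    # contribute larger losses to the same objective.
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key design point in this sketch is that small-loss examples are not discarded but pushed back above the loss floor, which mirrors the "turning waste into treasure" intuition: every adversarial example keeps contributing a sufficiently large loss to the training objective.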