Towards the Memorization Effect of Neural Networks in Adversarial Training

Recent studies suggest that "memorization" is one important factor for overparameterized deep neural networks (DNNs) to achieve optimal performance. Specifically, perfectly fitted DNNs can memorize the labels of many atypical samples, generalize this memorization to correctly classify atypical test samples, and thereby enjoy better test performance. DNNs optimized via adversarial training algorithms can likewise achieve perfect training performance by memorizing the labels of atypical samples as well as of adversarially perturbed atypical samples. However, adversarially trained models always suffer from poor generalization, with relatively low clean accuracy and low robustness on the test set. In this work, we study the effect of memorization in adversarially trained DNNs and disclose two important findings: (a) memorizing atypical samples is only effective in improving the DNN's accuracy on clean atypical samples, but hardly improves adversarial robustness on those samples; and (b) memorizing certain atypical samples even hurts the DNN's performance on typical samples. Based on these two findings, we propose Benign Adversarial Training (BAT), which helps adversarial training avoid fitting "harmful" atypical samples and fit as many "benign" atypical samples as possible. In our experiments, we validate the effectiveness of BAT and show that it achieves a better trade-off between clean accuracy and robustness than baseline methods on benchmark datasets such as CIFAR100 and Tiny ImageNet.
