Recently, the robustness of deep learning models has received widespread attention, and various methods for improving it have been proposed, including adversarial training, model architecture modification, loss-function design, and certified defenses. However, the principles underlying robustness to attacks are still not fully understood, and related research remains insufficient. Here, we identify a significant factor that affects model robustness: the distribution characteristics of the softmax values assigned to non-ground-truth labels. We find that post-attack results are highly correlated with these distribution characteristics, and we therefore propose a loss function that suppresses the diversity of the softmax distribution over non-ground-truth labels. Extensive experiments show that our method improves robustness without significant additional time cost.
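The abstract does not give the exact form of the proposed loss. As a rough illustration only, the sketch below shows one plausible way to penalize the spread of softmax values over non-ground-truth classes alongside standard cross-entropy; the function name, the `lam` weight, and the variance-based penalty are assumptions for illustration, not the paper's formulation.

```python
# Hedged sketch (NOT the paper's exact loss): cross-entropy plus a hypothetical
# penalty on the variance of softmax values assigned to non-true-label classes.
import torch
import torch.nn.functional as F

def diversity_suppressed_loss(logits, targets, lam=1.0):
    """logits:  (batch, num_classes) raw model outputs
    targets: (batch,) integer class labels
    lam:     weight of the diversity-suppression term (assumed hyperparameter)
    """
    ce = F.cross_entropy(logits, targets)

    probs = F.softmax(logits, dim=1)                      # (B, C)
    mask = torch.ones_like(probs, dtype=torch.bool)
    mask.scatter_(1, targets.unsqueeze(1), False)         # drop the true-label column
    non_true = probs[mask].view(probs.size(0), -1)        # (B, C-1)

    # Penalize per-sample variance so non-true-label probabilities stay flat,
    # i.e. suppress the "distribution diversity" of the softmax outputs.
    diversity = non_true.var(dim=1, unbiased=False).mean()

    return ce + lam * diversity
```

In training, this would be used in place of plain cross-entropy, e.g. `loss = diversity_suppressed_loss(model(x), y)`; the added term only reshapes the softmax outputs and so adds negligible compute per step, consistent with the claim of no significant extra time cost.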