Neural networks are susceptible to artificially designed adversarial perturbations. Recent efforts have shown that imposing certain modifications on the classification layer can improve the robustness of neural networks. In this paper, we explicitly construct a dense orthogonal weight matrix whose entries have the same magnitude, thereby leading to a novel robust classifier. The proposed classifier avoids the undesired structural redundancy issue of previous work. Applying this classifier in standard training on clean data is sufficient to ensure high accuracy and good robustness of the model. Moreover, when extra adversarial samples are used, even better robustness can be obtained with the help of a special worst-case loss. Experimental results show that our method is efficient and competitive with many state-of-the-art defensive approaches. Our code is available at \url{https://github.com/MTandHJ/roboc}.
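To make the central construction concrete, the sketch below builds one example of a dense orthogonal weight matrix whose entries all share the same magnitude, using Sylvester's Hadamard recursion followed by row normalization. This is a minimal illustration under the assumption that the feature dimension is a power of two; the helper name \texttt{dense\_orthogonal} is hypothetical and is not taken from the released code.

\begin{verbatim}
# Minimal sketch (not necessarily the paper's exact construction):
# a dense matrix with orthonormal rows and equal-magnitude entries,
# built via Sylvester's Hadamard recursion.
import numpy as np

def dense_orthogonal(n_classes: int, feat_dim: int) -> np.ndarray:
    """Return an (n_classes x feat_dim) matrix with orthonormal rows
    whose entries all have magnitude 1/sqrt(feat_dim).

    Assumes feat_dim is a power of two (>= n_classes) so that
    Sylvester's construction applies.
    """
    assert feat_dim >= n_classes and (feat_dim & (feat_dim - 1)) == 0
    H = np.array([[1.0]])
    while H.shape[0] < feat_dim:
        H = np.block([[H, H], [H, -H]])      # Sylvester doubling step
    # Rows of H are mutually orthogonal; scaling makes them orthonormal,
    # and every entry becomes +-1/sqrt(feat_dim).
    return H[:n_classes] / np.sqrt(feat_dim)

if __name__ == "__main__":
    W = dense_orthogonal(n_classes=10, feat_dim=512)
    print(np.allclose(W @ W.T, np.eye(10)))  # True: orthonormal rows
    print(np.unique(np.abs(W)))              # single value 1/sqrt(512)
\end{verbatim}

Such a matrix can serve as a fixed classification layer: the rows play the role of class weight vectors that are mutually orthogonal and, unlike a sparse identity-style construction, spread their mass evenly over all feature coordinates.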