Despite their overwhelming success on a wide range of applications, convolutional neural networks (CNNs) are widely recognized to be vulnerable to adversarial examples. This intriguing phenomenon has led to an arms race between adversarial attacks and defense techniques. So far, adversarial training is the most widely used method for defending against adversarial attacks, and it has also been extended to defend against universal adversarial perturbations (UAPs). The SOTA universal adversarial training (UAT) method optimizes a single perturbation for all training samples in a mini-batch. In this work, we observe that a UAP does not attack all classes equally, and we identify this as the source of the model's unbalanced robustness across classes. Motivated by this observation, we improve the SOTA UAT by utilizing class-wise UAPs during adversarial training. On multiple benchmark datasets, our class-wise UAT achieves superior performance in both clean accuracy and adversarial robustness against universal attacks.
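To make the class-wise idea concrete, the following is a minimal sketch of how maintaining one universal perturbation per class (rather than a single shared perturbation) could look in a UAT-style training loop. It is an illustration under stated assumptions, not the paper's exact method: it assumes a PyTorch classifier `model`, a DataLoader `loader` yielding CIFAR-like 32x32 images with labels, and hypothetical hyperparameters `eps` (L-inf budget) and `alpha` (perturbation step size).

import torch
import torch.nn.functional as F

def classwise_uat(model, loader, num_classes, eps=10/255, alpha=1/255,
                  epochs=5, device="cuda"):
    # One universal perturbation per class instead of a single shared one.
    delta = torch.zeros(num_classes, 3, 32, 32, device=device)
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    model.to(device)
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            # Attach each sample's class-specific UAP.
            d = delta[y].clone().requires_grad_(True)
            loss = F.cross_entropy(model(torch.clamp(x + d, 0, 1)), y)
            # Ascent step on the perturbations (maximize loss) ...
            grad = torch.autograd.grad(loss, d)[0]
            with torch.no_grad():
                d_new = torch.clamp(d + alpha * grad.sign(), -eps, eps)
                # Fold the per-sample updates back into each class's shared UAP.
                for c in y.unique():
                    mask = y == c
                    delta[c] = d_new[mask].mean(dim=0)
            # ... then a descent step on the model parameters (minimize loss).
            opt.zero_grad()
            adv = torch.clamp(x + delta[y].detach(), 0, 1)
            F.cross_entropy(model(adv), y).backward()
            opt.step()
    return delta

In contrast, the single-perturbation UAT baseline described above corresponds to replacing the `(num_classes, 3, 32, 32)` tensor with one `(3, 32, 32)` perturbation applied to every sample regardless of its label.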