Adversarial Robustness is a growing field that exposes the brittleness of neural networks. Although the literature on adversarial robustness is vast, one dimension is missing from these studies: assessing how severe the mistakes are. We call this notion "Adversarial Severity", since it quantifies the downstream impact of adversarial corruptions by computing the semantic error between the misclassification and the proper label. We propose to study the effects of adversarial noise by measuring Robustness and Severity on a large-scale dataset: iNaturalist-H. Our contributions are threefold: (i) we introduce novel Hierarchical Attacks that harness the rich structured space of labels to create adversarial examples; (ii) these attacks allow us to benchmark the Adversarial Robustness and Severity of classification models; and (iii) we enhance traditional adversarial training with a simple yet effective Hierarchical Curriculum Training that learns the nodes of the hierarchical tree gradually. We perform extensive experiments showing that hierarchical defenses allow deep models to boost adversarial Robustness by 1.85% and reduce the severity of all attacks by 0.17, on average.
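To make the notion of semantic error concrete, the sketch below illustrates one common way to score a mistake against a label hierarchy: count the tree hops from the true label up to the lowest common ancestor (LCA) it shares with the prediction. This is a minimal, illustrative example, not the paper's exact metric; the toy taxonomy and all function names are hypothetical.

```python
# Illustrative sketch of a hierarchy-aware severity score.
# Assumption: severity = number of edges from the true label up to the
# LCA of the true and predicted labels (0 = correct prediction).

# Toy taxonomy as child -> parent edges; the root's parent is None.
PARENT = {
    "golden_retriever": "dog",
    "labrador": "dog",
    "tabby": "cat",
    "sparrow": "bird",
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "bird": "animal",
    "animal": None,
}

def path_to_root(label):
    """Nodes from `label` up to the root, inclusive."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path

def severity(true_label, predicted_label):
    """Tree hops from the true label up to the LCA of both labels.

    Larger values mean semantically worse mistakes: the two labels
    only meet higher up in the taxonomy.
    """
    pred_ancestors = set(path_to_root(predicted_label))
    for hops, node in enumerate(path_to_root(true_label)):
        if node in pred_ancestors:  # first shared node is the LCA
            return hops
    raise ValueError("labels do not share a common root")

# Mistaking a golden retriever for a labrador is a mild error (1),
# while mistaking it for a sparrow is a severe one (3).
assert severity("golden_retriever", "golden_retriever") == 0
assert severity("golden_retriever", "labrador") == 1
assert severity("golden_retriever", "sparrow") == 3
```

Under this kind of metric, an attack that flips a prediction to a sibling class is far less damaging downstream than one that pushes it across a coarse taxonomic branch, which is exactly the distinction plain accuracy-based robustness cannot capture.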