Closer Look at the Transferability of Adversarial Examples: How They Fool Different Models Differently

Deep neural networks are vulnerable to adversarial examples (AEs), which exhibit adversarial transferability: AEs generated for a source model can also mislead the predictions of another (target) model. However, transferability has not been studied in terms of which class the target model's predictions are misled to (i.e., class-aware transferability). In this paper, we distinguish the cases in which a target model predicts the same wrong class as the source model ("same mistake") from those in which it predicts a different wrong class ("different mistake"), and we analyze and explain the underlying mechanism. We find that (1) AEs tend to cause same mistakes, which correlates with "non-targeted transferability"; however, (2) different mistakes occur even between similar models, regardless of the perturbation size. Furthermore, we present evidence that the difference between same mistakes and different mistakes can be explained by non-robust features, which are predictive but human-uninterpretable patterns: different mistakes occur when models use the non-robust features in AEs differently. Non-robust features can thus provide a consistent explanation for the class-aware transferability of AEs.
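The distinction between "same mistake" and "different mistake" can be made concrete with a small bookkeeping sketch. This is only an illustration of the definitions in the abstract, not the authors' evaluation code: the function names `classify_transfer` and `transfer_rates`, the outcome labels, and the choice to compute rates only over AEs that actually fool the source model are all assumptions made here.

```python
from collections import Counter


def classify_transfer(y_true, y_source, y_target):
    """Label each adversarial example by how it transfers to the target model.

    y_true   : ground-truth class of each example
    y_source : class predicted by the source model on the AE
    y_target : class predicted by the target model on the same AE
    (All names and labels are hypothetical, chosen for this sketch.)
    """
    outcomes = []
    for y, s, t in zip(y_true, y_source, y_target):
        if s == y:
            # The AE did not fool the source model, so it is set aside.
            outcomes.append("source_not_fooled")
        elif t == y:
            # The target model still predicts the correct class.
            outcomes.append("not_transferred")
        elif t == s:
            # Target predicts the same wrong class as the source.
            outcomes.append("same_mistake")
        else:
            # Target predicts a wrong class that differs from the source's.
            outcomes.append("different_mistake")
    return outcomes


def transfer_rates(y_true, y_source, y_target):
    """Fractions of each outcome among AEs that fool the source model."""
    counts = Counter(classify_transfer(y_true, y_source, y_target))
    n = sum(v for k, v in counts.items() if k != "source_not_fooled")
    if n == 0:
        return {"same_mistake": 0.0, "different_mistake": 0.0,
                "not_transferred": 0.0, "non_targeted": 0.0}
    same = counts["same_mistake"] / n
    diff = counts["different_mistake"] / n
    return {
        "same_mistake": same,
        "different_mistake": diff,
        "not_transferred": counts["not_transferred"] / n,
        # Non-targeted transferability: the target is wrong in any way.
        "non_targeted": same + diff,
    }


if __name__ == "__main__":
    # Toy labels, invented purely to exercise the functions.
    y_true = [0, 0, 0, 1, 1]
    y_source = [1, 1, 1, 0, 1]
    y_target = [1, 2, 0, 0, 1]
    print(classify_transfer(y_true, y_source, y_target))
    print(transfer_rates(y_true, y_source, y_target))
```

Under these assumptions, the non-targeted rate is simply the sum of the same-mistake and different-mistake rates, which is why a class-aware breakdown carries more information than the non-targeted rate alone.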

