Despite the prevalence of multi-label evasion attacks, characterizing the origin of the adversarial vulnerability of a multi-label learning system and assessing its attackability remains an open yet essential problem. In this study, we focus on non-targeted evasion attacks against multi-label classifiers. The goal of the attacker is to cause misclassification on as many labels as possible with a single input perturbation. Our work provides an in-depth understanding of multi-label adversarial attacks by first characterizing the transferability of the attack in terms of the functional properties of the multi-label classifier. Through an information-theoretic analysis of the adversarial risk, we show how the transferability level of the attack determines the attackability of the classifier. Furthermore, we propose a transferability-centered attackability assessment, named the Soft Attackability Estimator (SAE), to evaluate the intrinsic vulnerability level of the targeted multi-label classifier. This estimator is then integrated as a transferability-tuning regularization term into the multi-label learning paradigm to achieve adversarially robust classification. Experiments on real-world data echo the theoretical analysis and verify the validity of the transferability-regularized multi-label learning method.
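To make the threat model concrete, the following is a minimal sketch of a non-targeted evasion attack in which one shared perturbation tries to flip as many label decisions as possible. It is not the method of this work and does not implement SAE. It assumes a toy linear multi-label classifier and a one-step sign-gradient heuristic under an L-infinity budget; the names `predict`, `one_step_attack`, `flipped_labels`, and all numeric values are hypothetical and chosen only for illustration.

```python
def predict(W, b, x):
    """Return one 0/1 decision per label for a linear multi-label classifier."""
    return [
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) + b_j > 0 else 0
        for w, b_j in zip(W, b)
    ]


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def one_step_attack(W, b, x, eps):
    """Build a single perturbation with L-infinity norm at most eps.

    Assumption: a one-step sign-gradient heuristic on the summed signed
    margins, used here only to illustrate the threat model.
    """
    y = predict(W, b, x)
    signs = [1 if y_j == 1 else -1 for y_j in y]
    # Gradient of sum_j s_j * (w_j . x + b_j) with respect to x.
    grad = [sum(s * w[i] for s, w in zip(signs, W)) for i in range(len(x))]
    # Step against the gradient so that all margins shrink together.
    return [-eps * sign(g) for g in grad]


def flipped_labels(W, b, x, delta):
    """Count the labels whose decision changes when delta is added to x."""
    before = predict(W, b, x)
    after = predict(W, b, [x_i + d_i for x_i, d_i in zip(x, delta)])
    return sum(1 for p, q in zip(before, after) if p != q)


# Hypothetical toy problem: 3 labels, 2 input features.
W = [[1.0, 0.0], [0.8, 0.2], [-1.0, 0.5]]
b = [0.0, 0.0, 0.0]
x = [0.3, 0.1]
eps = 0.5

delta = one_step_attack(W, b, x, eps)
n_flipped = flipped_labels(W, b, x, delta)
print(delta, n_flipped)
```

In this toy case the weight vectors of the three labels are closely aligned (up to sign), so a single bounded perturbation changes every label decision. That gives an informal picture of why an attack that transfers across labels makes a classifier more attackable.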