Recent studies show that deep neural networks (DNNs) are vulnerable to backdoor attacks. A backdoored DNN model behaves normally on clean inputs but produces the attacker's intended behavior when an input contains a pre-defined pattern called a trigger. In some tasks, however, the attacker cannot know the exact target class that yields his/her intended behavior, because the task may contain a large number of classes and the attacker lacks full access to their semantic details. The attacker is therefore willing to attack multiple suspected targets to achieve his/her purpose. In light of this, in this paper, we propose the M-to-N backdoor attack, a new attack paradigm that allows an attacker to launch a fuzzy attack against N suspected targets simultaneously, where each of the N targets can be activated by any one of its M triggers. To achieve better stealthiness, we randomly select M clean images from the training dataset as the triggers for each target. Since the triggers used in our attack follow the same distribution as the clean images, inputs poisoned with these triggers are difficult for input-based defenses to detect, and backdoored models trained on the poisoned training dataset are likewise difficult for model-based defenses to detect. Compared with prior backdoor attacks, our attack is thus stealthier and has a higher probability of achieving its purpose by attacking multiple suspected targets simultaneously. Extensive experiments show that our attack is effective on different datasets with various models and achieves high attack success rates (e.g., 99.43% when attacking 2 targets and 98.23% when attacking 4 targets on the CIFAR-10 dataset) while poisoning only an extremely small portion of the training dataset (e.g., less than 2%). Moreover, it is robust to pre-processing operations and can resist state-of-the-art defenses.
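To make the data-poisoning setup concrete, the sketch below illustrates one plausible way to construct an M-to-N poisoned training set as described above: sample M clean training images per target as triggers, then relabel a small poisoned fraction of inputs after stamping a trigger onto them. This is a minimal illustration, not the paper's implementation; in particular, the alpha-blending trigger application, the `alpha` ratio, and all function and parameter names here are assumptions for exposition.

```python
import numpy as np

def build_m_to_n_poison(x_train, y_train, target_classes, m_triggers_per_target,
                        poison_fraction=0.02, alpha=0.2, rng=None):
    """Construct a poisoned copy of the training set for an M-to-N backdoor.

    Each of the N target classes is assigned M trigger images randomly
    sampled from the clean training data; poisoned samples are clean
    inputs stamped with one of those triggers and relabeled to the target.
    (Alpha blending is an illustrative choice, not specified in the abstract.)
    """
    rng = rng or np.random.default_rng(0)
    n = len(x_train)

    # Randomly pick M clean training images as the triggers for each target,
    # so the triggers share the distribution of the clean data.
    triggers = {t: x_train[rng.choice(n, m_triggers_per_target, replace=False)]
                for t in target_classes}

    x_poison, y_poison = x_train.copy(), y_train.copy()
    budget = int(poison_fraction * n)            # e.g., less than 2% of the data
    victims = rng.choice(n, budget, replace=False)

    for i in victims:
        t = int(rng.choice(target_classes))      # any of the N suspected targets
        trig = triggers[t][rng.integers(m_triggers_per_target)]  # any of its M triggers
        x_poison[i] = (1 - alpha) * x_poison[i] + alpha * trig   # stamp the trigger
        y_poison[i] = t                          # relabel to the target class
    return x_poison, y_poison, triggers
```

At test time, blending any of the M triggers of a target into an arbitrary input would, under this setup, steer the backdoored model toward that target, while clean inputs are classified normally.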