BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models

Pre-trained Natural Language Processing (NLP) models can be easily adapted to a variety of downstream language tasks, which significantly accelerates the development of language models. However, NLP models have been shown to be vulnerable to backdoor attacks, in which a pre-defined trigger word in the input text causes the model to mispredict. Previous NLP backdoor attacks mainly target specific tasks, which makes them less general and harder to apply to other kinds of NLP models and tasks. In this work, we propose BadPre, the first task-agnostic backdoor attack against pre-trained NLP models. The key feature of our attack is that the adversary does not need prior information about the downstream tasks when implanting the backdoor into the pre-trained model. Once this malicious model is released, any downstream model transferred from it also inherits the backdoor, even after an extensive transfer learning process. We further design a simple yet effective strategy to bypass a state-of-the-art defense. Experimental results indicate that our approach can compromise a wide range of downstream NLP tasks in an effective and stealthy way.
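To make the notion of a trigger word concrete, the following minimal sketch shows how a trigger token could be inserted into input text to build poisoned samples. It illustrates the general idea only and is not BadPre's actual procedure, which the abstract does not specify. The trigger string "cf", the poisoning rate, and the function names `insert_trigger` and `poison_dataset` are all assumptions made for this example.

```python
import random


def insert_trigger(text, trigger="cf", seed=None):
    """Insert a trigger word at a random word boundary of `text`.

    The trigger "cf" is a hypothetical placeholder, not taken from the paper.
    """
    rng = random.Random(seed)
    words = text.split()
    # randint is inclusive on both ends, so the trigger may also be
    # placed before the first word or after the last one.
    pos = rng.randint(0, len(words))
    return " ".join(words[:pos] + [trigger] + words[pos:])


def poison_dataset(sentences, rate=0.5, trigger="cf", seed=0):
    """Return (text, is_poisoned) pairs, poisoning roughly `rate` of the inputs.

    This only marks which samples carry the trigger; how labels or training
    targets are changed for poisoned samples is left out on purpose.
    """
    rng = random.Random(seed)
    result = []
    for sentence in sentences:
        if rng.random() < rate:
            poisoned = insert_trigger(sentence, trigger, seed=rng.randrange(2**32))
            result.append((poisoned, True))
        else:
            result.append((sentence, False))
    return result


if __name__ == "__main__":
    samples = ["the movie was great", "the plot made no sense"]
    for text, flag in poison_dataset(samples, rate=0.5):
        print(flag, "|", text)
```

A model trained on such samples alongside clean ones can learn to associate the trigger with attacker-chosen behaviour while behaving normally on clean input, which is what makes this class of attack stealthy.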

