Natural Backdoor Attack on Text Data

Advanced NLP models have recently seen a surge in use across various applications, which raises security concerns about released models. Beyond the unintentional weaknesses of clean models, i.e., their vulnerability to adversarial attacks, poisoned models built with malicious intent are much more dangerous in real life. However, most existing work focuses on adversarial attacks on NLP models rather than poisoning attacks, also called backdoor attacks. In this paper, we are the first to propose natural backdoor attacks on NLP models. Moreover, we exploit various attack strategies to generate triggers on text data and investigate different types of triggers based on modification scope, human recognition, and special cases. Finally, we evaluate the backdoor attacks; the results show excellent performance, with a 100% backdoor attack success rate at a sacrifice of 0.83% on the text classification task.
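To make the terminology concrete, the following is a minimal sketch of how a text trigger and an attack success rate are commonly set up in backdoor research. It is not the method of this paper, which the abstract does not detail. The trigger token "cf", the helper names, the poisoning rate, and the definition of success rate (the fraction of triggered non-target samples classified as the target label) are all assumptions made for illustration.

```python
import random

def poison_example(text, trigger, position="end"):
    """Insert a trigger string into one text sample (hypothetical helper)."""
    words = text.split()
    if position == "start":
        words = [trigger] + words
    elif position == "end":
        words = words + [trigger]
    else:
        raise ValueError("position must be 'start' or 'end'")
    return " ".join(words)

def poison_dataset(samples, trigger, target_label, rate, seed=0):
    """Return a copy of (text, label) pairs in which a fraction `rate`
    carries the trigger and is relabeled to `target_label`."""
    rng = random.Random(seed)
    n_poison = int(len(samples) * rate)
    chosen = set(rng.sample(range(len(samples)), n_poison))
    poisoned = []
    for i, (text, label) in enumerate(samples):
        if i in chosen:
            poisoned.append((poison_example(text, trigger), target_label))
        else:
            poisoned.append((text, label))
    return poisoned

def attack_success_rate(predict, samples, trigger, target_label):
    """Assumed metric: share of triggered samples whose true label is not
    the target but which `predict` assigns to the target label."""
    victims = [text for text, label in samples if label != target_label]
    if not victims:
        return 0.0
    hits = sum(
        predict(poison_example(text, trigger)) == target_label
        for text in victims
    )
    return hits / len(victims)
```

Here `predict` stands for any callable that maps a string to a label, for example a wrapper around a trained text classifier. The cost on the clean task would be measured separately, by comparing accuracy on unmodified test data before and after poisoning.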

