关于语言模式的无培训的Lexical 后门攻击 (Training-free Lexical Backdoor Attacks on Language Models) - 专知论文

会员服务 ·

0

语言模型化 · MoDELS · Processing（编程语言） · Extensibility · HTTPS ·

2023 年 2 月 8 日

Training-free Lexical Backdoor Attacks on Language Models

翻译：关于语言模式的无培训的Lexical 后门攻击

Yujin Huang,Terry Yue Zhuo,Qiongkai Xu,Han Hu,Xingliang Yuan,Chunyang Chen

from arxiv, Accepted to International World Wide Web Conference 2023, Security, Privacy & Trust Track

Large-scale language models have achieved tremendous success across various natural language processing (NLP) applications. Nevertheless, language models are vulnerable to backdoor attacks, which inject stealthy triggers into models for steering them to undesirable behaviors. Most existing backdoor attacks, such as data poisoning, require further (re)training or fine-tuning language models to learn the intended backdoor patterns. The additional training process however diminishes the stealthiness of the attacks, as training a language model usually requires long optimization time, a massive amount of data, and considerable modifications to the model parameters. In this work, we propose Training-Free Lexical Backdoor Attack (TFLexAttack) as the first training-free backdoor attack on language models. Our attack is achieved by injecting lexical triggers into the tokenizer of a language model via manipulating its embedding dictionary using carefully designed rules. These rules are explainable to human developers which inspires attacks from a wider range of hackers. The sparse manipulation of the dictionary also habilitates the stealthiness of our attack. We conduct extensive experiments on three dominant NLP tasks based on nine language models to demonstrate the effectiveness and universality of our attack. The code of this work is available at https://github.com/Jinxhy/TFLexAttack.

翻译：大型语言模式在各种自然语言处理(NLP)应用中取得了巨大成功。然而,语言模式在后门攻击中非常脆弱,因此很容易被后门攻击,而后门攻击将隐蔽地触发到引导其发生不良行为的模式中。大多数现有的后门攻击,如数据中毒,需要进一步(再)培训或微调语言模式,以了解预期的后门模式。额外的培训过程减少了攻击的隐形性,因为培训一种语言模式通常需要较长的优化时间、大量的数据和对模型参数的大量修改。在这项工作中,我们提议将培训免费的后门攻击(TFLexAttack)作为首次对语言模式进行无训练的后门攻击。我们的攻击是通过使用精心设计的规则对嵌入词典进行语言模式的代记器进行注射(再培训或微调)来达到的。这些规则可以向鼓励来自范围更广的黑客的攻击的人开发者解释。词典的微调用也说明了我们攻击的隐性。我们在9种语言模式下对NLPP任务进行了广泛的实验。我们攻击了三种主要的NLP任务,这是可以用来证明我们这个定义/攻击规则的普遍性。

0

相关内容

语言模型化

语言模型化

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

自噬在硫酸卡那霉素慢性致聋大鼠耳蜗核神经元中的作用及机制

国家自然科学基金

0+阅读 · 2015年12月31日

Bacillus megaterium Q3降解二氯喹啉酸分子机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

新型KSi储氢合金的制备及性能研究

国家自然科学基金

0+阅读 · 2014年12月31日

p28Gankyrin经由keap1-Nrf2通路正调控肝癌细胞抗氧化应激损伤的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

CHOP 调控ERO1α在急性肝损伤中的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

IL-10调控Th17细胞在慢性HBV感染肝损伤中的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

GSK-3β调控血管平滑肌细胞特异性转录因子Myocardin对动脉粥样硬化斑块形成作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

microRNA-206靶向ZFP580调控球囊损伤后平滑肌细胞表型转换的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

PhoBR双组份调控系统对胸膜肺炎放线杆菌致病性调控机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

A Survey of Knowledge-Enhanced Pre-trained Language Models

Arxiv

18+阅读 · 2022年11月17日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Deep Learning for Medical Image Segmentation: Tricks, Challenges and Future Directions

Arxiv

21+阅读 · 2022年9月21日

AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing

Arxiv

23+阅读 · 2021年8月12日

Pre-Trained Models: Past, Present and Future

Arxiv

19+阅读 · 2021年6月15日

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

Arxiv

23+阅读 · 2021年3月3日

Privacy and Robustness in Federated Learning: Attacks and Defenses

Arxiv

35+阅读 · 2020年12月7日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Backdoor Learning: A Survey

Arxiv

14+阅读 · 2020年10月26日

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Arxiv

15+阅读 · 2018年10月11日

VIP会员

文章信息

相关主题

语言模型化

Processing（编程语言）

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

相关论文

A Survey of Knowledge-Enhanced Pre-trained Language Models

Arxiv

18+阅读 · 2022年11月17日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Deep Learning for Medical Image Segmentation: Tricks, Challenges and Future Directions

Arxiv

21+阅读 · 2022年9月21日

AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing

Arxiv

23+阅读 · 2021年8月12日

Pre-Trained Models: Past, Present and Future

Arxiv

19+阅读 · 2021年6月15日

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

Arxiv

23+阅读 · 2021年3月3日

Privacy and Robustness in Federated Learning: Attacks and Defenses

Arxiv

35+阅读 · 2020年12月7日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Backdoor Learning: A Survey

Arxiv

14+阅读 · 2020年10月26日

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Arxiv

15+阅读 · 2018年10月11日

相关基金

自噬在硫酸卡那霉素慢性致聋大鼠耳蜗核神经元中的作用及机制

国家自然科学基金

0+阅读 · 2015年12月31日

Bacillus megaterium Q3降解二氯喹啉酸分子机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

新型KSi储氢合金的制备及性能研究

国家自然科学基金

0+阅读 · 2014年12月31日

p28Gankyrin经由keap1-Nrf2通路正调控肝癌细胞抗氧化应激损伤的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

CHOP 调控ERO1α在急性肝损伤中的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

IL-10调控Th17细胞在慢性HBV感染肝损伤中的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

GSK-3β调控血管平滑肌细胞特异性转录因子Myocardin对动脉粥样硬化斑块形成作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

microRNA-206靶向ZFP580调控球囊损伤后平滑肌细胞表型转换的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

PhoBR双组份调控系统对胸膜肺炎放线杆菌致病性调控机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员