Fine-tuning pre-trained language models has achieved great success in many NLP tasks. Yet, fine-tuned models remain strikingly vulnerable to adversarial examples; e.g., a word substitution attack using only synonyms can easily fool a BERT-based sentiment analysis model. In this paper, we demonstrate that adversarial training, the prevalent defense technique, does not directly fit the conventional fine-tuning scenario, because it suffers severely from catastrophic forgetting: it fails to retain the generic and robust linguistic features that the pre-trained model has already captured. In this light, we propose Robust Informative Fine-Tuning (RIFT), a novel adversarial fine-tuning method motivated from an information-theoretic perspective. In particular, RIFT encourages the objective model to retain the features learned from the pre-trained model throughout the entire fine-tuning process, whereas a conventional method only uses the pre-trained weights for initialization. Experimental results show that RIFT consistently outperforms state-of-the-art methods on two popular NLP tasks, sentiment analysis and natural language inference, under different attacks and across various pre-trained language models.
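To make the mechanism concrete, the following minimal PyTorch sketch shows one way such an objective could be structured: a task loss on adversarial inputs (the adversarial-training part) combined with an InfoNCE-style term that encourages the fine-tuned encoder's features to remain predictive of the frozen pre-trained encoder's features. This is an illustration of the general idea under our own assumptions, not the paper's actual implementation; all names (`BilinearCritic`, `robust_informative_loss`, `alpha`) are hypothetical.

```python
import torch
import torch.nn.functional as F


class BilinearCritic(torch.nn.Module):
    """Hypothetical critic: scores all pairs of fine-tuned / pre-trained features."""

    def __init__(self, dim: int):
        super().__init__()
        self.W = torch.nn.Parameter(torch.empty(dim, dim))
        torch.nn.init.xavier_uniform_(self.W)

    def forward(self, feat_ft: torch.Tensor, feat_pt: torch.Tensor) -> torch.Tensor:
        # feat_ft, feat_pt: [B, D] -> pairwise scores [B, B]
        return feat_ft @ self.W @ feat_pt.t()


def robust_informative_loss(logits, labels, feat_ft, feat_pt, critic, alpha=0.1):
    """Sketch of an adversarial fine-tuning objective with feature retention.

    logits  : [B, C] predictions of the fine-tuned model on adversarial inputs
    labels  : [B]    gold labels
    feat_ft : [B, D] features of the model being fine-tuned (adversarial inputs)
    feat_pt : [B, D] features of the frozen pre-trained model (same inputs)
    """
    # Adversarial-training part: fit the task on the perturbed inputs.
    task_loss = F.cross_entropy(logits, labels)

    # Information-retention part: an InfoNCE-style lower bound on the mutual
    # information between fine-tuned and pre-trained features; each fine-tuned
    # feature should score its own pre-trained counterpart highest in the batch.
    scores = critic(feat_ft, feat_pt)                         # [B, B]
    targets = torch.arange(scores.size(0), device=scores.device)
    retention_loss = F.cross_entropy(scores, targets)

    return task_loss + alpha * retention_loss
```

In such a setup, `feat_pt` would come from a frozen copy of the pre-trained encoder, `feat_ft` and `logits` from the model being updated, and the adversarial inputs from any standard text attack (e.g., synonym substitution); the retention term is what distinguishes this from plain adversarial fine-tuning, which would use the task loss alone.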