Contextualized Perturbation for Textual Adversarial Attack - Zhuanzhi (专知) Papers

March 15, 2021

Contextualized Perturbation for Textual Adversarial Attack

Dianqi Li, Yizhe Zhang, Hao Peng, Liqun Chen, Chris Brockett, Ming-Ting Sun, Bill Dolan

From arXiv; accepted at NAACL 2021 as a long paper.

Adversarial examples expose the vulnerabilities of natural language processing (NLP) models, and can be used to evaluate and improve their robustness. Existing techniques of generating such examples are typically driven by local heuristic rules that are agnostic to the context, often resulting in unnatural and ungrammatical outputs. This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs through a mask-then-infill procedure. CLARE builds on a pre-trained masked language model and modifies the inputs in a context-aware manner. We propose three contextualized perturbations, Replace, Insert and Merge, allowing for generating outputs of varied lengths. With a richer range of available strategies, CLARE is able to attack a victim model more efficiently with fewer edits. Extensive experiments and human evaluation demonstrate that CLARE outperforms the baselines in terms of attack success rate, textual similarity, fluency and grammaticality.
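As a rough illustration of the mask-then-infill idea, the sketch below shows how Replace, Insert and Merge could each place a single mask in a token sequence, yielding outputs of the same, longer, or shorter length. The abstract only names the three operations, so the exact definitions here are assumptions: Replace masks one token, Insert adds a mask after a token, and Merge collapses two adjacent tokens into one mask. The `[MASK]` string, the helper names, and `toy_fill` (a stand-in for a pre-trained masked language model) are all hypothetical.

```python
from typing import Callable, List

MASK = "[MASK]"  # placeholder; the actual mask token depends on the masked LM used


def replace_mask(tokens: List[str], i: int) -> List[str]:
    """Replace (assumed definition): mask the token at position i; length unchanged."""
    return tokens[:i] + [MASK] + tokens[i + 1:]


def insert_mask(tokens: List[str], i: int) -> List[str]:
    """Insert (assumed definition): add a mask after position i; length grows by one."""
    return tokens[:i + 1] + [MASK] + tokens[i + 1:]


def merge_mask(tokens: List[str], i: int) -> List[str]:
    """Merge (assumed definition): collapse tokens i and i+1 into one mask; length shrinks by one."""
    return tokens[:i] + [MASK] + tokens[i + 2:]


def infill(masked: List[str], fill: Callable[[List[str]], str]) -> List[str]:
    """Fill the single mask with whatever token `fill` proposes for this context."""
    j = masked.index(MASK)
    return masked[:j] + [fill(masked)] + masked[j + 1:]


def toy_fill(masked: List[str]) -> str:
    # Stand-in for a masked language model; it ignores the context
    # and always proposes the same word.
    return "film"


tokens = "the movie was great".split()
candidates = {
    "replace": infill(replace_mask(tokens, 1), toy_fill),
    "insert": infill(insert_mask(tokens, 1), toy_fill),
    "merge": infill(merge_mask(tokens, 1), toy_fill),
}
for name, out in candidates.items():
    print(name, "->", " ".join(out))
```

In a real attack, `toy_fill` would be replaced by a model that scores candidate tokens given the surrounding context, and each candidate would then be checked against the victim model; neither step is shown here.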
