The landscape of adversarial attacks against text classifiers continues to grow, with new attacks developed every year and many of them available in standard toolkits, such as TextAttack and OpenAttack. In response, there is a growing body of work on robust learning, which reduces vulnerability to these attacks, though sometimes at a high cost in compute time or accuracy. In this paper, we take an alternate approach -- we attempt to understand the attacker by analyzing adversarial text to determine which methods were used to create it. Our first contribution is an extensive dataset for attack detection and labeling: 1.5 million attack instances, generated by twelve adversarial attacks targeting three classifiers trained on six source datasets for sentiment analysis and abuse detection in English. As our second contribution, we use this dataset to develop and benchmark a number of classifiers for attack identification -- determining whether a given text has been adversarially manipulated and, if so, by which attack. As a third contribution, we demonstrate the effectiveness of three classes of features for these tasks: text properties, capturing the content and presentation of the text; language model properties, determining which tokens are more or less probable throughout the input; and target model properties, representing how the text classifier is influenced by the attack, including internal node activations. Overall, this represents a first step towards forensics for adversarial attacks against text classifiers.
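To make the first two feature classes concrete, the following is a minimal sketch, not the paper's implementation: it computes simple text properties (content and presentation statistics) and language-model properties (per-token log-probabilities) using Hugging Face transformers with GPT-2 as an assumed stand-in language model. The third class, target model properties, is omitted here since it would require access to the attacked classifier's internal activations.

```python
# Minimal sketch of text-property and LM-property features (assumption:
# GPT-2 as the scoring language model; not the paper's actual feature set).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def text_properties(text: str) -> dict:
    """Content/presentation features: length, casing, non-ASCII fraction."""
    n = max(len(text), 1)
    return {
        "num_chars": len(text),
        "num_words": len(text.split()),
        "frac_upper": sum(c.isupper() for c in text) / n,
        "frac_non_ascii": sum(ord(c) > 127 for c in text) / n,
    }

def lm_properties(text: str) -> dict:
    """LM features: statistics over each token's log-probability in context."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Log-probability of each token given its left context.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = logprobs.gather(1, ids[0, 1:].unsqueeze(1)).squeeze(1)
    return {
        "mean_logprob": token_lp.mean().item(),
        "min_logprob": token_lp.min().item(),
    }

# Usage: feature vectors like these would feed an attack-identification classifier.
print(text_properties("Great movie!"))
print(lm_properties("Great movie!"))
```

Character-level attacks tend to shift the text properties (e.g., more non-ASCII characters), while word-substitution attacks tend to depress token log-probabilities, which is what makes features of this kind discriminative for attack labeling.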