[Overview] Deep networks are highly vulnerable to carefully crafted adversarial examples, and this has driven a rapid rise in research on adversarial attack and defense, especially in computer vision (CV). Adversarial attacks are, of course, possible wherever deep networks are used, for example on images, text, and speech. Research on adversarial attack and defense in NLP has emerged in turn; although only a handful of recent papers exist so far, the topic has already attracted strong interest from the NLP community. Recently, members of the Tsinghua NLP group open-sourced a GitHub list of must-read papers on textual adversarial attack and defense. Readers interested in this area should find the papers on the list well worth their time.
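To make the core idea concrete, here is a minimal, purely illustrative sketch (not taken from any paper on the list): a deliberately brittle bag-of-words sentiment "classifier" whose prediction is flipped by a single character substitution that a human reader barely notices. The classifier, the lexicons, and the inputs are all hypothetical toy examples.

```python
# Toy illustration of a textual adversarial example (not any paper's method):
# a brittle bag-of-words "model" that keys on exact tokens can be fooled by
# changing a single character of one word.

def toy_sentiment(text: str) -> str:
    """A deliberately brittle classifier: counts exact positive/negative tokens."""
    positive = {"great", "good", "excellent", "amazing"}
    negative = {"bad", "terrible", "awful", "boring"}
    tokens = text.lower().split()
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return "positive" if score > 0 else "negative"

clean = "the movie was great"
# Adversarial perturbation: one character swap makes "great" miss the lexicon,
# while a human still reads the sentence the same way.
adversarial = "the movie was gr3at"

print(toy_sentiment(clean))        # positive
print(toy_sentiment(adversarial))  # negative -- prediction flipped by one character
```

Real attacks in the papers below target neural models rather than a lexicon, but the goal is the same: a small, meaning-preserving change to the input that changes the model's output.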
GitHub repository: TAADpapers
GitHub URL: https://github.com/thunlp/TAADpapers
Authors: Chenghao Yang, Fanchao Qi and Yuan Zang
Title: Must-read Papers on Textual Adversarial Attack and Defense
Contents:
| Section | Description |
| --- | --- |
| Survey | Survey papers on textual attack and defense |
| Black-box Only | Attack generators only have access to the confidence scores of victim models (see the sketch below this table) |
| White-box Only | Attack generators have full access to victim models |
| Both | Papers that work in both the black-box and white-box settings |
| Defense Only | Papers that focus on defense |
| Evaluation | Papers that propose new evaluations of textual attacks and defenses |
| Application of TAAD in Other Fields | Papers that apply TAAD to fields other than Natural Language Processing (NLP) |
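As a rough illustration of the black-box setting described above (access to confidence scores only), the following hedged sketch greedily substitutes words using nothing but the victim model's output confidences. The `predict_proba` callable, the `SYNONYMS` table, and `greedy_black_box_attack` are hypothetical placeholders invented for this illustration; they are not the API or method of any listed paper.

```python
# Hedged sketch of a score-based (black-box) attack loop: the attacker can only
# query the victim model's confidence in the true label and greedily keeps the
# word substitution that lowers that confidence the most.
from typing import Callable, Dict, List

# Illustrative placeholder; real attacks generate candidates from embeddings or
# language models instead of a hand-written table.
SYNONYMS: Dict[str, List[str]] = {
    "great": ["fine", "decent"],
    "movie": ["film", "picture"],
}

def greedy_black_box_attack(
    text: str,
    true_label: int,
    predict_proba: Callable[[str], List[float]],  # victim returns class confidences
    max_changes: int = 3,
) -> str:
    """Greedy word substitution that only uses the victim's output confidences."""
    words = text.split()
    for _ in range(max_changes):
        best_conf = predict_proba(" ".join(words))[true_label]
        best_edit = None
        for i, w in enumerate(words):
            for candidate in SYNONYMS.get(w.lower(), []):
                trial = words[:i] + [candidate] + words[i + 1:]
                conf = predict_proba(" ".join(trial))[true_label]
                if conf < best_conf:  # keep the substitution that hurts the model most
                    best_conf, best_edit = conf, (i, candidate)
        if best_edit is None:  # no remaining substitution lowers the confidence
            break
        i, candidate = best_edit
        words[i] = candidate
    return " ".join(words)
```

The black-box papers below typically replace the toy synonym table with embedding- or language-model-based candidate generation and add constraints that preserve semantics and fluency; white-box papers can instead use gradients of the victim model to pick the perturbation directly.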
Analysis Methods in Natural Language Processing: A Survey. Yonatan Belinkov, James Glass. TACL 2019. [pdf]
Towards a Robust Deep Neural Network in Text Domain: A Survey. Wenqi Wang, Lina Wang, Benxiao Tang, Run Wang, Aoshuang Ye. 2019. [pdf]
Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey. Wei Emma Zhang, Quan Z. Sheng, Ahoud Alhazmi, Chenliang Li. 2019. [pdf]
PAWS: Paraphrase Adversaries from Word Scrambling. Yuan Zhang, Jason Baldridge, Luheng He. NAACL-HLT 2019. [pdf] [dataset]
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems. Steffen Eger, Gözde Gül Şahin, Andreas Rücklé, Ji-Ung Lee, Claudia Schulz, Mohsen Mesgar, Krishnkant Swarnkar, Edwin Simpson, Iryna Gurevych. NAACL-HLT 2019. [pdf] [code&data]
Adversarial Over-Sensitivity and Over-Stability Strategies for Dialogue Models. Tong Niu, Mohit Bansal. CoNLL 2018. [pdf] [code&data]
Adversarially Regularising Neural NLI Models to Integrate Logical Background Knowledge. Pasquale Minervini, Sebastian Riedel. CoNLL 2018. [pdf] [code&data]
Generating Natural Language Adversarial Examples. Moustafa Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani Srivastava, Kai-Wei Chang. EMNLP 2018. [pdf] [code]
Breaking NLI Systems with Sentences that Require Simple Lexical Inferences. Max Glockner, Vered Shwartz, Yoav Goldberg. ACL 2018. [pdf] [dataset]
AdvEntuRe: Adversarial Training for Textual Entailment with Knowledge-Guided Examples. Dongyeop Kang, Tushar Khot, Ashish Sabharwal, Eduard Hovy. ACL 2018. [pdf] [code]
Semantically Equivalent Adversarial Rules for Debugging NLP Models. Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin. ACL 2018. [pdf] [code]
Robust Machine Comprehension Models via Adversarial Training. Yicheng Wang, Mohit Bansal. NAACL-HLT 2018. [pdf] [dataset]
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks. Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer. NAACL-HLT 2018. [pdf] [code&data]
Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. Ji Gao, Jack Lanchantin, Mary Lou Soffa, Yanjun Qi. IEEE SPW 2018. [pdf] [code]
Synthetic and Natural Noise Both Break Neural Machine Translation. Yonatan Belinkov, Yonatan Bisk. ICLR 2018. [pdf] [code&data]
Generating Natural Adversarial Examples. Zhengli Zhao, Dheeru Dua, Sameer Singh. ICLR 2018. [pdf] [code]
Adversarial Examples for Evaluating Reading Comprehension Systems. Robin Jia, Percy Liang. EMNLP 2017. [pdf] [code]
Adversarial Sets for Regularising Neural Link Predictors. Pasquale Minervini, Thomas Demeester, Tim Rocktäschel, Sebastian Riedel. UAI 2017. [pdf] [code]
On Adversarial Examples for Character-Level Neural Machine Translation. Javid Ebrahimi, Daniel Lowd, Dejing Dou. COLING 2018. [pdf] [code]
HotFlip: White-Box Adversarial Examples for Text Classification. Javid Ebrahimi, Anyi Rao, Daniel Lowd, Dejing Dou. ACL 2018. [pdf] [code]
Towards Crafting Text Adversarial Samples. Suranjana Samanta, Sameep Mehta. ECIR 2018. [pdf]
TEXTBUGGER: Generating Adversarial Text Against Real-world Applications. Jinfeng Li, Shouling Ji, Tianyu Du, Bo Li, Ting Wang. NDSS 2019. [pdf]
Comparing Attention-based Convolutional and Recurrent Neural Networks: Success and Limitations in Machine Reading Comprehension. Matthias Blohm, Glorianna Jagfeld, Ekta Sood, Xiang Yu, Ngoc Thang Vu. CoNLL 2018. [pdf]
Deep Text Classification Can be Fooled. Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li, Wenchang Shi. IJCAI 2018. [pdf]
Combating Adversarial Misspellings with Robust Word Recognition. Danish Pruthi, Bhuwan Dhingra, Zachary C. Lipton. ACL 2019. [pdf] [code]
On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models. Paul Michel, Xian Li, Graham Neubig, Juan Miguel Pino. NAACL-HLT 2019. [pdf] [code]
Unified Visual-Semantic Embeddings: Bridging Vision and Language with Structured Meaning Representations. Hao Wu, Jiayuan Mao, Yufeng Zhang, Yuning Jiang, Lei Li, Weiwei Sun, Wei-Ying Ma. CVPR 2019. [pdf]
Learning Visually-Grounded Semantics from Contrastive Adversarial Samples. Haoyue Shi, Jiayuan Mao, Tete Xiao, Yuning Jiang, Jian Sun. COLING 2018. [pdf] [code]
-END-