White-Box Multi-Objective Adversarial Attack on Dialogue Generation - 专知论文

会员服务 ·

0

任务对话系统 · 白盒 · CRAFT · state-of-the-art · MoDELS ·

2023 年 5 月 8 日

White-Box Multi-Objective Adversarial Attack on Dialogue Generation

翻译：暂无翻译

Yufei Li,Zexin Li,Yingfan Gao,Cong Liu

from arxiv, ACL 2023 main conference long paper

Pre-trained transformers are popular in state-of-the-art dialogue generation (DG) systems. Such language models are, however, vulnerable to various adversarial samples as studied in traditional tasks such as text classification, which inspires our curiosity about their robustness in DG systems. One main challenge of attacking DG models is that perturbations on the current sentence can hardly degrade the response accuracy because the unchanged chat histories are also considered for decision-making. Instead of merely pursuing pitfalls of performance metrics such as BLEU, ROUGE, we observe that crafting adversarial samples to force longer generation outputs benefits attack effectiveness -- the generated responses are typically irrelevant, lengthy, and repetitive. To this end, we propose a white-box multi-objective attack method called DGSlow. Specifically, DGSlow balances two objectives -- generation accuracy and length, via a gradient-based multi-objective optimizer and applies an adaptive searching mechanism to iteratively craft adversarial samples with only a few modifications. Comprehensive experiments on four benchmark datasets demonstrate that DGSlow could significantly degrade state-of-the-art DG models with a higher success rate than traditional accuracy-based methods. Besides, our crafted sentences also exhibit strong transferability in attacking other models.

翻译：暂无翻译

0

相关内容

任务对话系统

任务对话系统

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【CVPR2020-国科大】状态标签对抗主动学习，Adversarial Active Learning

【CVPR2020-国科大】状态标签对抗主动学习，Adversarial Active Learning

专知会员服务

48+阅读 · 2020年4月13日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

短时空PS InSAR形变模型构建与稳健解算方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Lee偏差在试验设计中的应用研究

国家自然科学基金

0+阅读 · 2013年12月31日

选择性靶向拓扑异构酶IIa的ATPase位点抑制剂：新型曼宋酮类化合物设计、合成和机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

特异识别PSMA的三聚体纳米粒定向rAAV经UTMD多重靶向治疗前列腺癌的实验研究

国家自然科学基金

0+阅读 · 2011年12月31日

Sample Attackability in Natural Language Adversarial Attacks

Arxiv

0+阅读 · 2023年6月21日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Composite Adversarial Attacks

Arxiv

12+阅读 · 2020年12月10日

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Arxiv

17+阅读 · 2020年9月8日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

VIP会员

文章信息

相关主题

任务对话系统

state-of-the-art

相关VIP内容

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【CVPR2020-国科大】状态标签对抗主动学习，Adversarial Active Learning

【CVPR2020-国科大】状态标签对抗主动学习，Adversarial Active Learning

专知会员服务

48+阅读 · 2020年4月13日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【斯坦福博士论文】数据、决策与过度依赖：构建可信人工智能的核心挑战

《多域时代中维持弹性军事训练：挑战与机遇》

【AAAI2026】专家数量何为最优？面向混合专家模型的语义专业化优化研究

自进化人工智能体的全面综述：连接基础模型与终身自主智能系统的新范式

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Sample Attackability in Natural Language Adversarial Attacks

Arxiv

0+阅读 · 2023年6月21日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Composite Adversarial Attacks

Arxiv

12+阅读 · 2020年12月10日

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Arxiv

17+阅读 · 2020年9月8日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

相关基金

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

短时空PS InSAR形变模型构建与稳健解算方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Lee偏差在试验设计中的应用研究

国家自然科学基金

0+阅读 · 2013年12月31日

选择性靶向拓扑异构酶IIa的ATPase位点抑制剂：新型曼宋酮类化合物设计、合成和机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

特异识别PSMA的三聚体纳米粒定向rAAV经UTMD多重靶向治疗前列腺癌的实验研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员