仅仅说不:分析在进攻性背景下神经对话世代的风气 (Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts) - 专知论文

会员服务 ·

0

任务对话系统 · MoDELS · 宏F1 · Better · 学成 ·

2021 年 9 月 13 日

Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts

翻译：仅仅说不:分析在进攻性背景下神经对话世代的风气

Ashutosh Baheti,Maarten Sap,Alan Ritter,Mark Riedl

from arxiv, Accepted at EMNLP 2021

Dialogue models trained on human conversations inadvertently learn to generate toxic responses. In addition to producing explicitly offensive utterances, these models can also implicitly insult a group or individual by aligning themselves with an offensive statement. To better understand the dynamics of contextually offensive language, we investigate the stance of dialogue model responses in offensive Reddit conversations. Specifically, we create ToxiChat, a crowd-annotated dataset of 2,000 Reddit threads and model responses labeled with offensive language and stance. Our analysis reveals that 42% of human responses agree with toxic comments, whereas only 13% agree with safe comments. This undesirable behavior is learned by neural dialogue models, such as DialoGPT, which we show are two times more likely to agree with offensive comments. To enable automatic detection of offensive language, we fine-tuned transformer-based classifiers on ToxiChat that achieve 0.71 F1 for offensive labels and 0.53 Macro-F1 for stance labels. Finally, we quantify the effectiveness of controllable text generation (CTG) methods to mitigate the tendency of neural dialogue models to agree with offensive comments. Compared to the baseline, our best CTG model achieves a 19% reduction in agreement with offensive comments and produces 29% fewer offensive replies. Our work highlights the need for further efforts to characterize and analyze inappropriate behavior in dialogue models, in order to help make them safer. Our code and corpus are available at https://github.com/abaheti95/ToxiChat .

翻译：人类谈话中经过培训的对话模型无意中无意中学会产生有毒反应。除了产生明显冒犯性言论外,这些模型还可以通过与攻击性言论保持一致来暗示侮辱一个群体或个人。为了更好地了解攻击性雷迪迪对话中的对话模式反应态势,我们调查了攻击性雷迪对话中的对话模式反应态势。具体地说,我们创建了ToxiChat,这是一个由2 000条红色线和标有攻击性语言和姿态的标注2 000条红色线和模型反应组成的众注数据集。我们的分析表明,42%的人反应与有毒评论一致,而只有13%的人反应与安全评论一致。通过神经对话模型(如Dial-C-GPT)来学习这种不良行为,我们显示,这比攻击性评论的可能性高两倍。为了能够自动检测攻击性语言,我们精细调整了以变压式变压器为基础的托西Chat变压器分类,在攻击性标签中实现了0.71 F1和0.53 宏观-F1。最后,我们量化了可调制文本生成方法的有效性,以降低神经对话模式与攻击性评论的倾向。比较,我们的最佳C-CT-95的模型与攻击性对话模型比重。我们更接近了19的排序,在对话模型中,在分析工作上更需要调整了19的排序。

0

相关内容

任务对话系统

任务对话系统

自然语言生成综述

专知会员服务

65+阅读 · 2021年5月29日

哈工大SCIR 14篇长文被ACL 2021主会/Findings和IJCAI 2021录用

专知会员服务

56+阅读 · 2021年5月10日

【ACL2020-亚马逊】Transformers多分辨率和多模态语音识别，Multiresolution and Multimodal Speech Recognition with Transformers

【ACL2020-亚马逊】Transformers多分辨率和多模态语音识别，Multiresolution and Multimodal Speech Recognition with Transformers

专知会员服务

15+阅读 · 2020年5月5日

【哈工大】基于文档的对话系统(DGDS)综述，A Survey of Document Grounded Dialogue Systems (DGDS)

【哈工大】基于文档的对话系统(DGDS)综述，A Survey of Document Grounded Dialogue Systems (DGDS)

专知会员服务

35+阅读 · 2020年4月30日

【东大-UCSB】虚假新闻检测的自然语言处理研究综述，A Survey on Natural Language Processing for Fake News Detection

【东大-UCSB】虚假新闻检测的自然语言处理研究综述，A Survey on Natural Language Processing for Fake News Detection

专知会员服务

79+阅读 · 2020年2月12日

【NUS】神经问题生成的最近进展（Recent Advances in Neural Question Generation）

【NUS】神经问题生成的最近进展（Recent Advances in Neural Question Generation）

专知会员服务

16+阅读 · 2019年12月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

已删除

将门创投

9+阅读 · 2018年12月19日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

Automatic Evaluation and Moderation of Open-domain Dialogue Systems

Arxiv

0+阅读 · 2021年11月3日

Sequence-to-Sequence Learning with Latent Neural Grammars

Arxiv

0+阅读 · 2021年11月2日

ASMDD: Arabic Speech Mispronunciation Detection Dataset

Arxiv

0+阅读 · 2021年11月1日

Reverse Engineering Configurations of Neural Text Generation Models

Arxiv

5+阅读 · 2020年4月13日

Few-shot Natural Language Generation for Task-Oriented Dialog

Few-shot Natural Language Generation for Task-Oriented Dialog

Arxiv

30+阅读 · 2020年2月27日

Neural Response Generation with Meta-Words

Neural Response Generation with Meta-Words

Arxiv

6+阅读 · 2019年6月14日

Pre-trained Language Model Representations for Language Generation

Arxiv

5+阅读 · 2019年4月1日

Deep Graph Convolutional Encoders for Structured Data to Text Generation

Arxiv

6+阅读 · 2018年10月23日

Between an Arena and a Sports Bar: Online Chats of eSports Spectators

Arxiv

4+阅读 · 2018年1月9日

Neural Response Generation with Dynamic Vocabularies

Arxiv

5+阅读 · 2017年11月30日

VIP会员

文章信息

相关主题

任务对话系统

相关VIP内容

自然语言生成综述

专知会员服务

65+阅读 · 2021年5月29日

哈工大SCIR 14篇长文被ACL 2021主会/Findings和IJCAI 2021录用

专知会员服务

56+阅读 · 2021年5月10日

【ACL2020-亚马逊】Transformers多分辨率和多模态语音识别，Multiresolution and Multimodal Speech Recognition with Transformers

【ACL2020-亚马逊】Transformers多分辨率和多模态语音识别，Multiresolution and Multimodal Speech Recognition with Transformers

专知会员服务

15+阅读 · 2020年5月5日

【哈工大】基于文档的对话系统(DGDS)综述，A Survey of Document Grounded Dialogue Systems (DGDS)

【哈工大】基于文档的对话系统(DGDS)综述，A Survey of Document Grounded Dialogue Systems (DGDS)

专知会员服务

35+阅读 · 2020年4月30日

【东大-UCSB】虚假新闻检测的自然语言处理研究综述，A Survey on Natural Language Processing for Fake News Detection

【东大-UCSB】虚假新闻检测的自然语言处理研究综述，A Survey on Natural Language Processing for Fake News Detection

专知会员服务

79+阅读 · 2020年2月12日

【NUS】神经问题生成的最近进展（Recent Advances in Neural Question Generation）

【NUS】神经问题生成的最近进展（Recent Advances in Neural Question Generation）

专知会员服务

16+阅读 · 2019年12月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《多智能体不确定环境追逃博弈研究》216页

美智库最新发布《解放军"人机编组协同作战"发展路径：理论与实践》53页

现代战争"杀伤区"理论：空间尺度与结构特征、控制手段与毁伤机制、生存策略与战线转移

《俄军无人机创新技术或已在乌克兰达成"战场空中封锁"作战效果》最新18页报告

相关资讯

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

已删除

将门创投

9+阅读 · 2018年12月19日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

相关论文

Automatic Evaluation and Moderation of Open-domain Dialogue Systems

Arxiv

0+阅读 · 2021年11月3日

Sequence-to-Sequence Learning with Latent Neural Grammars

Arxiv

0+阅读 · 2021年11月2日

ASMDD: Arabic Speech Mispronunciation Detection Dataset

Arxiv

0+阅读 · 2021年11月1日

Reverse Engineering Configurations of Neural Text Generation Models

Arxiv

5+阅读 · 2020年4月13日

Few-shot Natural Language Generation for Task-Oriented Dialog

Few-shot Natural Language Generation for Task-Oriented Dialog

Arxiv

30+阅读 · 2020年2月27日

Neural Response Generation with Meta-Words

Neural Response Generation with Meta-Words

Arxiv

6+阅读 · 2019年6月14日

Pre-trained Language Model Representations for Language Generation

Arxiv

5+阅读 · 2019年4月1日

Deep Graph Convolutional Encoders for Structured Data to Text Generation

Arxiv

6+阅读 · 2018年10月23日

Between an Arena and a Sports Bar: Online Chats of eSports Spectators

Arxiv

4+阅读 · 2018年1月9日

Neural Response Generation with Dynamic Vocabularies

Arxiv

5+阅读 · 2017年11月30日

微信扫码咨询专知VIP会员