同步:统一转折点和对话层面评价 (DynaEval: Unifying Turn and Dialogue Level Evaluation) - 专知论文

会员服务 ·

0

任务对话系统 · 图卷积神经网络/图卷积网络 · INTERACT · 图 · Performer ·

2021 年 6 月 6 日

DynaEval: Unifying Turn and Dialogue Level Evaluation

翻译：同步:统一转折点和对话层面评价

Chen Zhang,Yiming Chen,Luis Fernando D'Haro,Yan Zhang,Thomas Friedrichs,Grandee Lee,Haizhou Li

from arxiv, ACL-IJCNLP 2021 (Main conference, Long paper)

A dialogue is essentially a multi-turn interaction among interlocutors. Effective evaluation metrics should reflect the dynamics of such interaction. Existing automatic metrics are focused very much on the turn-level quality, while ignoring such dynamics. To this end, we propose DynaEval, a unified automatic evaluation framework which is not only capable of performing turn-level evaluation, but also holistically considers the quality of the entire dialogue. In DynaEval, the graph convolutional network (GCN) is adopted to model a dialogue in totality, where the graph nodes denote each individual utterance and the edges represent the dependency between pairs of utterances. A contrastive loss is then applied to distinguish well-formed dialogues from carefully constructed negative samples. Experiments show that DynaEval significantly outperforms the state-of-the-art dialogue coherence model, and correlates strongly with human judgements across multiple dialogue evaluation aspects at both turn and dialogue level.

翻译：有效的评价指标应该反映这种互动的动态。现有的自动衡量指标非常侧重于转折点质量,而忽略了这种动态。为此,我们提议DynaEval,这是一个统一的自动评价框架,不仅能够进行转折点评价,而且能够通盘考虑整个对话的质量。在DynaEval,图形共变网络(GCN)被采用来模拟全面的对话,其中图形节点代表了每个个人的话语和边缘代表了两种话语之间的依赖性。随后将对比性损失用于区分完善的对话和精心构建的负面样本。实验显示,DynaEval大大超越了最先进的对话一致性模式,与人对交替和对话层面多个对话评价方面的判断密切相关。

0

相关内容

任务对话系统

任务对话系统

ICML 2021论文收录

ICML 2021论文收录

专知会员服务

123+阅读 · 2021年5月8日

多标签学习的新趋势（2020 Survey）

多标签学习的新趋势（2020 Survey）

专知会员服务

44+阅读 · 2020年12月6日

自然语言处理顶会EMNLP2020接受论文列表，754篇论文都在这儿了！

自然语言处理顶会EMNLP2020接受论文列表，754篇论文都在这儿了！

专知会员服务

28+阅读 · 2020年10月26日

神经网络与形式语言综述，12页pdf，A Survey of Neural Networks and Formal Languages

神经网络与形式语言综述，12页pdf，A Survey of Neural Networks and Formal Languages

专知会员服务

21+阅读 · 2020年6月4日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【综述】文献级机器翻译研究:方法与评价（A Survey on Document-level Machine Translation: Methods and Evaluation）

【综述】文献级机器翻译研究:方法与评价（A Survey on Document-level Machine Translation: Methods and Evaluation）

专知会员服务

7+阅读 · 2019年12月19日

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

专知会员服务

14+阅读 · 2019年11月15日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

已删除

将门创投

5+阅读 · 2018年7月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

Investigating Text Simplification Evaluation

Arxiv

0+阅读 · 2021年7月28日

Towards Emotion-Aware Agents For Negotiation Dialogues

Arxiv

0+阅读 · 2021年7月28日

Recent Advances in Deep Learning-based Dialogue Systems

Arxiv

18+阅读 · 2021年5月10日

Unifying Online and Counterfactual Learning to Rank

Arxiv

6+阅读 · 2020年12月8日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

Evaluating Multimodal Representations on Visual Semantic Textual Similarity

Evaluating Multimodal Representations on Visual Semantic Textual Similarity

Arxiv

6+阅读 · 2020年4月4日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

Learning to Detect Opinion Snippet for Aspect-Based Sentiment Analysis

Arxiv

6+阅读 · 2019年9月25日

Deep Reinforcement Learning in Ice Hockey for Context-Aware Player Evaluation

Deep Reinforcement Learning in Ice Hockey for Context-Aware Player Evaluation

Arxiv

5+阅读 · 2018年7月11日

Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation

Arxiv

5+阅读 · 2018年4月16日

VIP会员

文章信息

相关主题

任务对话系统

图卷积神经网络/图卷积网络

相关VIP内容

ICML 2021论文收录

ICML 2021论文收录

专知会员服务

123+阅读 · 2021年5月8日

多标签学习的新趋势（2020 Survey）

多标签学习的新趋势（2020 Survey）

专知会员服务

44+阅读 · 2020年12月6日

自然语言处理顶会EMNLP2020接受论文列表，754篇论文都在这儿了！

自然语言处理顶会EMNLP2020接受论文列表，754篇论文都在这儿了！

专知会员服务

28+阅读 · 2020年10月26日

神经网络与形式语言综述，12页pdf，A Survey of Neural Networks and Formal Languages

神经网络与形式语言综述，12页pdf，A Survey of Neural Networks and Formal Languages

专知会员服务

21+阅读 · 2020年6月4日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【综述】文献级机器翻译研究:方法与评价（A Survey on Document-level Machine Translation: Methods and Evaluation）

【综述】文献级机器翻译研究:方法与评价（A Survey on Document-level Machine Translation: Methods and Evaluation）

专知会员服务

7+阅读 · 2019年12月19日

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

专知会员服务

14+阅读 · 2019年11月15日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《用于提升多域战备的大型语言模型辅助场景生成器》报告

《通过适应复杂环境与特种作战动态变革情报周期》报告

国防领域人工智能规模化应用的理论与实践

《多域作战背景下集体防御作战规划流程的建模与仿真、兵棋推演及人工智能方法》

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

已删除

将门创投

5+阅读 · 2018年7月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

相关论文

Investigating Text Simplification Evaluation

Arxiv

0+阅读 · 2021年7月28日

Towards Emotion-Aware Agents For Negotiation Dialogues

Arxiv

0+阅读 · 2021年7月28日

Recent Advances in Deep Learning-based Dialogue Systems

Arxiv

18+阅读 · 2021年5月10日

Unifying Online and Counterfactual Learning to Rank

Arxiv

6+阅读 · 2020年12月8日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

Evaluating Multimodal Representations on Visual Semantic Textual Similarity

Evaluating Multimodal Representations on Visual Semantic Textual Similarity

Arxiv

6+阅读 · 2020年4月4日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

Learning to Detect Opinion Snippet for Aspect-Based Sentiment Analysis

Arxiv

6+阅读 · 2019年9月25日

Deep Reinforcement Learning in Ice Hockey for Context-Aware Player Evaluation

Deep Reinforcement Learning in Ice Hockey for Context-Aware Player Evaluation

Arxiv

5+阅读 · 2018年7月11日

Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation

Arxiv

5+阅读 · 2018年4月16日

微信扫码咨询专知VIP会员