Chatbots are designed to carry out human-like conversations across different domains, such as general chit-chat, knowledge exchange, and persona-grounded conversations. To measure the quality of such conversational agents, a dialogue evaluator is expected to conduct assessments across domains as well. However, most state-of-the-art automatic dialogue evaluation metrics (ADMs) are not designed for multi-domain evaluation. We are motivated to design a general and robust framework, MDD-Eval, to address this problem. Specifically, we first train a teacher evaluator with human-annotated data to acquire a rating skill, i.e., to tell good dialogue responses from bad ones in a particular domain, and then adopt a self-training strategy to train a new evaluator with teacher-annotated multi-domain data, which helps the new evaluator generalize across multiple domains. MDD-Eval is extensively assessed on six dialogue evaluation benchmarks. Empirical results show that the MDD-Eval framework achieves strong performance, with an absolute improvement of 7% over state-of-the-art ADMs in terms of mean Spearman correlation scores across all the evaluation benchmarks.
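The following is a minimal, self-contained sketch of the teacher-student self-training idea summarized above; it is not the authors' implementation, and the evaluator class, length-based features, and toy dialogues are hypothetical placeholders used only to illustrate the three steps (train teacher, pseudo-label multi-domain data, train student).

```python
# Conceptual sketch (assumed, not the MDD-Eval code) of teacher-student
# self-training for dialogue evaluation. All models and data are toy stand-ins.

class ToyEvaluator:
    """Scores a (context, response) pair; a trivial response-length heuristic
    stands in for a learned rating model here."""
    def __init__(self):
        self.threshold = 0.0

    def fit(self, examples):
        # examples: list of (context, response, quality_label) triples.
        # "Training" just records the average length of good responses.
        good = [len(r.split()) for _, r, label in examples if label == 1]
        self.threshold = sum(good) / max(len(good), 1)
        return self

    def score(self, context, response):
        # Rate a response as good (1) or bad (0) relative to the learned length.
        return 1 if len(response.split()) >= self.threshold / 2 else 0


# 1. Teacher acquires a rating skill from human-annotated, single-domain data.
human_annotated = [
    ("how are you?", "i am doing great, thanks for asking.", 1),
    ("how are you?", "blue.", 0),
]
teacher = ToyEvaluator().fit(human_annotated)

# 2. Teacher pseudo-labels unlabeled dialogues drawn from multiple domains
#    (e.g., knowledge exchange, persona-grounded chit-chat).
multi_domain = [
    ("what is the capital of france?", "paris is the capital of france."),
    ("tell me about yourself.", "i enjoy hiking and reading."),
]
teacher_annotated = [(c, r, teacher.score(c, r)) for c, r in multi_domain]

# 3. The new (student) evaluator is trained on the teacher-annotated
#    multi-domain data so that it generalizes across domains.
student = ToyEvaluator().fit(teacher_annotated)
```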