【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems） - 专知VIP

会员服务 ·

0

机器学习 · 模型评估 · 估计预测 · 深度学习 · Sarik Ghazarian ·

2019 年 11 月 15 日

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

专知会员服务

专知，提供专业可信的知识分发服务，让认知协作更快更好！

论文题目： Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

论文作者： Sarik Ghazarian, Ralph Weischedel, Aram Galstyan, Nanyun Peng

论文摘要： 用户参与度是评估开放域对话系统质量的关键指标。通过使用启发式构造的功能（例如转数和对话的总时间），先前的工作集中在对话级别的参与上。在本文中，我们调查了估计话语级别参与度的可能性和有效性，并定义了一种用于自动评估开放域对话系统的新指标，预测性参与度。我们的实验表明：（1）人类注释者在评估话语水平的参与分数方面具有很高的一致性；（2）对话级别的参与度得分可以根据适当汇总的话语级别的参与度得分进行预测。此外，我们表明可以从数据中学到话语水平的参与度分数。这些分数可以改善开放域对话系统的自动评估指标，如与人类判断的相关性所示。这表明预测性参与可以用作实时反馈，以训练更好的对话模型。

成为VIP会员查看完整内容

14

相关内容

机器学习

“机器学习是近20多年兴起的一门多领域交叉学科，涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科。机器学习理论主要是设计和分析一些让可以自动“ 学习”的算法。机器学习算法是一类从数据中自动分析获得规律，并利用规律对未知数据进行预测的算法。因为学习算法中涉及了大量的统计学理论，机器学习与统计推断学联系尤为密切，也被称为统计学习理论。算法设计方面，机器学习理论关注可以实现的，行之有效的学习算法。很多推论问题属于无程序可循难度，所以部分的机器学习研究是开发容易处理的近似算法。” ——中文维基百科

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

专知会员服务

65+阅读 · 2020年5月12日

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

专知会员服务

94+阅读 · 2020年4月13日

【ACL2020-浙大-微软】多轮对话推理数据集，MuTual: A Dataset for Multi-Turn Dialogue Reasoning

【ACL2020-浙大-微软】多轮对话推理数据集，MuTual: A Dataset for Multi-Turn Dialogue Reasoning

专知会员服务

37+阅读 · 2020年4月10日

【清华大学】面向任务的对话系统的最新进展和挑战，Task-oriented Dialog System

【清华大学】面向任务的对话系统的最新进展和挑战，Task-oriented Dialog System

专知会员服务

84+阅读 · 2020年3月24日

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

专知会员服务

33+阅读 · 2020年3月23日

【微软雷德蒙研究院】小样本自然语言生成，Few-shot Natural Language Generation for Task-Oriented Dialog

【微软雷德蒙研究院】小样本自然语言生成，Few-shot Natural Language Generation for Task-Oriented Dialog

专知会员服务

33+阅读 · 2020年2月29日

【ACM综述】工业4.0人机交互综述论文，45页pdf，A Survey on Human Machine Interaction in Industry 4.0

【ACM综述】工业4.0人机交互综述论文，45页pdf，A Survey on Human Machine Interaction in Industry 4.0

专知会员服务

59+阅读 · 2020年2月6日

【浙江大学-AAAI2020】领域自适应的对抗损失，Adversarial-Learned Loss for Domain Adaptation

【浙江大学-AAAI2020】领域自适应的对抗损失，Adversarial-Learned Loss for Domain Adaptation

专知会员服务

62+阅读 · 2020年1月11日

【清华大学-微软研究院】构建智能开放域对话系统的挑战综述论文，31页pdf，Challenges in Building Intelligent Open-domain Dialog Systems

【清华大学-微软研究院】构建智能开放域对话系统的挑战综述论文，31页pdf，Challenges in Building Intelligent Open-domain Dialog Systems

专知会员服务

29+阅读 · 2019年11月2日

[综述]基于深度学习的开放领域对话系统研究综述

[综述]基于深度学习的开放领域对话系统研究综述

专知会员服务

80+阅读 · 2019年10月12日

最新《分布式机器学习》论文综述最新DML进展，33页pdf

最新《分布式机器学习》论文综述最新DML进展，33页pdf

专知

52+阅读 · 2019年12月26日

AI更懂人话：谷歌发布全新对话数据集，模仿智能助理

AI更懂人话：谷歌发布全新对话数据集，模仿智能助理

新智元

5+阅读 · 2019年9月7日

已删除

AI掘金志

7+阅读 · 2019年7月8日

动态 | 微软刷新CoQA对话问答挑战赛纪录，模型性能达到人类同等水平

动态 | 微软刷新CoQA对话问答挑战赛纪录，模型性能达到人类同等水平

AI研习社

4+阅读 · 2019年5月8日

自然语言处理常识推理综述论文，60页pdf

自然语言处理常识推理综述论文，60页pdf

专知

73+阅读 · 2019年4月4日

综述 | 事件抽取及推理 (下)

综述 | 事件抽取及推理 (下)

开放知识图谱

38+阅读 · 2019年1月14日

【论文笔记】对话模型新方法，条件DialogWAE生成多模态回答

【论文笔记】对话模型新方法，条件DialogWAE生成多模态回答

专知

15+阅读 · 2018年6月11日

学界 | 谷歌大脑提出自动数据增强方法AutoAugment：可迁移至不同数据集

学界 | 谷歌大脑提出自动数据增强方法AutoAugment：可迁移至不同数据集

机器之心

3+阅读 · 2018年6月2日

一文读懂智能对话系统

一文读懂智能对话系统

数据派THU

16+阅读 · 2018年1月27日

论文笔记 | How NOT To Evaluate Your Dialogue System

论文笔记 | How NOT To Evaluate Your Dialogue System

科技创新与创业

13+阅读 · 2017年12月23日

Few-shot Natural Language Generation for Task-Oriented Dialog

Few-shot Natural Language Generation for Task-Oriented Dialog

Arxiv

30+阅读 · 2020年2月27日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

Open Domain Event Extraction Using Neural Latent Variable Models

Open Domain Event Extraction Using Neural Latent Variable Models

Arxiv

4+阅读 · 2019年6月17日

Automatic Summarization of Natural Language

Arxiv

3+阅读 · 2018年12月18日

Learning Personalized End-to-End Goal-Oriented Dialog

Arxiv

4+阅读 · 2018年11月12日

Dialogue Natural Language Inference

Arxiv

7+阅读 · 2018年11月1日

Evaluating and Understanding the Robustness of Adversarial Logit Pairing

Arxiv

8+阅读 · 2018年7月26日

Localization Recall Precision (LRP): A New Performance Metric for Object Detection

Localization Recall Precision (LRP): A New Performance Metric for Object Detection

Arxiv

4+阅读 · 2018年7月4日

A Framework for Evaluating 6-DOF Object Trackers

Arxiv

6+阅读 · 2018年3月28日

Progressive Growing of GANs for Improved Quality, Stability, and Variation

Arxiv

3+阅读 · 2017年11月3日

VIP会员

相关主题

Sarik Ghazarian

相关VIP内容

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

专知会员服务

65+阅读 · 2020年5月12日

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

专知会员服务

94+阅读 · 2020年4月13日

【ACL2020-浙大-微软】多轮对话推理数据集，MuTual: A Dataset for Multi-Turn Dialogue Reasoning

【ACL2020-浙大-微软】多轮对话推理数据集，MuTual: A Dataset for Multi-Turn Dialogue Reasoning

专知会员服务

37+阅读 · 2020年4月10日

【清华大学】面向任务的对话系统的最新进展和挑战，Task-oriented Dialog System

【清华大学】面向任务的对话系统的最新进展和挑战，Task-oriented Dialog System

专知会员服务

84+阅读 · 2020年3月24日

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

专知会员服务

33+阅读 · 2020年3月23日

【微软雷德蒙研究院】小样本自然语言生成，Few-shot Natural Language Generation for Task-Oriented Dialog

【微软雷德蒙研究院】小样本自然语言生成，Few-shot Natural Language Generation for Task-Oriented Dialog

专知会员服务

33+阅读 · 2020年2月29日

【ACM综述】工业4.0人机交互综述论文，45页pdf，A Survey on Human Machine Interaction in Industry 4.0

【ACM综述】工业4.0人机交互综述论文，45页pdf，A Survey on Human Machine Interaction in Industry 4.0

专知会员服务

59+阅读 · 2020年2月6日

【浙江大学-AAAI2020】领域自适应的对抗损失，Adversarial-Learned Loss for Domain Adaptation

【浙江大学-AAAI2020】领域自适应的对抗损失，Adversarial-Learned Loss for Domain Adaptation

专知会员服务

62+阅读 · 2020年1月11日

【清华大学-微软研究院】构建智能开放域对话系统的挑战综述论文，31页pdf，Challenges in Building Intelligent Open-domain Dialog Systems

【清华大学-微软研究院】构建智能开放域对话系统的挑战综述论文，31页pdf，Challenges in Building Intelligent Open-domain Dialog Systems

专知会员服务

29+阅读 · 2019年11月2日

[综述]基于深度学习的开放领域对话系统研究综述

[综述]基于深度学习的开放领域对话系统研究综述

专知会员服务

80+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

GPT-5如何对齐？从硬性拒绝到安全完成：走向以输出为中心的安全训练

【伯克利博士论文】超越人类监督的视觉智能

【ICCV2025】SO(3) 上连续非保守动力系统的预测

2025年中国数据要素行业发展研究报告

相关资讯

最新《分布式机器学习》论文综述最新DML进展，33页pdf

最新《分布式机器学习》论文综述最新DML进展，33页pdf

专知

52+阅读 · 2019年12月26日

AI更懂人话：谷歌发布全新对话数据集，模仿智能助理

AI更懂人话：谷歌发布全新对话数据集，模仿智能助理

新智元

5+阅读 · 2019年9月7日

已删除

AI掘金志

7+阅读 · 2019年7月8日

动态 | 微软刷新CoQA对话问答挑战赛纪录，模型性能达到人类同等水平

动态 | 微软刷新CoQA对话问答挑战赛纪录，模型性能达到人类同等水平

AI研习社

4+阅读 · 2019年5月8日

自然语言处理常识推理综述论文，60页pdf

自然语言处理常识推理综述论文，60页pdf

专知

73+阅读 · 2019年4月4日

综述 | 事件抽取及推理 (下)

综述 | 事件抽取及推理 (下)

开放知识图谱

38+阅读 · 2019年1月14日

【论文笔记】对话模型新方法，条件DialogWAE生成多模态回答

【论文笔记】对话模型新方法，条件DialogWAE生成多模态回答

专知

15+阅读 · 2018年6月11日

学界 | 谷歌大脑提出自动数据增强方法AutoAugment：可迁移至不同数据集

学界 | 谷歌大脑提出自动数据增强方法AutoAugment：可迁移至不同数据集

机器之心

3+阅读 · 2018年6月2日

一文读懂智能对话系统

一文读懂智能对话系统

数据派THU

16+阅读 · 2018年1月27日

论文笔记 | How NOT To Evaluate Your Dialogue System

论文笔记 | How NOT To Evaluate Your Dialogue System

科技创新与创业

13+阅读 · 2017年12月23日

相关论文

Few-shot Natural Language Generation for Task-Oriented Dialog

Few-shot Natural Language Generation for Task-Oriented Dialog

Arxiv

30+阅读 · 2020年2月27日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

Open Domain Event Extraction Using Neural Latent Variable Models

Open Domain Event Extraction Using Neural Latent Variable Models

Arxiv

4+阅读 · 2019年6月17日

Automatic Summarization of Natural Language

Arxiv

3+阅读 · 2018年12月18日

Learning Personalized End-to-End Goal-Oriented Dialog

Arxiv

4+阅读 · 2018年11月12日

Dialogue Natural Language Inference

Arxiv

7+阅读 · 2018年11月1日

Evaluating and Understanding the Robustness of Adversarial Logit Pairing

Arxiv

8+阅读 · 2018年7月26日

Localization Recall Precision (LRP): A New Performance Metric for Object Detection

Localization Recall Precision (LRP): A New Performance Metric for Object Detection

Arxiv

4+阅读 · 2018年7月4日

A Framework for Evaluating 6-DOF Object Trackers

Arxiv

6+阅读 · 2018年3月28日

Progressive Growing of GANs for Improved Quality, Stability, and Variation

Arxiv

3+阅读 · 2017年11月3日

微信扫码咨询专知VIP会员