以集群为基础的自动生成文本评价 (Cluster-based Evaluation of Automatically Generated Text) - 专知论文

会员服务 ·

0

簇 · 估计/估计量 · 相关系数 · 可约的 · 语言模型化 ·

2022 年 6 月 1 日

Cluster-based Evaluation of Automatically Generated Text

翻译：以集群为基础的自动生成文本评价

Tiago Pimentel,Clara Meister,Ryan Cotterell

from arxiv, Tiago Pimentel and Clara Meister contributed equally to this work

While probabilistic language generators have improved dramatically over the last few years, the automatic evaluation metrics used to assess them have not kept pace with this progress. In the domain of language generation, a good metric must correlate highly with human judgements. Yet, with few exceptions, there is a lack of such metrics in the literature. In this work, we analyse the general paradigm of language generator evaluation. We first discuss the computational and qualitative issues with using automatic evaluation metrics that operate on probability distributions over strings, the backbone of most language generators. We then propose the use of distributions over clusters instead, where we cluster strings based on their text embeddings (obtained from a pretrained language model). While we find the biases introduced by this substitution to be quite strong, we observe that, empirically, this methodology leads to metric estimators with higher correlation with human judgements, while simultaneously reducing estimator variance. We finish the paper with a probing analysis, which leads us to conclude that -- by encoding syntactic- and coherence-level features of text, while ignoring surface-level features -- these clusters may simply be better equipped to evaluate state-of-the-art language models.

翻译：虽然过去几年来概率语言生成器有了显著改善,但用于评估这些生成器的自动评价指标却跟不上这一进展。在语言生成领域,一个良好的衡量标准必须与人类判断高度相关。然而,除了少数例外,文献中缺乏这样的衡量标准。在这项工作中,我们分析语言生成器评价的一般模式。我们首先通过使用自动评价指标来讨论计算和质量问题,该评价指标以大多数语言生成器的骨干 -- -- 字符串之间的概率分布方式运作。我们然后提议使用在组群之间的分布,而我们则在组群的文本嵌嵌入(由预先培训的语言模型组成)的基础上将字符分组。我们虽然发现这种替换带来的偏差相当强烈,但从经验上看,我们发现这一方法导致与人类判断器的关联性更高,同时缩小了估计器的差异。我们先用一个预测性分析来完成文件,从而得出这样的结论:通过编码综合和一致性水平的文本特征,同时忽略地表水平的特征,这些组群群群群群可能更有能力评估状态语言模型。

0

相关内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Plücker直线摄影测量的理论与方法

国家自然科学基金

0+阅读 · 2014年12月31日

金属氢化物电极真空弧离子源离子成分和电荷态分布研究

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

编码密码学中若干组合对象研究

国家自然科学基金

0+阅读 · 2009年12月31日

UGT基因簇进化及调控研究

国家自然科学基金

0+阅读 · 2009年12月31日

组蛋白乙酰化/去乙酰化对Myocardin诱导的心肌肥厚影响及机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

铁电配合物的合成，结构与性质研究

国家自然科学基金

0+阅读 · 2008年12月31日

Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning

Arxiv

0+阅读 · 2022年7月18日

A 2-step estimation procedure for semiparametric mixture cure models

Arxiv

0+阅读 · 2022年7月17日

An Empirical Study of Automated Unit Test Generation for Python

Arxiv

0+阅读 · 2022年7月17日

On the Usefulness of Deep Ensemble Diversity for Out-of-Distribution Detection

Arxiv

0+阅读 · 2022年7月15日

A Theoretically Grounded Benchmark for Evaluating Machine Commonsense

Arxiv

0+阅读 · 2022年7月14日

Causal Graphs Underlying Generative Models: Path to Learning with Limited Data

Arxiv

0+阅读 · 2022年7月14日

A Comprehensive Survey on Community Detection with Deep Learning

Arxiv

14+阅读 · 2021年5月26日

Meta-Learning to Cluster

Meta-Learning to Cluster

Arxiv

18+阅读 · 2019年10月30日

Transferring Common-Sense Knowledge for Object Detection

Arxiv

12+阅读 · 2018年4月3日

Learning beyond datasets: Knowledge Graph Augmented Neural Networks for Natural language Processing

Arxiv

11+阅读 · 2018年2月16日

VIP会员

文章信息

相关主题

估计/估计量

语言模型化

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《俄乌战争中的无人系统：新的战争方式与新兴趋势——来自前线的印象》报告

《海上自主水面船舶远程操作中心：安全可持续运行的多维度分析》

多模态大语言模型下游调优中“保持自我”的重要性

隐身自主无人水下航行器技术如何变革水下作战并重塑海军竞争

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning

Arxiv

0+阅读 · 2022年7月18日

A 2-step estimation procedure for semiparametric mixture cure models

Arxiv

0+阅读 · 2022年7月17日

An Empirical Study of Automated Unit Test Generation for Python

Arxiv

0+阅读 · 2022年7月17日

On the Usefulness of Deep Ensemble Diversity for Out-of-Distribution Detection

Arxiv

0+阅读 · 2022年7月15日

A Theoretically Grounded Benchmark for Evaluating Machine Commonsense

Arxiv

0+阅读 · 2022年7月14日

Causal Graphs Underlying Generative Models: Path to Learning with Limited Data

Arxiv

0+阅读 · 2022年7月14日

A Comprehensive Survey on Community Detection with Deep Learning

Arxiv

14+阅读 · 2021年5月26日

Meta-Learning to Cluster

Meta-Learning to Cluster

Arxiv

18+阅读 · 2019年10月30日

Transferring Common-Sense Knowledge for Object Detection

Arxiv

12+阅读 · 2018年4月3日

Learning beyond datasets: Knowledge Graph Augmented Neural Networks for Natural language Processing

Arxiv

11+阅读 · 2018年2月16日

相关基金

Plücker直线摄影测量的理论与方法

国家自然科学基金

0+阅读 · 2014年12月31日

金属氢化物电极真空弧离子源离子成分和电荷态分布研究

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

编码密码学中若干组合对象研究

国家自然科学基金

0+阅读 · 2009年12月31日

UGT基因簇进化及调控研究

国家自然科学基金

0+阅读 · 2009年12月31日

组蛋白乙酰化/去乙酰化对Myocardin诱导的心肌肥厚影响及机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

铁电配合物的合成，结构与性质研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员