Questioning the Validity of Summarization Datasets and Improving Their Factual Consistency

The topic of summarization evaluation has recently attracted a surge of attention due to the rapid development of abstractive summarization systems. However, the formulation of the task is rather ambiguous: neither the linguistics nor the natural language processing community has succeeded in giving a mutually agreed-upon definition. Due to this lack of a well-defined formulation, a large number of popular abstractive summarization datasets are constructed in a manner that neither guarantees validity nor meets one of the most essential criteria of summarization: factual consistency. In this paper, we address this issue by combining state-of-the-art factual consistency models to identify the problematic instances present in popular summarization datasets. We release SummFC, a filtered summarization dataset with improved factual consistency, and demonstrate that models trained on this dataset achieve improved performance in nearly all quality aspects. We argue that our dataset should become a valid benchmark for developing and evaluating summarization systems.
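The abstract does not say how the factual consistency models are combined, so the following is only a minimal sketch of one plausible scheme: keep a document-summary pair when enough scorers judge it consistent. The names `token_support` and `filter_consistent`, the `threshold` and `min_votes` parameters, and the voting rule itself are assumptions made for illustration, not the authors' method; `token_support` is a toy stand-in for a real factual consistency model.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical scorer interface: (document, summary) -> score in [0, 1].
Scorer = Callable[[str, str], float]


def token_support(document: str, summary: str) -> float:
    """Toy stand-in scorer: fraction of summary tokens that occur in the document."""
    doc_tokens = set(document.lower().split())
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        return 0.0
    supported = sum(token in doc_tokens for token in summary_tokens)
    return supported / len(summary_tokens)


def filter_consistent(
    pairs: List[Dict[str, str]],
    scorers: List[Scorer],
    threshold: float = 0.5,
    min_votes: Optional[int] = None,
) -> List[Dict[str, str]]:
    """Keep pairs that at least `min_votes` scorers rate at or above `threshold`.

    By default every scorer must agree (an assumed, conservative rule).
    """
    if min_votes is None:
        min_votes = len(scorers)
    kept = []
    for pair in pairs:
        votes = sum(
            scorer(pair["document"], pair["summary"]) >= threshold
            for scorer in scorers
        )
        if votes >= min_votes:
            kept.append(pair)
    return kept


pairs = [
    {"document": "the cat sat on the mat", "summary": "the cat sat"},
    {"document": "the cat sat on the mat", "summary": "a dog barked loudly"},
]
filtered = filter_consistent(pairs, scorers=[token_support])
```

In practice each entry in `scorers` would wrap a trained consistency model, and the threshold and voting rule would need to be tuned against human judgments.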


