Text summarization models are approaching human levels of fidelity. Existing benchmarking corpora provide concordant pairs of full and abridged versions of Web, news, or professional content. To date, all summarization datasets operate under a one-size-fits-all paradigm that may not reflect the full range of organic summarization needs. Several recently proposed models (e.g., plug-and-play language models) have the capacity to condition the generated summaries on a desired range of themes. These capacities remain largely unused and unevaluated, as there is no dedicated dataset that supports the task of topic-focused summarization. This paper introduces NEWTS, the first topical summarization corpus, based on the well-known CNN/Dailymail dataset and annotated via online crowd-sourcing. Each source article is paired with two reference summaries, each focusing on a different theme of the source document. We evaluate a representative range of existing techniques and analyze the effectiveness of different prompting methods.