Recently, a variety of neural encoder-decoder models, pioneered by the Seq2Seq framework, have been proposed to generate more abstractive summaries by learning to map input text to output text. At a high level, such neural models can generate summaries freely, without being constrained to the words or phrases of the source. Moreover, their outputs are closer in form to human-written summaries and tend to be more readable and fluent. However, the abstraction ability of neural models is a double-edged sword: a commonly observed problem with generated summaries is the distortion or fabrication of factual information from the source article. This inconsistency between the original text and the summary has raised concerns about the applicability of abstractive summarization, and previous evaluation methods for text summarization are not well suited to detecting it. In response to these problems, current research is predominantly divided into two directions: one designs fact-aware evaluation metrics to select outputs free of factual inconsistency errors, and the other develops new summarization systems optimized for factual consistency. In this survey, we focus on presenting a comprehensive review of these fact-specific evaluation methods and text summarization models.
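To make the setting concrete, the following is a minimal sketch of how such an unconstrained abstractive summarizer is typically invoked. It is not taken from any of the surveyed systems; it assumes the Hugging Face transformers library and the publicly available facebook/bart-large-cnn encoder-decoder checkpoint. Because the decoder generates tokens freely, the returned summary can be fluent yet contain facts unsupported by the source, which is exactly the inconsistency problem motivating this survey.

```python
# Minimal sketch (assumption: Hugging Face transformers is installed and the
# facebook/bart-large-cnn checkpoint is available). It illustrates how a neural
# encoder-decoder model maps input text to a freely generated summary.
from transformers import pipeline

# Load a pretrained abstractive summarization model (Seq2Seq-style encoder-decoder).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "The city council approved a new transit budget on Tuesday, allocating "
    "funds for additional bus routes and station repairs over the next year."
)

# The decoder is not constrained to copy words or phrases from the source,
# so the output may be readable and fluent yet factually inconsistent with it.
result = summarizer(article, max_length=40, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```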