A commonly observed problem with state-of-the-art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents. The fact that automatic summarization may produce plausible-sounding yet inaccurate summaries is a major concern that limits its wide application. In this paper, we present an approach to address factual consistency in summarization. We first propose an efficient automatic evaluation metric to measure factual consistency; next, we propose a novel learning algorithm that maximizes the proposed metric during model training. Through extensive experiments, we confirm that our method is effective in improving the factual consistency and even the overall quality of the summaries, as judged by both automatic metrics and human evaluation.
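The abstract does not spell out how a consistency metric can be maximized during training. As a rough, hypothetical illustration only (not the paper's actual algorithm), the sketch below shows one common way to optimize a non-differentiable reward with a self-critical policy-gradient loss; the function name, the tensor layout, and the use of PyTorch are all assumptions made for this example.

```python
import torch

def consistency_pg_loss(log_probs: torch.Tensor,
                        sampled_reward: torch.Tensor,
                        baseline_reward: torch.Tensor) -> torch.Tensor:
    """Hypothetical self-critical policy-gradient loss: reward a sampled
    summary by how much its factual-consistency score exceeds that of a
    greedy-decoded baseline summary.

    log_probs:       (batch, seq_len) token log-probabilities of the
                     sampled summary under the current model
    sampled_reward:  (batch,) consistency score of the sampled summary
    baseline_reward: (batch,) consistency score of the greedy baseline
    """
    advantage = sampled_reward - baseline_reward        # positive if the sample is more consistent
    seq_log_prob = log_probs.sum(dim=-1)                # log-likelihood of the whole sampled summary
    return -(advantage.detach() * seq_log_prob).mean()  # gradient ascent on expected consistency
```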