Despite recent advances in abstractive summarization, current summarization systems still suffer from content hallucinations, in which models generate text that is either irrelevant or contradictory to the source document. Prior work, however, has been predicated on the assumption that any generated fact not appearing explicitly in the source is an undesired hallucination. Methods have been proposed to address this by improving "faithfulness" to the source document, but in reality a large portion of the entities in the gold reference targets do not appear directly in the source. In this work, we show that these entities are not aberrations; rather, generating them requires external world knowledge to infer reasoning paths from entities in the source. We show that utilizing an external knowledge base can improve the faithfulness of summaries without simply making them more extractive, and that external knowledge bases linked from the source can benefit the factuality of generated summaries.