Abstractive summarization systems built on pre-trained language models have achieved superior results on benchmark datasets. However, such models have been shown to be prone to hallucinating facts that are unfaithful to the input context. In this paper, we propose a method to remedy entity-level extrinsic hallucinations with Entity Coverage Control (ECC). We first compute entity coverage precision and prepend the corresponding control code to each training example, which implicitly guides the model to recognize faithful content during training. We further extend our method via intermediate fine-tuning on large but noisy data extracted from Wikipedia to unlock zero-shot summarization. Experimental results on three benchmark datasets, XSum, PubMed, and SAMSum, which span very different domains and styles, show that the proposed method yields more faithful and salient abstractive summaries in both supervised fine-tuning and zero-shot settings.
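To make the ECC preprocessing concrete, the sketch below computes entity coverage precision (the fraction of named entities in a reference summary that also appear in the source) and prepends a corresponding control code to the source. It is a minimal illustration, assuming spaCy NER for entity extraction; the bucket thresholds and the control-code token strings (`<cov_high>`, `<cov_mid>`, `<cov_low>`) are hypothetical choices, not necessarily those used in the paper.

```python
# Minimal sketch of Entity Coverage Control (ECC) training-data preprocessing.
# Assumptions: spaCy's small English model for NER; illustrative bucket
# boundaries and control-code tokens (not the paper's exact configuration).
import spacy

nlp = spacy.load("en_core_web_sm")

def entity_coverage_precision(source: str, summary: str) -> float:
    """Fraction of named entities in the summary that also appear in the source."""
    summary_ents = {ent.text.lower() for ent in nlp(summary).ents}
    if not summary_ents:
        return 1.0  # no entities in the summary, so nothing can be hallucinated
    source_text = source.lower()
    covered = sum(1 for ent in summary_ents if ent in source_text)
    return covered / len(summary_ents)

def control_code(precision: float) -> str:
    """Quantize coverage precision into a discrete control code (hypothetical buckets)."""
    if precision >= 1.0:
        return "<cov_high>"
    elif precision >= 0.5:
        return "<cov_mid>"
    return "<cov_low>"

def preprocess_example(source: str, summary: str) -> str:
    """Prepend the coverage control code to the source for training."""
    code = control_code(entity_coverage_precision(source, summary))
    return f"{code} {source}"
```

Under this scheme, at inference time one would presumably condition the model on the highest-faithfulness code (e.g. `<cov_high>`) to steer generation toward summaries whose entities are grounded in the input.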