State-of-the-art abstractive summarization systems often generate \emph{hallucinations}, i.e., content that is not directly inferable from the source text. Although such content is typically assumed to be incorrect, we find that much hallucinated content is in fact factual, i.e., consistent with world knowledge. These factual hallucinations can be beneficial in a summary because they provide useful background information. In this work, we propose a novel detection approach that separates factual from non-factual hallucinations of entities. Our method uses an entity's prior and posterior probabilities according to pre-trained and fine-tuned masked language models, respectively. Empirical results suggest that our approach vastly outperforms two baselines and strongly correlates with human judgments. Furthermore, we show that our detector, when used as a reward signal in an off-line reinforcement learning (RL) algorithm, significantly improves the factuality of summaries while maintaining their level of abstractiveness.
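To make the key quantities concrete, one way to write them (our notation, a sketch rather than the paper's exact formulation) is as follows: let $e$ be an entity mention in a generated summary $S$, let $S_{\setminus e}$ be the summary with $e$ masked out, and let $X$ be the source document. The prior probability of $e$ conditions only on the masked summary under the pre-trained masked language model, while the posterior additionally conditions on the source under the fine-tuned model:
\begin{equation*}
p_{\text{prior}}(e) = p_{\theta_{\text{pre}}}\!\left(e \mid S_{\setminus e}\right),
\qquad
p_{\text{post}}(e) = p_{\theta_{\text{ft}}}\!\left(e \mid S_{\setminus e}, X\right),
\end{equation*}
where $\theta_{\text{pre}}$ and $\theta_{\text{ft}}$ denote the pre-trained and fine-tuned masked language models, respectively. Intuitively, a hallucinated entity with a high prior is predictable from world knowledge alone and is thus more likely to be factual, whereas a low posterior signals that the entity is not supported by the source; the detector combines both signals.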