Despite significant progress in neural abstractive summarization, recent studies have shown that current models are prone to generating summaries that are unfaithful to the original context. To address this issue, we study contrast candidate generation and selection as a model-agnostic post-processing technique to correct extrinsic hallucinations (i.e., information not present in the source text) in unfaithful summaries. We learn a discriminative correction model by generating alternative candidate summaries in which named entities and quantities in the generated summary are replaced with ones of compatible semantic types from the source document. This model is then used to select the best candidate as the final output summary. Our experiments and analysis across a number of neural summarization systems show that our proposed method is effective in identifying and correcting extrinsic hallucinations. We analyze the typical hallucination phenomena produced by different types of neural summarization systems, in the hope of providing insights for future work in this direction.
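The candidate-generation step described above can be sketched as follows. This is a minimal, assumption-based illustration, not the authors' implementation: entities are assumed to be pre-extracted as (text, semantic type) pairs, e.g. by an off-the-shelf NER tagger, and the `generate_candidates` function name is hypothetical. The discriminative selection model that ranks these candidates is learned separately and is not shown here.

```python
def generate_candidates(summary, summary_entities, source_entities):
    """Generate contrast candidate summaries by replacing each entity in
    the summary with source-document entities of the same semantic type.

    summary_entities / source_entities: lists of (text, type) pairs,
    assumed to come from an external NER/quantity tagger (hypothetical).
    """
    candidates = []
    for ent_text, ent_type in summary_entities:
        for src_text, src_type in source_entities:
            # Only swap in entities whose semantic type is compatible
            # and whose surface form differs from the original.
            if src_type == ent_type and src_text != ent_text:
                candidates.append(summary.replace(ent_text, src_text))
    return candidates


# Toy example: one ORG entity and one MONEY quantity in the summary,
# each with one type-compatible alternative in the source document.
summary = "Apple acquired the startup for $3 billion."
summary_entities = [("Apple", "ORG"), ("$3 billion", "MONEY")]
source_entities = [("Google", "ORG"), ("$1 billion", "MONEY")]
candidates = generate_candidates(summary, summary_entities, source_entities)
```

Each candidate differs from the original summary in exactly one entity slot; a selection model trained to prefer faithful summaries then picks the best candidate (or keeps the original) as the final output.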