Despite significant progress, state-of-the-art abstractive summarization methods are still prone to hallucinating content that is inconsistent with the source document. In this paper, we propose Constrained Abstractive Summarization (CAS), a general setup that preserves the factual consistency of abstractive summarization by specifying tokens as constraints that must be present in the summary. We adopt lexically constrained decoding, a technique generally applicable to autoregressive generative models, to fulfill CAS, and conduct experiments in two scenarios: (1) automatic summarization without human involvement, where keyphrases are extracted from the source document and used as constraints; and (2) human-guided interactive summarization, where human feedback in the form of manual constraints is used to guide summary generation. Automatic and human evaluations on two benchmark datasets demonstrate that CAS improves both the lexical overlap (ROUGE) and the factual consistency of abstractive summarization. In particular, we observe ROUGE-2 gains of up to 13.8 when only one manual constraint is used in interactive summarization.
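To make the idea of lexically constrained decoding concrete, the following is a minimal sketch and not the implementation used in this paper. It assumes a hypothetical toy bigram model (`BIGRAM`) in place of a neural summarizer, treats each constraint as a single token, and uses a simplified beam search that favors hypotheses covering more constraints and only accepts finished hypotheses that contain all of them. The function name `constrained_beam_search` and all probabilities are illustrative assumptions.

```python
import math

# Hypothetical toy bigram model (an assumption for illustration only):
# next-token probabilities conditioned on the previous token.
BIGRAM = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"court": 0.5, "judge": 0.3, "ruling": 0.2},
    "a": {"court": 0.5, "judge": 0.5},
    "court": {"ruled": 0.7, "</s>": 0.3},
    "judge": {"ruled": 0.6, "</s>": 0.4},
    "ruling": {"</s>": 1.0},
    "ruled": {"</s>": 1.0},
}


def constrained_beam_search(constraints, beam_size=3, max_len=6):
    """Return the most probable finished sequence that contains every
    constraint token, or None if no such sequence is found."""
    constraints = frozenset(constraints)
    # Each hypothesis is (log-probability, tokens so far, constraints met).
    beams = [(0.0, ["<s>"], frozenset())]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, tokens, met in beams:
            for tok, prob in BIGRAM.get(tokens[-1], {}).items():
                new_met = met | ({tok} & constraints)
                hyp = (score + math.log(prob), tokens + [tok], new_met)
                if tok == "</s>":
                    # Accept a finished hypothesis only if all constraints appear.
                    if new_met == constraints:
                        finished.append(hyp)
                else:
                    candidates.append(hyp)
        # Rank by number of constraints met first, then by log-probability,
        # so constraint-bearing hypotheses are not pruned by likelier ones.
        candidates.sort(key=lambda h: (len(h[2]), h[0]), reverse=True)
        beams = candidates[:beam_size]
        if not beams:
            break
    if not finished:
        return None
    best = max(finished, key=lambda h: h[0])
    return best[1][1:-1]  # drop the <s> and </s> markers


if __name__ == "__main__":
    print(constrained_beam_search([]))          # no constraints
    print(constrained_beam_search(["judge"]))   # one token constraint
```

In this toy setting, adding the constraint `"judge"` steers the search away from the unconstrained best sequence toward one that contains the required token, which mirrors how a keyphrase or a manual constraint is meant to guide the summary in CAS. Practical constrained decoders also handle multi-token phrases and allocate beam slots per number of satisfied constraints, which this sketch omits.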