Extractive summarization suffers from irrelevance, redundancy, and incoherence. Existing work shows that abstractive rewriting of extractive summaries can improve conciseness and readability. These rewriting systems take the extracted summary as their only input, which keeps the input focused but can lose important background knowledge. In this paper, we investigate contextualized rewriting, which takes the entire original document as input. We formalize contextualized rewriting as a seq2seq problem with group alignments, introduce group tags to model these alignments, and identify extracted summaries through content-based addressing. Results show that our approach significantly outperforms non-contextualized rewriting systems without requiring reinforcement learning, achieving strong ROUGE improvements when applied on top of multiple extractive summarizers.
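To make the notion of group tags concrete, the following is a minimal sketch of one way such tags could be assigned. It is an illustration under our own assumptions, not the exact implementation: the function names `assign_group_tags` and `tag_summary`, the whitespace tokenization, and the convention that tag 0 marks non-extracted context are all hypothetical. Each extracted sentence receives a distinct tag k, and the k-th summary sentence carries the same tag, so a decoder could in principle locate its source sentence by matching tags.

```python
from typing import List, Tuple


def assign_group_tags(
    doc_sentences: List[str], extracted_indices: List[int]
) -> Tuple[List[str], List[int]]:
    """Tag every document token with the group of its sentence.

    Assumption: the k-th extracted sentence (1-based) gets tag k,
    and every token outside the extracted sentences gets tag 0.
    """
    tag_of_sentence = {idx: k for k, idx in enumerate(extracted_indices, start=1)}
    tokens: List[str] = []
    tags: List[int] = []
    for i, sentence in enumerate(doc_sentences):
        words = sentence.split()  # whitespace tokenization, for illustration only
        tokens.extend(words)
        tags.extend([tag_of_sentence.get(i, 0)] * len(words))
    return tokens, tags


def tag_summary(summary_sentences: List[str]) -> Tuple[List[str], List[int]]:
    """Tag every summary token with the group of the sentence it rewrites.

    Assumption: the k-th summary sentence rewrites the k-th extracted sentence.
    """
    tokens: List[str] = []
    tags: List[int] = []
    for k, sentence in enumerate(summary_sentences, start=1):
        words = sentence.split()
        tokens.extend(words)
        tags.extend([k] * len(words))
    return tokens, tags


if __name__ == "__main__":
    document = ["the cat sat", "it rained all day", "the dog barked"]
    # Hypothetical extractor output: sentence 2 first, then sentence 0.
    doc_tokens, doc_tags = assign_group_tags(document, [2, 0])
    print(list(zip(doc_tokens, doc_tags)))
    sum_tokens, sum_tags = tag_summary(["a dog barked", "a cat sat"])
    print(list(zip(sum_tokens, sum_tags)))
```

In this sketch the full document remains available as context, while the tags alone indicate which sentences are to be rewritten and in what order.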