We argue that disentangling content selection from the budget used to cover salient content improves the performance and applicability of abstractive summarizers. Our method, FactorSum, performs this disentanglement by factorizing summarization into two steps through an energy function: (1) generation of abstractive summary views; (2) combination of these views into a final summary, following a budget and content guidance. This guidance may come from different sources, including an advisor model such as BART or BigBird, or, in oracle mode, from the reference. This factorization achieves significantly higher ROUGE scores on multiple benchmarks for long document summarization, namely PubMed, arXiv, and GovReport. Most notably, our model is effective for domain adaptation. When trained only on PubMed samples, it achieves a 46.29 ROUGE-1 score on arXiv, indicating strong performance due to more flexible budget adaptation and content selection that is less dependent on domain-specific textual structure.
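To make the two-step factorization concrete, below is a minimal sketch of step (2), the view-combination step, assuming a greedy search that repeatedly adds the summary view whose inclusion most lowers an energy balancing content guidance against a length budget. The energy terms, the token-overlap content score, and all function names are illustrative assumptions, not FactorSum's actual implementation; step (1) would produce the views with an abstractive model such as BART run over sampled document chunks.

```python
# Hypothetical sketch of energy-guided view combination; the real method's
# energy function and search procedure may differ.
from typing import List


def content_score(summary: List[str], guidance: str) -> float:
    """Toy content term: fraction of guidance tokens covered by the summary.
    A stand-in for a learned or ROUGE-based content signal."""
    summary_tokens = set(" ".join(summary).lower().split())
    guidance_tokens = guidance.lower().split()
    if not guidance_tokens:
        return 0.0
    covered = sum(tok in summary_tokens for tok in guidance_tokens)
    return covered / len(guidance_tokens)


def budget_penalty(summary: List[str], budget: int) -> float:
    """Penalize deviation from the target summary length (in tokens)."""
    length = sum(len(view.split()) for view in summary)
    return abs(length - budget) / max(budget, 1)


def energy(summary: List[str], guidance: str, budget: int,
           lam: float = 1.0) -> float:
    """Lower energy = better summary under content and budget guidance."""
    return -content_score(summary, guidance) + lam * budget_penalty(summary, budget)


def combine_views(views: List[str], guidance: str, budget: int) -> List[str]:
    """Step (2): greedily add the view that most reduces the energy."""
    summary: List[str] = []
    remaining = list(views)
    current = energy(summary, guidance, budget)
    while remaining:
        best_view, best_energy = None, current
        for view in remaining:
            e = energy(summary + [view], guidance, budget)
            if e < best_energy:
                best_view, best_energy = view, e
        if best_view is None:  # no remaining view improves the energy; stop
            break
        summary.append(best_view)
        remaining.remove(best_view)
        current = best_energy
    return summary


if __name__ == "__main__":
    # Step (1) would generate these views abstractively from document chunks;
    # they are hard-coded here for brevity.
    views = [
        "The method factorizes summarization into view generation and combination.",
        "Budget guidance controls the final summary length.",
        "Content guidance may come from an advisor model or the reference.",
    ]
    guidance = "factorizes summarization budget content guidance"
    print(combine_views(views, guidance, budget=30))
```

Because the budget and guidance enter only through the energy, this decomposition lets the same generated views be recombined under a different budget or a different guidance source (advisor model or oracle reference) without retraining the generator.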