Traditional training paradigms for extractive and abstractive summarization systems typically use only token-level or sentence-level training objectives. However, the output summary is evaluated at the summary level, which creates an inconsistency between training and evaluation. In this paper, we propose COLO, a Contrastive Learning based re-ranking framework for one-stage summarization. By modeling a contrastive objective, we show that the summarization model can directly generate summaries according to the summary-level score without additional modules or parameters. Extensive experiments demonstrate that COLO boosts the extractive and abstractive results of one-stage systems on the CNN/DailyMail benchmark to 44.58 and 46.33 ROUGE-1, respectively, while preserving parameter efficiency and inference efficiency. Compared with state-of-the-art multi-stage systems, we save more than 100 GPU training hours and obtain a 3~8x speed-up during inference while maintaining comparable results.
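As a minimal sketch of how such a summary-level contrastive objective is commonly formulated (not the paper's exact loss), the snippet below implements a pairwise margin ranking loss over candidate summaries that have been pre-sorted by a summary-level metric such as ROUGE; the function name, `margin` value, and tensor shapes are illustrative assumptions.

```python
import torch

def summary_level_contrastive_loss(scores: torch.Tensor, margin: float = 0.01) -> torch.Tensor:
    """Pairwise margin ranking loss over candidate summary scores.

    `scores`: 1-D tensor of model scores for candidate summaries,
    pre-sorted by descending summary-level quality (e.g., ROUGE).
    Each better-ranked candidate should outscore each worse one by a
    margin proportional to their rank gap. Illustrative sketch only.
    """
    loss = scores.new_zeros(())
    n = scores.size(0)
    for i in range(n - 1):
        for j in range(i + 1, n):
            # hinge: require scores[i] - scores[j] >= (j - i) * margin
            loss = loss + torch.relu(margin * (j - i) - (scores[i] - scores[j]))
    return loss

# Illustrative usage: four candidates, best-first by ROUGE; training
# pushes the model's own scores toward the same ordering.
candidate_scores = torch.tensor([0.8, 0.5, 0.6, 0.1], requires_grad=True)
loss = summary_level_contrastive_loss(candidate_scores)
loss.backward()
```

Because the ranking signal is applied to the model's own scores, no separate re-ranking module or extra parameters are needed at inference time, which is consistent with the efficiency claims above.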