Pre-trained and fine-tuned news summarizers are expected to generalize to news articles unseen during the fine-tuning (training) phase. However, these articles often contain specifics, such as new events and people, that a summarizer could not have learned about in training. This applies to scenarios such as a news publisher training a summarizer on dated news and then summarizing incoming recent news. In this work, we explore the first application of transductive learning to summarization, where we further fine-tune models on test set inputs. Specifically, we construct pseudo summaries from salient article sentences and use randomly masked articles as inputs. Moreover, this approach is also beneficial in the fine-tuning phase, where we jointly predict extractive pseudo references and abstractive gold summaries on the training set. We show that our approach yields state-of-the-art results on the CNN/DM and NYT datasets, improving ROUGE-L by 1.05 and 0.74, respectively. Importantly, our approach does not require any changes to the original architecture. Moreover, we show the benefits of transduction from dated to more recent CNN news. Finally, through human and automatic evaluation, we demonstrate improvements in summary abstractiveness and coherence.
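Since the abstract only sketches the method, the snippet below is a minimal, non-authoritative sketch of one transductive fine-tuning step as described: an extractive pseudo summary is built from salient article sentences, the article is randomly masked, and the model is updated to predict the pseudo summary from the masked input. It assumes a Hugging Face seq2seq summarizer; the checkpoint name, the lead-k salience heuristic in `select_salient_sentences`, the 15% mask rate, and the learning rate are all illustrative assumptions, not details taken from the paper.

```python
import random
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "facebook/bart-large-cnn"  # assumed base summarizer, not necessarily the paper's checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)  # illustrative hyperparameter


def select_salient_sentences(sentences, k=3):
    # Hypothetical salience heuristic: take the lead-k sentences as the
    # extractive pseudo summary. The paper selects salient sentences; its
    # exact selection criterion may differ from this lead baseline.
    return " ".join(sentences[:k])


def mask_article(text, mask_rate=0.15):
    # Randomly replace a fraction of whitespace tokens with the mask token.
    return " ".join(
        tokenizer.mask_token if random.random() < mask_rate else tok
        for tok in text.split()
    )


def transductive_step(article_sentences):
    # One self-supervised update on a *test* article: predict the extractive
    # pseudo summary from the randomly masked article text.
    pseudo_summary = select_salient_sentences(article_sentences)
    masked = mask_article(" ".join(article_sentences))
    inputs = tokenizer(masked, return_tensors="pt",
                       truncation=True, max_length=1024)
    labels = tokenizer(pseudo_summary, return_tensors="pt",
                       truncation=True, max_length=128).input_ids
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In this reading, the step would be applied to the unlabeled test articles before producing final summaries with `model.generate`; no gold references or architectural changes are involved, consistent with the claim above.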