Curriculum-Guided Abstractive Summarization

Recent Transformer-based summarization models have provided a promising approach to abstractive summarization. They go beyond sentence selection and extractive strategies to handle more complicated tasks such as novel word generation and sentence paraphrasing. Nonetheless, these models have two shortcomings: (1) they often perform poorly in content selection, and (2) their training strategy is inefficient, which restricts model performance. In this paper, we explore two orthogonal ways to compensate for these pitfalls. First, we augment the Transformer network with a sentence cross-attention module in the decoder, encouraging more abstraction of salient content. Second, we incorporate a curriculum learning approach that reweights the training samples, yielding a more efficient learning procedure. The second approach, which enhances the training strategy of Transformer networks, yields larger gains than the first. We apply our model to the extreme summarization dataset of Reddit TIFU posts. We further examine three cross-domain summarization datasets (Webis-TLDR-17, CNN/DM, and XSum) to measure the efficacy of curriculum learning when applied to summarization. Moreover, we conduct a human evaluation to assess the proposed method on three qualitative criteria: fluency, informativeness, and overall quality.
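The abstract does not specify how the curriculum reweights training samples, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each sample has a hypothetical difficulty score in [0, 1] and that hard samples are down-weighted early in training, with the weights becoming uniform by the end. The names `curriculum_weights`, `weighted_loss`, and `sharpness` are illustrative assumptions.

```python
import math


def curriculum_weights(difficulties, progress, sharpness=5.0):
    """Return per-sample loss weights that sum to len(difficulties).

    difficulties: hypothetical difficulty scores in [0, 1]; higher is harder.
    progress: fraction of training completed, in [0, 1].
    sharpness: assumed hyperparameter controlling how strongly hard
        samples are down-weighted at the start of training.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    # The penalty on hard samples fades linearly to zero as training proceeds.
    penalty = sharpness * (1.0 - progress)
    raw = [math.exp(-penalty * d) for d in difficulties]
    total = sum(raw)
    n = len(difficulties)
    # Normalize so the average weight is 1, keeping the loss scale stable.
    return [n * r / total for r in raw]


def weighted_loss(losses, weights):
    """Average the per-sample losses after applying the curriculum weights."""
    return sum(l * w for l, w in zip(losses, weights)) / len(losses)
```

In a training loop, `progress` would typically be the current step divided by the total number of steps, and `losses` would be the unreduced per-sample losses of the summarization model.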

