Topic-Guided Abstractive Multi-Document Summarization

A central challenge in multi-document summarization (MDS) is learning the relations among the input documents. In this paper, we propose a novel abstractive MDS model that represents multiple documents as a heterogeneous graph, taking semantic nodes of different granularities into account, and then applies a graph-to-sequence framework to generate summaries. Moreover, we employ a neural topic model to jointly discover latent topics, which act as cross-document semantic units that bridge different documents and provide global information to guide summary generation. Since topic extraction can be viewed as a special type of summarization that "summarizes" texts into a more abstract format, i.e., a topic distribution, we adopt a multi-task learning strategy to jointly train the topic and summarization modules so that each improves the other. Experimental results on the Multi-News dataset demonstrate that our model outperforms previous state-of-the-art MDS models on both ROUGE metrics and human evaluation, while also learning high-quality topics.
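The multi-task strategy described above can be illustrated with a minimal sketch of a joint objective. The abstract does not give the actual loss, so everything here is an assumption: the topic module is taken to be a VAE-style neural topic model (reconstruction term plus a KL term against a standard normal prior), the summarization loss is a mean token-level negative log-likelihood, and the function names and the `topic_weight` parameter are hypothetical.

```python
import math


def summary_nll(token_probs):
    """Mean negative log-likelihood of the reference summary tokens.

    token_probs: probabilities the decoder assigned to each gold token.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def topic_model_loss(bow_counts, recon_probs, mu, log_var):
    """Assumed VAE-style topic loss: reconstruction plus KL to N(0, I).

    bow_counts:  bag-of-words counts of the input text.
    recon_probs: word probabilities reconstructed from the topic vector.
    mu, log_var: parameters of the Gaussian posterior over the latent vector.
    """
    recon = -sum(c * math.log(p) for c, p in zip(bow_counts, recon_probs))
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))
    return recon + kl


def joint_loss(sum_loss, topic_loss, topic_weight=0.5):
    """Weighted sum of the two task losses (topic_weight is hypothetical)."""
    return sum_loss + topic_weight * topic_loss


# Toy usage with made-up numbers, only to show how the pieces combine.
s = summary_nll([0.5, 0.5])
t = topic_model_loss([1, 1], [0.5, 0.5], mu=[0.0, 0.0], log_var=[0.0, 0.0])
total = joint_loss(s, t, topic_weight=0.5)
```

Minimizing one combined scalar lets gradients from both tasks update any shared parameters, which is one common way to let two modules inform each other; the paper's actual weighting and sharing scheme may differ.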

