Neural Abstractive Unsupervised Summarization of Online News Discussions

Summarization has usually relied on gold-standard summaries to train extractive or abstractive models. Social media poses a hurdle for summarization techniques because a discussion thread is a multi-document, multi-author input. We address this challenging task by introducing a novel method that generates abstractive summaries of online news discussions. Our method extends a BERT-based architecture with an attention encoding that is fed the comments' likes during training. To train the model, we define a task that consists of reconstructing high-impact comments, where impact is measured by popularity (likes). Accordingly, the model learns to summarize online discussions based on their most relevant comments. The resulting summary represents the most relevant aspects of a news item that users comment on, incorporating social context as a source of information for summarizing texts in online social networks. We evaluate the model using ROUGE scores between the generated summary and each comment in the thread. Under this evaluation, our model, including the social attention encoding, significantly outperforms both extractive and abstractive summarization methods.
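The evaluation described above, scoring a generated summary against every comment in a thread, can be sketched roughly as follows. This is a minimal illustration and not the authors' evaluation code: it assumes a simplified ROUGE-1 F1 over lowercased whitespace tokens (no stemming or stopword handling), and the helper names `rouge1_f1`, `mean_rouge1`, and `top_comments_by_likes` are hypothetical. The last helper only illustrates one plausible way to pick high-impact comments by like count; the abstract does not specify the selection rule.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def mean_rouge1(summary: str, comments: list[str]) -> float:
    """Average the summary's ROUGE-1 F1 over every comment in the thread."""
    if not comments:
        return 0.0
    return sum(rouge1_f1(summary, c) for c in comments) / len(comments)


def top_comments_by_likes(thread: list[tuple[str, int]], k: int) -> list[str]:
    """Hypothetical selection rule: keep the k comments with the most likes."""
    ranked = sorted(thread, key=lambda item: item[1], reverse=True)
    return [text for text, _likes in ranked[:k]]


if __name__ == "__main__":
    # Toy thread of (comment text, like count) pairs; the data is invented.
    thread = [
        ("the new tax is unfair to small shops", 42),
        ("small shops will close because of the tax", 17),
        ("first", 0),
    ]
    summary = "the tax is unfair to small shops"
    print(top_comments_by_likes(thread, 2))
    print(mean_rouge1(summary, [text for text, _ in thread]))
```

Averaging over all comments rewards summaries that cover what many users wrote, which matches the abstract's framing of a thread as a multi-author input with no single gold-standard reference.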


