Multi-Perspective Abstractive Answer Summarization

Community Question Answering (CQA) forums such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of questions. Each question thread can receive a large number of answers with different perspectives. The goal of multi-perspective answer summarization is to produce a summary that includes all perspectives of the answer. A major obstacle for multi-perspective, abstractive answer summarization is the absence of a dataset to provide supervision for producing such summaries. This work introduces a novel dataset creation method to automatically create multi-perspective, bullet-point abstractive summaries from an existing CQA forum. Supervision provided by this dataset trains models to inherently produce multi-perspective summaries. Additionally, to train models to output more diverse, faithful answer summaries while retaining multiple perspectives, we propose a multi-reward optimization technique coupled with a sentence-relevance prediction multi-task loss. Our methods demonstrate improved coverage of perspectives and faithfulness as measured by automatic and human evaluations compared to a strong baseline.
