Community Question Answering (CQA) forums such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of questions. Each question thread can receive a large number of answers with different perspectives. The goal of multi-perspective answer summarization is to produce a summary that includes all perspectives found in the answers. A major obstacle for multi-perspective, abstractive answer summarization is the absence of a dataset that provides supervision for producing such summaries. This work introduces a novel dataset creation method that automatically creates multi-perspective, bullet-point abstractive summaries from an existing CQA forum. Supervision provided by this dataset trains models to inherently produce multi-perspective summaries. Additionally, to train models to output more diverse, faithful answer summaries while retaining multiple perspectives, we propose a multi-reward optimization technique coupled with a sentence-relevance prediction multi-task loss. Compared to a strong baseline, our methods demonstrate improved coverage of perspectives and improved faithfulness, as measured by automatic and human evaluations.