Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers are a rich resource of answers to a wide range of community-based questions. Each question thread can receive a large number of answers with different perspectives. One goal of answer summarization is to produce a summary that reflects this range of answer perspectives. A major obstacle for abstractive answer summarization is the absence of a dataset that provides supervision for producing such summaries. Recent work proposes heuristics to create such data, but the resulting data is often noisy and does not cover all perspectives present in the answers. This work introduces a novel dataset of 4,631 CQA threads for answer summarization, curated by professional linguists. Our pipeline gathers annotations for all subtasks involved in answer summarization: selecting the answer sentences relevant to the question, grouping these sentences by perspective, summarizing each perspective, and producing an overall summary. We analyze and benchmark state-of-the-art models on these subtasks and introduce a novel unsupervised approach for multi-perspective data augmentation that further boosts overall summarization performance according to automatic evaluation. Finally, we propose reinforcement learning rewards to improve factual consistency and answer coverage, and we analyze areas for improvement.
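The four subtasks named in the abstract (sentence selection, perspective grouping, per-perspective summarization, overall summary) can be pictured as successive fields filled in on one thread record. The sketch below is only an illustration under loud assumptions: the names `AnnotatedThread`, `select_relevant`, `group_by_perspective`, and `run_pipeline` are hypothetical, and the word-overlap heuristics are toy stand-ins for the annotations that the dataset obtains from professional linguists. It is not the paper's method or any of its benchmarked models.

```python
from dataclasses import dataclass, field

# Toy stopword list and punctuation set; both are assumptions for illustration.
STOPWORDS = {"a", "an", "the", "i", "my", "to", "and", "of", "in", "is", "it",
             "how", "can", "do"}
PUNCT = ".,!?;:()\"'"


def tokens(text: str) -> set[str]:
    """Lowercased content words with surrounding punctuation stripped."""
    words = {w.strip(PUNCT).lower() for w in text.split()}
    return words - STOPWORDS - {""}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class AnnotatedThread:
    """One CQA thread plus the output of each of the four subtasks."""
    question: str
    answers: list[str]
    relevant: list[str] = field(default_factory=list)           # subtask 1
    clusters: list[list[str]] = field(default_factory=list)     # subtask 2
    cluster_summaries: list[str] = field(default_factory=list)  # subtask 3
    summary: str = ""                                            # subtask 4


def select_relevant(question: str, sentences: list[str]) -> list[str]:
    """Keep sentences sharing at least one content word with the question."""
    q = tokens(question)
    return [s for s in sentences if tokens(s) & q]


def group_by_perspective(sentences: list[str],
                         threshold: float = 0.3) -> list[list[str]]:
    """Greedy grouping: join the first cluster whose seed sentence is similar."""
    clusters: list[list[str]] = []
    for s in sentences:
        for cluster in clusters:
            if jaccard(tokens(s), tokens(cluster[0])) >= threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def run_pipeline(question: str, answers: list[str]) -> AnnotatedThread:
    thread = AnnotatedThread(question, answers)
    # Naive sentence split on periods; real data needs a proper splitter.
    sentences = [s.strip() for a in answers for s in a.split(".") if s.strip()]
    thread.relevant = select_relevant(question, sentences)
    thread.clusters = group_by_perspective(thread.relevant)
    # Placeholder "summaries": the seed sentence of each cluster. The dataset
    # instead contains human-written abstractive summaries at both levels.
    thread.cluster_summaries = [c[0] for c in thread.clusters]
    thread.summary = " ".join(s + "." for s in thread.cluster_summaries)
    return thread
```

Calling `run_pipeline` with a question and a list of answer strings returns a record whose fields mirror the annotation stages, which makes it easy to see where a learned model would replace each heuristic.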