In this paper, we aim to improve the quality of abstractive dialogue summarization while also enabling granularity control. Our model has two primary components: 1) A two-stage generation strategy that first generates a preliminary summary sketch, which then serves as the basis for the final summary. The sketch provides weak supervision in the form of pseudo-labeled interrogative pronoun categories and key phrases extracted with a constituency parser. 2) A simple strategy for controlling the granularity of the final summary: by predicting and highlighting different text spans in the source dialogue, the model can either determine automatically, or be told, how many summary sentences to generate for a given dialogue. Our model achieves state-of-the-art performance on SAMSum, the largest dialogue summarization corpus, with a ROUGE-L score as high as 50.79. In addition, we conduct a case study and show human evaluation results and controllability that are competitive with human-annotated summaries.
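To make the granularity-control idea concrete, the following is a minimal sketch of how highlighted spans could map to summary sentences: one model input is built per requested sentence, each with a different contiguous span of dialogue turns marked. It is an illustration under stated assumptions, not the paper's implementation. The marker tokens `<hl>` and `</hl>`, the function names, and the even split of turns are all hypothetical; the abstract says only that spans are predicted and highlighted, so the even split stands in for a learned span predictor.

```python
from typing import List

# Hypothetical marker tokens; the abstract does not specify the actual tokens.
HL_OPEN = "<hl>"
HL_CLOSE = "</hl>"


def split_into_spans(num_turns: int, num_sentences: int) -> List[range]:
    """Split turn indices into contiguous, near-equal spans.

    Assumption: an even split replaces the learned span prediction
    that the abstract describes.
    """
    if num_turns <= 0:
        raise ValueError("dialogue must contain at least one turn")
    # Clamp the request so every span holds at least one turn.
    k = max(1, min(num_sentences, num_turns))
    base, extra = divmod(num_turns, k)
    spans = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        spans.append(range(start, start + size))
        start += size
    return spans


def highlight_inputs(turns: List[str], num_sentences: int) -> List[str]:
    """Build one input string per summary sentence, each with one span marked."""
    inputs = []
    for span in split_into_spans(len(turns), num_sentences):
        marked = []
        for i, turn in enumerate(turns):
            if i == span.start:
                marked.append(HL_OPEN)
            marked.append(turn)
            if i == span.stop - 1:
                marked.append(HL_CLOSE)
        inputs.append(" ".join(marked))
    return inputs


if __name__ == "__main__":
    dialogue = ["A: hi", "B: hello", "A: lunch?", "B: sure", "A: noon"]
    for text in highlight_inputs(dialogue, 2):
        print(text)
```

Under these assumptions, requesting two sentences for a five-turn dialogue yields two inputs, the first highlighting turns 1 to 3 and the second highlighting turns 4 and 5; each input would then be passed to a summarizer that generates one sentence for the highlighted span.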