Dialogue summarization aims to condense an original dialogue into a shorter version that covers its salient information, which is a crucial way to reduce dialogue data overload. Recently, promising achievements in both dialogue systems and natural language generation techniques have moved this task into a new landscape and attracted significant research attention. However, a comprehensive survey of this task is still lacking. To this end, we take the first step and present a thorough review of this research field. In detail, we systematically organize current work according to the characteristics of each domain, covering meetings, chats, email threads, customer service and medical dialogues. Additionally, we provide an overview of publicly available research datasets and organize two leaderboards under unified metrics. Furthermore, we discuss several future directions, including faithfulness, multi-modal, multi-domain and multi-lingual dialogue summarization, and give our thoughts on each. We hope that this first survey of dialogue summarization can provide the community with quick access to, and a general picture of, this task and motivate future research.