With the rapid increase in the volume of dialogue data from daily life, there is a growing demand for dialogue summarization. Unfortunately, training a large summarization model is generally infeasible due to the scarcity of dialogue data with annotated summaries. Most existing works on low-resource dialogue summarization directly pretrain models in other domains, e.g., the news domain, but they generally neglect the huge difference between dialogues and conventional articles. To bridge the gap between out-of-domain pretraining and in-domain fine-tuning, in this work we propose a multi-source pretraining paradigm to better leverage external summary data. Specifically, we exploit large-scale in-domain non-summary data to separately pretrain the dialogue encoder and the summary decoder. The combined encoder-decoder model is then pretrained on out-of-domain summary data using adversarial critics, aiming to facilitate domain-agnostic summarization. Experimental results on two public datasets show that, with only limited training data, our approach achieves competitive performance and generalizes well across different dialogue scenarios.
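To make the pretraining paradigm above more concrete, the following PyTorch-style sketch illustrates the adversarial stage: a dialogue encoder and a summary decoder (assumed here to have been separately pretrained on in-domain non-summary data) are jointly trained on out-of-domain summary pairs while a domain critic pushes the encoder toward domain-agnostic representations. All module names (DialogueEncoder, SummaryDecoder, DomainCritic), network sizes, the 0.1 loss weight, and the random placeholder batches are hypothetical illustrations, not the authors' implementation.

```python
# Minimal sketch of adversarial, domain-agnostic summarization pretraining.
# Everything below is a simplified placeholder, not the paper's actual code.
import torch
import torch.nn as nn

VOCAB, HIDDEN = 1000, 128

class DialogueEncoder(nn.Module):   # assumed pretrained on in-domain dialogues
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
    def forward(self, tokens):
        out, _ = self.rnn(self.embed(tokens))
        return out                  # contextual token representations

class SummaryDecoder(nn.Module):    # assumed pretrained on in-domain summary-style text
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.proj = nn.Linear(HIDDEN, VOCAB)
    def forward(self, tokens, init_state):
        out, _ = self.rnn(self.embed(tokens), init_state)
        return self.proj(out)       # next-token logits

class DomainCritic(nn.Module):      # scores whether encoder features look in-domain
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(HIDDEN, HIDDEN), nn.ReLU(), nn.Linear(HIDDEN, 1))
    def forward(self, reps):
        return self.net(reps.mean(dim=1))   # one domain logit per sequence

encoder, decoder, critic = DialogueEncoder(), SummaryDecoder(), DomainCritic()
opt_model = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)
opt_critic = torch.optim.Adam(critic.parameters(), lr=1e-4)
ce, bce = nn.CrossEntropyLoss(), nn.BCEWithLogitsLoss()

# Placeholder batches: out-of-domain (news) article/summary pairs and
# in-domain dialogues without summaries.
news_article = torch.randint(0, VOCAB, (4, 30))
news_summary = torch.randint(0, VOCAB, (4, 10))
dialogue = torch.randint(0, VOCAB, (4, 30))

for step in range(3):
    # (a) Train the critic to separate out-of-domain (0) from in-domain (1) features.
    with torch.no_grad():
        news_rep, dial_rep = encoder(news_article), encoder(dialogue)
    critic_loss = bce(critic(news_rep), torch.zeros(4, 1)) + \
                  bce(critic(dial_rep), torch.ones(4, 1))
    opt_critic.zero_grad(); critic_loss.backward(); opt_critic.step()

    # (b) Train the summarizer on out-of-domain pairs, plus an adversarial term
    # that rewards encoder features the critic mistakes for in-domain ones.
    news_rep = encoder(news_article)
    init_state = news_rep[:, -1:].transpose(0, 1).contiguous()   # last state as decoder init
    logits = decoder(news_summary[:, :-1], init_state)           # teacher forcing
    summ_loss = ce(logits.reshape(-1, VOCAB), news_summary[:, 1:].reshape(-1))
    adv_loss = bce(critic(news_rep), torch.ones(4, 1))           # fool the critic
    opt_model.zero_grad()
    (summ_loss + 0.1 * adv_loss).backward()
    opt_model.step()
```

The alternating updates follow the usual adversarial training recipe: the critic learns to tell the source domain of the encoder features apart, while the encoder-decoder is optimized for summarization quality and, with a small weight, for making its features indistinguishable across domains.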