Knowledge-grounded dialogue systems are challenging to build due to the lack of training data and heterogeneous knowledge sources. Existing systems perform poorly on unseen topics because the training data covers only a limited range of topics. In addition, heterogeneous knowledge sources make it difficult for systems to generalize to other tasks, because knowledge expressed in different representations requires different knowledge encoders. To address these challenges, we present PLUG, a language model that homogenizes different knowledge sources into a unified knowledge representation for knowledge-grounded dialogue generation tasks. PLUG is pre-trained on a dialogue generation task conditioned on this unified essential knowledge representation, and it generalizes to downstream knowledge-grounded dialogue generation tasks with only a few training examples. Empirical evaluation on two benchmarks shows that our model generalizes well across different knowledge-grounded tasks: it achieves performance comparable to state-of-the-art methods in the fully supervised setting and significantly outperforms other methods in zero-shot and few-shot settings.
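To make the homogenization idea concrete, the sketch below shows one plausible way to flatten heterogeneous knowledge (a knowledge-graph triple, a table row, or a text passage) into a single textual string and condition a generic seq2seq model on it together with the dialogue context. The serializer functions, separator tokens, field ordering, and the BART backbone are all illustrative assumptions for this sketch, not PLUG's actual format or architecture.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Hypothetical serializers: flatten heterogeneous knowledge sources into
# one shared textual "essential knowledge" string. The separators and
# field order are assumptions made for illustration only.
def serialize_triple(subj, rel, obj):
    return f"{subj} {rel} {obj}"

def serialize_table_row(header, row):
    return " ; ".join(f"{h} : {v}" for h, v in zip(header, row))

def serialize_passage(text):
    return text

def build_input(knowledge_str, dialogue_history):
    # Condition generation on the unified knowledge plus the dialogue context.
    return "knowledge: " + knowledge_str + " context: " + " ".join(dialogue_history)

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Any of the three serializers yields the same unified string format,
# so a single encoder handles all knowledge sources.
knowledge = serialize_triple("Inception", "directed_by", "Christopher Nolan")
history = ["Who directed Inception?"]
inputs = tokenizer(build_input(knowledge, history),
                   return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Because every knowledge source is reduced to the same textual form before encoding, no source-specific knowledge encoder is needed, which is what allows a single pre-trained model to transfer across tasks with few examples.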