Pre-trained language models (PLMs) have marked a major advance in neural dialogue modeling. PLMs are pre-trained on large-scale text corpora, but they are usually fine-tuned on scarce dialogue data that carries domain-specific knowledge and dialogue styles. Tailoring a language model to such data while still making full use of the prior knowledge in the large pre-trained model remains a challenge. In this paper, we present a novel approach to pre-trained dialogue modeling that casts dialogue generation as a prompt-learning task. Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts, which elicit the relevant knowledge from the large pre-trained model. To encourage the model to make better use of the prompt embeddings, the prompt encoders are conditioned on the input dialogue context. Experiments on popular conversation datasets show that our approach significantly outperforms the fine-tuning baseline and generic prompt-learning methods. Furthermore, human evaluations strongly support the superiority of DialogPrompt in response generation quality.
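As a rough illustration of what "prompt embeddings conditioned on the dialogue context" could look like, the following is a minimal pure-Python sketch. It is not the architecture of DialogPrompt, whose details the abstract does not give. The names `ContextPromptEncoder`, `mean_pool`, and `build_plm_input`, the mean-pooling step, and the linear projection are all assumptions made for illustration. Only the prompt encoder's parameters would be trained; the token embeddings stand in for the output of a frozen pre-trained model's embedding layer.

```python
import random


def mean_pool(vectors):
    """Average a list of equal-length vectors into a single context vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


class ContextPromptEncoder:
    """Hypothetical context-conditioned prompt encoder (illustrative only)."""

    def __init__(self, num_prompts, dim, seed=0):
        rng = random.Random(seed)
        # Trainable parameters: one base embedding per prompt position ...
        self.base = [[rng.gauss(0, 0.02) for _ in range(dim)]
                     for _ in range(num_prompts)]
        # ... and one dim x dim projection per prompt position (an assumption).
        self.proj = [[[rng.gauss(0, 0.02) for _ in range(dim)]
                      for _ in range(dim)]
                     for _ in range(num_prompts)]

    def __call__(self, context_embeddings):
        # Summarize the dialogue context, then shift each base prompt by a
        # projection of that summary so the prompts depend on the context.
        ctx = mean_pool(context_embeddings)
        prompts = []
        for base, weights in zip(self.base, self.proj):
            shift = [sum(w * c for w, c in zip(row, ctx)) for row in weights]
            prompts.append([b + s for b, s in zip(base, shift)])
        return prompts


def build_plm_input(prompt_encoder, context_embeddings):
    """Prepend context-dependent prompts to the frozen token embeddings."""
    return prompt_encoder(context_embeddings) + context_embeddings
```

A generic prompt-learning baseline would correspond to returning `self.base` unchanged for every input; the context-dependent shift is what makes the prompts differ from one dialogue to the next in this sketch.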