Building dialogue generation systems in a zero-shot scenario remains a major challenge, since typical zero-shot approaches to dialogue generation rely heavily on large-scale pre-trained language generation models such as GPT-3 and T5. Research on zero-shot dialogue generation without such cumbersome language models is limited by the lack of corresponding parallel dialogue corpora. In this paper, we propose a simple but effective Multilingual learning framework for Zero-shot Dialogue Generation (dubbed MulZDG) that can effectively transfer knowledge from an English corpus with large-scale training samples to a non-English corpus with zero samples. In addition, MulZDG can serve as a multilingual data augmentation method that improves performance on the resource-rich language. First, we construct multilingual code-switching dialogue datasets by translating utterances randomly selected from monolingual English datasets. Then we employ MulZDG to train a unified multilingual dialogue model on the code-switching datasets; in doing so, MulZDG performs implicit semantic alignment between different languages. Experiments on the DailyDialog and DSTC7 datasets demonstrate that MulZDG not only achieves competitive performance in the zero-shot case compared to training with sufficient examples, but also greatly improves performance on the source language.
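To make the dataset-construction step concrete, the following is a minimal sketch of how code-switching dialogues can be built by translating a random subset of utterances from a monolingual English dialogue. The `translate` stub, the `switch_ratio` parameter, and all function names are illustrative assumptions, not the paper's actual implementation; in practice the stub would be backed by an off-the-shelf machine translation system.

```python
import random

def translate(utterance: str, tgt_lang: str) -> str:
    """Placeholder for an off-the-shelf MT system (hypothetical helper);
    here it merely tags the utterance so the sketch runs end to end."""
    return f"[{tgt_lang}] {utterance}"

def build_code_switching_dialogue(dialogue, tgt_lang="zh", switch_ratio=0.5):
    """Translate a randomly selected subset of utterances in an English
    dialogue into the target language, yielding a mixed-language
    (code-switching) dialogue of the kind described above."""
    return [
        translate(u, tgt_lang) if random.random() < switch_ratio else u
        for u in dialogue
    ]

if __name__ == "__main__":
    dialogue = ["Hi, how are you?", "I'm fine, thanks.", "Any plans today?"]
    print(build_code_switching_dialogue(dialogue, tgt_lang="zh"))
```

Training the unified multilingual model on such mixed-language dialogues is what allows semantically equivalent utterances in different languages to be implicitly aligned in the shared representation space.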