Recently, two approaches, fine-tuning large pre-trained language models and variational training, have separately attracted significant interest for semi-supervised end-to-end task-oriented dialog (TOD) systems. In this paper, we propose the Variational Latent-State GPT model (VLS-GPT), which is the first to combine the strengths of the two approaches. Among the many design options, we propose the generative model and the inference model for variational learning of the end-to-end TOD system, both as auto-regressive language models based on GPT-2, which can be further trained over a mix of labeled and unlabeled dialog data in a semi-supervised manner. Variational training of VLS-GPT is both statistically and computationally more challenging than previous variational learning works for sequential latent variable models, which use turn-level first-order Markovian models. The inference model in VLS-GPT is non-Markovian due to the use of the Transformer architecture. In this work, we establish Recursive Monte Carlo Approximation (RMCA) to the variational objective with a non-Markovian inference model and prove its unbiasedness. Further, we develop the computational strategy of sampling-then-forward-computation to realize RMCA, which successfully overcomes the memory explosion issue of using GPT in variational learning and speeds up training. Semi-supervised TOD experiments are conducted on two benchmark multi-domain datasets in different languages, MultiWOZ2.1 and CrossWOZ. VLS-GPT is shown to significantly outperform both supervised-only and semi-supervised self-training baselines.
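The sampling-then-forward-computation strategy mentioned above can be illustrated with a minimal sketch: latent states are first sampled autoregressively without building a gradient graph, and only afterwards is a single differentiable forward pass run over the completed sequence to score it. This is not the authors' code; the Hugging Face GPT-2 models, the helper name, and the simplified one-sample surrogate below are illustrative assumptions, not the full RMCA objective from the paper.

```python
# Sketch of sampling-then-forward-computation (assumed setup, not the paper's implementation).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
inference_model = GPT2LMHeadModel.from_pretrained("gpt2")   # plays the role of q(latent | dialog)
generative_model = GPT2LMHeadModel.from_pretrained("gpt2")  # plays the role of p(latent, response | context)

def sample_then_forward(context_ids, max_latent_len=20):
    # Stage 1: sample latent-state tokens autoregressively with NO gradient graph,
    # so memory does not grow with the number of decoding steps.
    with torch.no_grad():
        sampled = inference_model.generate(
            context_ids,
            do_sample=True,
            max_new_tokens=max_latent_len,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Stage 2: one differentiable forward pass over the full sequence;
    # gradients flow only through this single pass.
    labels = sampled.clone()
    labels[:, : context_ids.size(1)] = -100  # score only the sampled latent tokens
    q_out = inference_model(sampled, labels=labels)
    p_out = generative_model(sampled, labels=labels)

    # Simplified one-sample surrogate of an ELBO-style term, log p - log q
    # (HF losses are mean negative log-likelihoods over the scored tokens).
    return q_out.loss - p_out.loss
```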