Recently, pre-training methods have shown remarkable success in task-oriented dialog (TOD) systems. However, most existing pre-trained models for TOD focus on either dialog understanding or dialog generation, but not both. In this paper, we propose SPACE-3, a novel unified semi-supervised pre-trained conversation model that learns from large-scale dialog corpora with limited annotations and can be effectively fine-tuned on a wide range of downstream dialog tasks. Specifically, SPACE-3 consists of four successive components in a single transformer that together maintain the task flow of a TOD system: (i) a dialog encoding module to encode dialog history, (ii) a dialog understanding module to extract semantic vectors from either user queries or system responses, (iii) a dialog policy module to generate a policy vector that contains high-level semantics of the response, and (iv) a dialog generation module to produce appropriate responses. We design a dedicated pre-training objective for each component. Concretely, we pre-train the dialog encoding module with span mask language modeling to learn contextualized dialog information. To capture the structured dialog semantics, we pre-train the dialog understanding module via a novel tree-induced semi-supervised contrastive learning objective with the help of extra dialog annotations. In addition, we pre-train the dialog policy module by minimizing the L2 distance between its output policy vector and the semantic vector of the response for policy optimization. Finally, the dialog generation module is pre-trained by language modeling. Results show that SPACE-3 achieves state-of-the-art performance on eight downstream dialog benchmarks, including intent prediction, dialog state tracking, and end-to-end dialog modeling. We also show that SPACE-3 has a stronger few-shot ability than existing models under the low-resource setting.
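As an illustration of the policy objective described above, the following is a minimal sketch that assumes the policy vector and the response's semantic vector are plain fixed-size float vectors and that the loss is the per-example Euclidean distance averaged over a batch. The function names `l2_distance` and `policy_loss` and the toy vectors are hypothetical, and the averaging scheme is an assumption; none of them are taken from the SPACE-3 implementation.

```python
import math
from typing import Sequence


def l2_distance(policy_vec: Sequence[float], response_vec: Sequence[float]) -> float:
    """Euclidean (L2) distance between two vectors of equal dimension."""
    if len(policy_vec) != len(response_vec):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(policy_vec, response_vec)))


def policy_loss(
    policy_batch: Sequence[Sequence[float]],
    response_batch: Sequence[Sequence[float]],
) -> float:
    """Mean L2 distance over a batch (batch averaging is an assumption)."""
    if not policy_batch or len(policy_batch) != len(response_batch):
        raise ValueError("batches must be non-empty and of equal size")
    total = sum(l2_distance(p, r) for p, r in zip(policy_batch, response_batch))
    return total / len(policy_batch)


# Toy example with hypothetical 2-dimensional vectors.
policy_batch = [[0.0, 0.0], [1.0, 1.0]]    # outputs of the policy module
response_batch = [[3.0, 4.0], [1.0, 1.0]]  # semantic vectors of the responses
loss = policy_loss(policy_batch, response_batch)
print(loss)  # per-example distances are 5.0 and 0.0, so the mean is 2.5
```

In this sketch the loss reaches zero only when every policy vector coincides with the semantic vector of its response, which is the behavior the objective is meant to encourage.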