In this paper, we propose to formulate task-oriented dialogue as a purely natural language generation task, so as to fully leverage large-scale pre-trained models such as GPT-2 and to simplify the complicated delexicalization preprocessing. However, directly applying this formulation suffers heavily from dialogue entity inconsistency caused by the removal of delexicalized tokens, as well as from catastrophic forgetting of the pre-trained model during fine-tuning, leading to unsatisfactory performance. To alleviate these problems, we design a novel GPT-Adapter-CopyNet network, which incorporates lightweight adapter and CopyNet modules into GPT-2 to achieve better transfer learning and dialogue entity generation. Experimental results on the DSTC8 Track 1 benchmark and the MultiWOZ dataset demonstrate that our proposed approach significantly outperforms baseline models, with remarkable performance on both automatic and human evaluations.
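To make the two named components concrete, the following is a minimal sketch (not the authors' released code) of what a bottleneck adapter and a CopyNet-style output layer over GPT-2 hidden states typically look like; all module names, shapes, and hyperparameters here are illustrative assumptions.

```python
# Hypothetical sketch of the two components named in the abstract:
# (1) a lightweight bottleneck adapter inserted after a transformer sub-layer,
# (2) a CopyNet-style head mixing the LM's vocabulary distribution with a
#     copy distribution over dialogue-context tokens. Sizes are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual.
    Only these small matrices are trained; the GPT-2 backbone can stay frozen,
    which is what mitigates catastrophic forgetting."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the pre-trained representation.
        return h + self.up(F.relu(self.down(h)))

class CopyHead(nn.Module):
    """Mixes the generation distribution with a copy distribution over
    source (context) tokens, weighted by a learned gate p_gen, so entity
    tokens present in the dialogue context can be copied verbatim."""
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.lm_head = nn.Linear(hidden_size, vocab_size)
        self.gate = nn.Linear(hidden_size, 1)

    def forward(self, dec_h, src_h, src_ids):
        # dec_h: (B, T, H) decoder states; src_h: (B, S, H) context states;
        # src_ids: (B, S) token ids of the context (copy candidates).
        p_vocab = F.softmax(self.lm_head(dec_h), dim=-1)          # (B, T, V)
        attn = F.softmax(dec_h @ src_h.transpose(1, 2), dim=-1)   # (B, T, S)
        p_gen = torch.sigmoid(self.gate(dec_h))                   # (B, T, 1)
        # Scatter copy attention back onto the vocabulary axis.
        p_copy = torch.zeros_like(p_vocab).scatter_add_(
            -1, src_ids.unsqueeze(1).expand(-1, dec_h.size(1), -1), attn)
        return p_gen * p_vocab + (1.0 - p_gen) * p_copy

# Toy usage with random tensors (768 matches GPT-2 small's hidden size):
B, T, S, H, V = 2, 5, 7, 768, 50257
head = CopyHead(H, V)
probs = head(torch.randn(B, T, H), torch.randn(B, S, H),
             torch.randint(0, V, (B, S)))
assert torch.allclose(probs.sum(-1), torch.ones(B, T), atol=1e-4)
```

The gate p_gen lets the model interpolate per token between generating from the vocabulary and copying from the context, which is the standard CopyNet-style remedy for the entity-inconsistency issue the abstract describes.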