Conditioned dialogue generation suffers from the scarcity of labeled responses. In this work, we exploit labeled non-dialogue text data related to the condition, which is much easier to collect. We propose a multi-task learning approach that leverages both labeled dialogue and text data. Three tasks jointly optimize the same pre-trained Transformer: a conditioned dialogue generation task on the labeled dialogue data, and a conditioned language encoding task and a conditioned language generation task on the labeled text data. Experimental results show that by leveraging the labeled texts, our approach outperforms state-of-the-art models, and it achieves a larger performance improvement than previous methods for leveraging text data.
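The joint optimization described above can be sketched as a weighted sum of the three task losses applied to one shared model. This is a minimal illustrative sketch, assuming a simple weighted-sum objective; the function name, weights, and loss values are hypothetical and not taken from the paper.

```python
def multitask_loss(loss_dialogue, loss_encoding, loss_generation,
                   w_dialogue=1.0, w_encoding=1.0, w_generation=1.0):
    """Combine the three task losses into one objective for the shared
    pre-trained Transformer: conditioned dialogue generation (on labeled
    dialogue data), conditioned language encoding, and conditioned
    language generation (both on labeled text data).

    The equal default weights are an assumption for illustration only.
    """
    return (w_dialogue * loss_dialogue
            + w_encoding * loss_encoding
            + w_generation * loss_generation)


# Illustrative use with made-up per-task loss values:
total = multitask_loss(2.0, 1.5, 1.0)  # 2.0 + 1.5 + 1.0 = 4.5
```

In practice each per-task loss would be computed on a batch from its own data source, and a single backward pass on the combined loss updates the shared Transformer parameters.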