Towards building intelligent dialogue agents, there has been a growing interest in introducing explicit personas into generation models. However, with limited persona-based dialogue data at hand, it may be difficult to train a dialogue generation model well. We point out that the data challenges of this generation task lie in two aspects: first, it is expensive to scale up current persona-based dialogue datasets; second, each data sample in this task is more complex to learn from than conventional dialogue data. To alleviate these data issues, we propose a data manipulation method, which is model-agnostic and can be combined with any persona-based dialogue generation model to improve its performance. The original training samples are first distilled so that they are easier to fit. Next, we show several effective ways to diversify such easier distilled data. A given base model is then trained via the constructed data curriculum, i.e., first on the augmented distilled samples and then on the original ones. Experiments demonstrate the superiority of our method with two strong base dialogue models (Transformer encoder-decoder and GPT2).
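The two-stage data curriculum described above can be sketched as a simple training schedule: train first on the easier augmented distilled samples, then on the original samples. Below is a minimal, hedged illustration in PyTorch; the function names (`train_stage`, `train_with_curriculum`) and the toy linear model standing in for a dialogue model are illustrative assumptions, not the authors' actual implementation or hyperparameters.

```python
# Minimal sketch of a two-stage data curriculum (assumed setup, not the paper's code).
import torch
from torch.utils.data import DataLoader, TensorDataset

def train_stage(model, loader, optimizer, epochs):
    """Standard supervised training on one curriculum stage."""
    model.train()
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss = torch.nn.functional.cross_entropy(model(inputs), targets)
            loss.backward()
            optimizer.step()

def train_with_curriculum(model, distilled_augmented, original, optimizer,
                          stage1_epochs=3, stage2_epochs=3):
    """Stage 1: easier augmented distilled samples; Stage 2: original samples."""
    train_stage(model, distilled_augmented, optimizer, stage1_epochs)
    train_stage(model, original, optimizer, stage2_epochs)

if __name__ == "__main__":
    # Toy model and random tensors stand in for a dialogue model and tokenized data.
    model = torch.nn.Linear(16, 8)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    easy = DataLoader(TensorDataset(torch.randn(64, 16),
                                    torch.randint(0, 8, (64,))), batch_size=8)
    hard = DataLoader(TensorDataset(torch.randn(64, 16),
                                    torch.randint(0, 8, (64,))), batch_size=8)
    train_with_curriculum(model, easy, hard, opt)
```

In practice, the same scheme applies unchanged to a Transformer encoder-decoder or GPT2 base model: only the model, loss computation, and dataloaders over (persona, context, response) samples would differ.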