Autoregressive models used to generate responses in open-domain dialogue systems often struggle to take long-term context into account and to maintain consistency over a dialogue. Previous research in open-domain dialogue generation has shown that the use of \emph{auxiliary tasks} can introduce inductive biases that encourage the model to improve on these qualities. However, most previous research has focused on encoder-only or encoder/decoder models, while the use of auxiliary tasks in \emph{decoder-only} autoregressive models remains under-explored. This paper describes an investigation in which four different auxiliary tasks are added to small and medium-sized GPT-2 models fine-tuned on the PersonaChat and DailyDialog datasets. The results show that the introduction of the new auxiliary tasks leads to small but consistent improvements in the evaluations of the investigated models.