Powered by large pretrained language models, a variety of research efforts have integrated external knowledge into dialogue systems. The conventional approach treats knowledge as part of the input sequence, prepending a set of knowledge statements to the dialogue history. However, this mechanism forces the knowledge statements to be concatenated in a fixed order, causing models to implicitly pay imbalanced attention to them during training. In this paper, we first investigate how the order of the knowledge set influences the responses of autoregressive dialogue systems. We conduct experiments on two commonly used dialogue datasets with two types of transformer-based models and find that the models treat the input knowledge statements unequally. To address this, we propose a simple and novel technique that alleviates the order effect by modifying the position embeddings of the knowledge input. With the proposed position embedding method, experimental results show that each knowledge statement is considered uniformly when generating responses.
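The abstract does not spell out how the position embeddings are modified; the sketch below shows one plausible realization, in which the position index restarts at the beginning of every knowledge statement so that all statements occupy the same positional range and their concatenation order no longer matters. The function name and the GPT-2 usage note are illustrative assumptions, not the paper's actual implementation.

```python
import torch


def parallel_knowledge_position_ids(knowledge_lens, history_len):
    """Position ids in which every knowledge statement restarts at 0.

    All knowledge statements then share the same positional range,
    while the dialogue history continues after the longest statement.
    """
    max_k = max(knowledge_lens, default=0)
    ids = []
    for k_len in knowledge_lens:
        ids.extend(range(k_len))                   # each statement: 0 .. k_len-1
    ids.extend(range(max_k, max_k + history_len))  # history: max_k .. max_k+history_len-1
    return torch.tensor([ids], dtype=torch.long)   # shape (1, seq_len)


if __name__ == "__main__":
    # Example: three knowledge statements of 4, 6, and 5 tokens,
    # followed by a 10-token dialogue history.
    position_ids = parallel_knowledge_position_ids([4, 6, 5], history_len=10)
    print(position_ids)
    # With a Hugging Face decoder such as GPT-2, these ids can be passed as
    #   outputs = model(input_ids, position_ids=position_ids)
    # so that the learned position embeddings no longer encode the order
    # in which the knowledge statements were concatenated.
```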