This paper introduces a simple yet effective data-centric approach to improving persona-conditioned dialogue agents. Prior model-centric approaches unquestioningly rely on raw crowdsourced benchmark datasets such as Persona-Chat. In contrast, we aim to fix annotation artifacts in the benchmark itself, a fix that is orthogonally applicable to any dialogue model. Specifically, we augment the dataset with relevant personas, leveraging the primal-dual structure of the two tasks: predicting dialogue responses and predicting personas, each conditioned on the other. Experiments on Persona-Chat show that our approach outperforms pre-trained LMs by 11.7 points in accuracy.