Real human conversation data are complex, heterogeneous, and noisy, and building open-domain dialogue systems from such data remains a challenging task. In fact, such dialogue data still contain a wealth of information and knowledge; however, they have not been fully explored. In this paper, we show that existing open-domain dialogue generation methods, which memorize context-response paired data with autoregressive or encoder-decoder language models, underutilize the training data. Unlike current approaches that rely on external knowledge, we explore a retrieval-generation training framework that can take advantage of heterogeneous and noisy training data by treating them as "evidence". In particular, we use BERTScore for retrieval, which yields better quality in both the evidence and the generation. Experiments on publicly available datasets demonstrate that our method helps models generate better responses, even when the training data are generally regarded as low quality. The performance gain is comparable to, or even better than, that obtained by enlarging the training set. We also find that model performance is positively correlated with the relevance of the retrieved evidence. Moreover, our method performs well in zero-shot experiments, which indicates that it can be more robust to real-world data.
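To make the retrieval step concrete, the sketch below ranks training pairs as evidence for a query context using a BERTScore-style greedy-matching F1. This is a minimal illustration only: it substitutes token-identity matching for the cosine similarity over contextual BERT embeddings that actual BERTScore computes, and the corpus, function names, and ranking scheme are illustrative assumptions, not the paper's implementation.

```python
def greedy_f1(cand_tokens, ref_tokens):
    # BERTScore-style greedy-matching F1, with token identity standing in
    # for cosine similarity of contextual embeddings (a simplification).
    if not cand_tokens or not ref_tokens:
        return 0.0
    recall = sum(1.0 for t in ref_tokens if t in cand_tokens) / len(ref_tokens)
    precision = sum(1.0 for t in cand_tokens if t in ref_tokens) / len(cand_tokens)
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

def retrieve_evidence(query, corpus, k=2):
    # Score each (context, response) training pair by the similarity of
    # its context to the query, and return the top-k pairs as "evidence".
    q = query.lower().split()
    scored = [(greedy_f1(ctx.lower().split(), q), ctx, resp) for ctx, resp in corpus]
    scored.sort(key=lambda x: x[0], reverse=True)
    return [(ctx, resp) for _, ctx, resp in scored[:k]]

# Toy training corpus of context-response pairs (illustrative only).
corpus = [
    ("do you enjoy jazz", "yes I listen to jazz every day"),
    ("what is your favorite food", "I love sushi"),
    ("which music do you like", "mostly rock and jazz"),
]
evidence = retrieve_evidence("what music do you like", corpus, k=2)
```

In a real setup, the token-identity similarity would be replaced by BERTScore computed over contextual embeddings, so that semantically related but lexically different contexts can still be retrieved as evidence.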