Variational Learning for Unsupervised Knowledge Grounded Dialogs

Recent methods for knowledge grounded dialogs generate responses by incorporating information from an external textual document. These methods do not require the exact document to be known during training; they rely on a retrieval system to fetch relevant documents from a large index. The documents used to generate the responses are modeled as latent variables whose prior probabilities need to be estimated. Models such as RAG and REALM marginalize the document probabilities over the documents retrieved from the index to define the log-likelihood loss function, which is optimized end-to-end. In this paper, we develop a variational approach to the above technique wherein we instead maximize the Evidence Lower Bound (ELBO). Using a collection of three publicly available open-conversation datasets, we demonstrate how the posterior distribution, which has information from the ground-truth response, allows for a better approximation of the objective function during training. To overcome the challenges associated with sampling over a large knowledge collection, we develop an efficient approach to approximate the ELBO. To the best of our knowledge, we are the first to apply variational training for open-scale unsupervised knowledge grounded dialog systems.
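
To make the contrast between the two objectives concrete, the following is a minimal sketch of the textbook quantities involved, not the authors' implementation or their efficient approximation. It assumes a dialog context x, a response y, and a small set of retrieved documents z; the function names and all probability values are hypothetical and chosen only for illustration. The marginal log likelihood sums the prior-weighted response likelihood over the retrieved documents, while the ELBO replaces that sum with an expectation under a posterior q(z | x, y) minus a KL term.

```python
import math

def log_marginal(prior, likelihood):
    """log p(y|x) = log sum_z p(z|x) * p(y|x,z) over the retrieved documents."""
    return math.log(sum(p * l for p, l in zip(prior, likelihood)))

def elbo(posterior, prior, likelihood):
    """E_q[log p(y|x,z)] - KL(q(z|x,y) || p(z|x)) over the same documents."""
    expected_ll = sum(q * math.log(l) for q, l in zip(posterior, likelihood) if q > 0)
    kl = sum(q * math.log(q / p) for q, p in zip(posterior, prior) if q > 0)
    return expected_ll - kl

# Hypothetical values for three retrieved documents (illustration only).
prior = [0.5, 0.3, 0.2]           # p(z|x): retriever scores without the response
likelihood = [0.01, 0.20, 0.05]   # p(y|x,z): generator likelihood of the response
posterior = [0.1, 0.8, 0.1]       # q(z|x,y): an approximate posterior that has seen y

# The exact posterior is proportional to prior * likelihood.
evidence = sum(p * l for p, l in zip(prior, likelihood))
exact_posterior = [p * l / evidence for p, l in zip(prior, likelihood)]

marginal_value = log_marginal(prior, likelihood)
elbo_value = elbo(posterior, prior, likelihood)
tight_elbo_value = elbo(exact_posterior, prior, likelihood)

print(f"log marginal likelihood: {marginal_value:.4f}")
print(f"ELBO (approx posterior): {elbo_value:.4f}")
print(f"ELBO (exact posterior):  {tight_elbo_value:.4f}")
```

In this sketch the ELBO never exceeds the marginal log likelihood, and the gap closes when q equals the exact posterior. That is the standard property that makes the ELBO usable as a training objective when the posterior can draw on the ground-truth response.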
