Traditional systems designed for task-oriented dialog utilize only the knowledge present in structured sources to generate responses. However, information relevant to a response may also reside in unstructured sources, such as documents. Recent state-of-the-art models aimed at overcoming this challenge, such as HyKnow and SeKnow, make limiting assumptions about the knowledge sources. For instance, these systems assume that certain types of information, such as a phone number, are always present in a structured KB, while information about aspects such as entrance ticket prices is always available in documents. In this paper, we create a modified version of the MultiWOZ-based dataset prepared by SeKnow to demonstrate that current methods suffer significant performance degradation when strict assumptions about the source of information are removed. Then, in line with recent work exploiting pre-trained language models, we fine-tune a BART-based model using prompts both for querying knowledge sources and for response generation, without making assumptions about which information resides in each knowledge source. Through a series of experiments, we demonstrate that our model is robust to perturbations of knowledge modality (the source of information), and that it can fuse information from structured as well as unstructured knowledge to generate responses.
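To make the prompt-based fine-tuning concrete, the following is a minimal sketch of conditioning a BART model on a task prefix, assuming the HuggingFace transformers API; the model checkpoint, prompt format, and example texts are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch: prompt-conditioned fine-tuning of a BART model for
# knowledge querying / response generation. Illustrative only; the
# prompt format below is a hypothetical stand-in for the paper's.
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# A task prefix tells the model whether to emit a knowledge query or a
# natural-language response, without fixing which source holds the answer.
dialog_context = "User: What is the entrance fee for the museum?"
prompt = f"generate response: {dialog_context}"
target = "The entrance fee is 5 pounds."

inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids

# Standard seq2seq cross-entropy loss over the target tokens.
outputs = model(**inputs, labels=labels)
outputs.loss.backward()  # one gradient step inside a real training loop
```

Swapping the prefix (e.g. `"query knowledge:"`) would let the same model be trained on the knowledge-source querying task, which is the sense in which prompts multiplex the two tasks over one backbone.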