Multi-document grounded dialogue systems (DGDS) are a class of conversational agents that answer users' requests by finding supporting knowledge in a collection of documents. Most previous studies aim to improve the knowledge retrieval model or propose more effective ways to incorporate external knowledge into a parametric generation model. These methods, however, retrieve knowledge from language units of a single granularity (e.g., passages, sentences, or spans in documents), which is not enough to capture precise knowledge in long documents effectively and efficiently. This paper proposes Re3G, which aims to optimize both coarse-grained knowledge retrieval and fine-grained knowledge extraction in a unified framework. Specifically, the former efficiently finds relevant passages through a retrieval-and-reranking process, whereas the latter effectively extracts finer-grained spans within those passages to incorporate into a parametric answer generation model (BART, T5). Experiments on the DialDoc Shared Task demonstrate the effectiveness of our method.
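The coarse-to-fine flow described in the abstract (retrieve, rerank, extract a span, then hand the span to a generator) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the token-overlap scorers stand in for the learned retriever, reranker, and span extractor, the function names (`retrieve`, `rerank`, `extract_span`, `build_generator_input`) are hypothetical, and the final step only formats the text that a seq2seq model such as BART or T5 would receive. It is not the Re3G implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, text):
    """Toy relevance score: number of tokens shared with the query."""
    q, t = Counter(tokenize(query)), Counter(tokenize(text))
    return sum((q & t).values())


def retrieve(query, passages, k=3):
    """Coarse stage, step 1: keep the k passages with the highest overlap.
    Stand-in for a learned retriever (assumption)."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def rerank(query, candidates, k=1):
    """Coarse stage, step 2: re-score candidates with length-normalised overlap.
    Stand-in for a learned reranker (assumption)."""
    def score(p):
        return overlap_score(query, p) / (len(tokenize(p)) or 1)
    return sorted(candidates, key=score, reverse=True)[:k]


def extract_span(query, passage):
    """Fine stage: pick the sentence in the passage that best matches the query.
    Stand-in for a learned span extractor (assumption)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]
    return max(sentences, key=lambda s: overlap_score(query, s))


def build_generator_input(query, span):
    """Format the text a seq2seq generator would consume (hypothetical template)."""
    return f"question: {query} knowledge: {span}"


def answer_pipeline(query, passages):
    """Run retrieval, reranking, and span extraction, then build the generator input."""
    top_passage = rerank(query, retrieve(query, passages))[0]
    span = extract_span(query, top_passage)
    return top_passage, span, build_generator_input(query, span)


if __name__ == "__main__":
    docs = [
        "You can renew your license online. The renewal fee is 30 dollars. "
        "Bring a photo ID to the office.",
        "Parking permits are issued by the city. Permits expire every year.",
        "The office is open on weekdays. It is closed on public holidays.",
    ]
    _, span, generator_input = answer_pipeline("What is the renewal fee for a license?", docs)
    print(span)
    print(generator_input)
```

The point of the two-level design is visible even in this toy version: the coarse stage narrows a collection down to one passage cheaply, and the fine stage passes only the most relevant sentence to the generator rather than the whole passage.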