In many real-world scenarios, the absence of an external knowledge source such as Wikipedia forces question answering systems to rely on the latent internal knowledge in limited dialogue data. In addition, humans often seek answers by asking several questions to obtain more comprehensive information. As the dialogue grows longer, machines are challenged to refer back to earlier conversation rounds to answer questions. In this work, we propose to leverage latent knowledge in existing conversation logs via a neural Retrieval-Reading system, enhanced with a TF-IDF-based text summarizer that condenses lengthy conversational history to alleviate the long-context issue. Our experiments show that our Retrieval-Reading system can exploit retrieved background knowledge to generate significantly better answers. The results also indicate that our context summarizer significantly helps both the retriever and the reader by introducing more concise and less noisy contextual information.
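The abstract does not specify how the TF-IDF-based summarizer works internally; a minimal sketch of one plausible realization is to score each utterance in the dialogue history by the summed TF-IDF weight of its terms and keep only the highest-scoring utterances in their original order (the function name and scoring scheme below are illustrative assumptions, not the paper's actual method):

```python
import math
import re
from collections import Counter

def tfidf_summarize(history, top_k=2):
    """Condense a dialogue history by TF-IDF scoring (illustrative sketch).

    Each utterance is treated as a document; utterances are scored by the
    sum of their terms' TF-IDF weights, and the top_k highest-scoring
    utterances are kept in their original conversational order.
    """
    # Tokenize each utterance into lowercase word tokens.
    docs = [re.findall(r"[a-z']+", u.lower()) for u in history]
    n = len(docs)
    # Document frequency: in how many utterances does each term appear?
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Sum of TF * IDF over the utterance's distinct terms.
        s = sum((tf[t] / len(d)) * math.log(n / df[t]) for t in tf) if d else 0.0
        scores.append(s)
    # Pick the top_k indices by score, then restore original order.
    keep = sorted(sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k])
    return [history[i] for i in keep]
```

Feeding the condensed history to the retriever and reader, rather than the full log, is what would mitigate the long-context issue described above.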