Considerable progress has been made recently on open-domain question answering (QA), which requires both Information Retrieval (IR) and Reading Comprehension (RC). A popular way to improve such a system is to improve the quality of the context retrieved in the IR stage. In this work we show that for StrategyQA, a challenging open-domain QA dataset that requires multi-hop reasoning, this common approach is surprisingly ineffective -- improving the quality of the retrieved context hardly improves the system's performance. We further analyze the system's behavior to identify potential reasons.