A prominent challenge for modern language understanding systems is the ability to answer implicit reasoning questions, where the reasoning steps required to answer the question are not explicitly stated in the text. In this work, we investigate why current models struggle with implicit reasoning question answering (QA) tasks by decoupling the inference of reasoning steps from their execution. We define a new task of implicit relation inference and construct a benchmark, IMPLICITRELATIONS, where given a question, a model should output a list of concept-relation pairs, in which the relations describe the implicit reasoning steps required for answering the question. Using IMPLICITRELATIONS, we evaluate models from the GPT-3 family and find that, while these models struggle on the implicit reasoning QA task, they often succeed at inferring implicit relations. This suggests that the bottleneck for answering implicit reasoning questions lies in the ability of language models to retrieve and reason over information rather than in planning an accurate reasoning strategy.
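For concreteness, the sketch below shows one plausible way to represent the task's input and expected output as a question paired with a list of concept-relation pairs. The specific question and relations are illustrative assumptions for exposition, not items drawn from the benchmark itself.

```python
# Minimal sketch of the implicit relation inference format (illustrative only).
from typing import List, Tuple

# Each prediction is a list of (concept, relation) pairs describing the
# implicit reasoning steps needed to answer the question.
ConceptRelation = Tuple[str, str]


def example_instance() -> Tuple[str, List[ConceptRelation]]:
    # Hypothetical example: the answer requires two implicit lookups.
    question = "Did Aristotle use a laptop?"
    implicit_relations: List[ConceptRelation] = [
        ("Aristotle", "year of death"),
        ("laptop", "year of invention"),
    ]
    return question, implicit_relations


if __name__ == "__main__":
    q, relations = example_instance()
    print(q)
    for concept, relation in relations:
        print(f"  {concept} -> {relation}")
```

Under this representation, a model is judged on whether it can name the concept-relation pairs, separately from whether it can retrieve the facts and execute the comparison they imply.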