When people answer questions about a specific situation, e.g., "I cheated on my mid-term exam last week. Was that wrong?", cognitive science suggests that they form a mental picture of that situation before answering. While we do not know how language models (LMs) answer such questions, we conjecture that they may answer more accurately if they are also provided with additional details about the question situation, elaborating the "scene". To test this conjecture, we train a new model, DREAM, to answer questions that elaborate the scenes that situated questions are about, and then provide those elaborations as additional context to a question-answering (QA) model. We find that DREAM is able to create better scene elaborations (more accurate, useful, and consistent) than a representative state-of-the-art, zero-shot model (Macaw). We also find that using the scene elaborations as additional context improves the answer accuracy of a downstream QA system, including beyond that obtainable by simply further fine-tuning the QA system on DREAM's training data. These results suggest that adding focused elaborations about a situation can improve a system's reasoning about it, and may serve as an effective way of injecting new scenario-based knowledge into QA models. Finally, our approach is dataset-neutral; we observe improved QA performance across different models, with even bigger gains on models with fewer parameters. We make our dataset and model publicly available at https://github.com/allenai/dream.
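The abstract describes a two-stage pipeline: DREAM first generates a scene elaboration for the situation, and that elaboration is then concatenated into the input of a downstream QA model. Below is a minimal sketch of that pipeline, assuming the released checkpoints can be loaded via Hugging Face transformers; the checkpoint names (`allenai/DREAM`, `allenai/macaw-large`), the slot-style prompt format, and the `consequence` elaboration dimension are illustrative assumptions, not the paper's confirmed interface.

```python
# A minimal sketch of the two-stage pipeline: elaborate the scene, then
# answer with the elaboration as extra context. Checkpoint names and the
# slot-style prompt format below are assumptions, not a confirmed interface.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


def generate(model_name: str, prompt: str) -> str:
    """Load a seq2seq LM and decode a single output for the given prompt."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=128)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


situation = "I cheated on my mid-term exam last week."
question = "Was that wrong?"

# Stage 1: ask DREAM to elaborate one dimension of the scene (here, the
# likely consequence); the [SITUATION]/[QUERY] tags are an assumption.
elaboration = generate(
    "allenai/DREAM",
    f"$answer$ ; $question$ = [SITUATION] {situation} [QUERY] consequence",
)

# Stage 2: feed the elaboration to the QA model as additional context.
answer = generate(
    "allenai/macaw-large",
    f"$answer$ ; $context$ = {situation} {elaboration} ; $question$ = {question}",
)
print(answer)
```

For brevity the sketch reloads a model on every call; in practice the models would be loaded once and reused, and the elaboration step could be repeated per dimension (e.g., motivation, emotion, consequence) with the outputs concatenated into the QA context.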