In question answering that requires common sense, large language models (e.g., GPT-3) have been used to generate text expressing background knowledge that improves performance. Yet such models are expensive to work with; in this work, we finetune smaller language models to generate useful intermediate context, referred to here as elaborations. Our framework alternates between updating two language models -- an elaboration generator and an answer predictor -- allowing each to influence the other. Using less than 0.5% of GPT-3's parameters, our model outperforms similarly sized alternatives and narrows the gap to GPT-3 on four commonsense question answering benchmarks. Human evaluations show that the quality of the generated elaborations is high.
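The alternating scheme can be pictured as a simple two-phase training loop. The sketch below is an illustration of that idea, not the paper's implementation; the data layout and all callables (`generate`, `predictor_score`, `update_predictor`, `update_generator`) and the candidate-selection rule in phase 2 are assumptions made for exposition.

```python
"""Minimal sketch (not the authors' code) of alternating updates between
an elaboration generator and an answer predictor. All interfaces here are
hypothetical placeholders."""

from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Example:
    question: str
    choices: List[str]
    answer: int  # index of the gold choice


def alternate_train(
    data: Iterable[Example],
    generate: Callable[[str, int], List[str]],           # question, k -> k elaborations
    predictor_score: Callable[[str, str, int], float],   # question, elaboration, answer -> score
    update_predictor: Callable[[str, str, Example], None],
    update_generator: Callable[[str, str], None],
    rounds: int = 3,
    k: int = 8,
) -> None:
    """Alternate between the two models so each influences the other."""
    examples = list(data)
    for _ in range(rounds):
        # Phase 1: train the answer predictor on (question + elaboration),
        # holding the elaboration generator fixed.
        for ex in examples:
            elaboration = generate(ex.question, 1)[0]
            update_predictor(ex.question, elaboration, ex)

        # Phase 2: train the generator toward elaborations the now-frozen
        # predictor finds helpful; here, as one plausible rule, we keep the
        # sample that maximizes the predictor's score for the gold answer.
        for ex in examples:
            candidates = generate(ex.question, k)
            best = max(
                candidates,
                key=lambda e: predictor_score(ex.question, e, ex.answer),
            )
            update_generator(ex.question, best)
```

The design choice this illustrates is the feedback loop in the abstract: the predictor is trained on the generator's outputs, and the generator is in turn steered toward elaborations that the predictor can exploit, letting each model influence the other across rounds.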