Transformer-based language models (LMs) demonstrate increasing performance with scale across a wide variety of tasks. Scale alone, however, cannot enable models to solve tasks that require access to ephemeral, changing, or private data that was unavailable at training time. Many useful tasks may also benefit from LMs being able to access APIs that read or modify state. In this work, we present Tool Augmented Language Models (TALM), combining a text-only approach to augment language models with non-differentiable tools, and an iterative "self-play" technique to bootstrap performance starting from few tool demonstrations. TALM exhibits strong performance on both a knowledge-heavy QA task and a reasoning-oriented math task with simple tools. At a given model scale, TALM significantly outperforms non-augmented LMs. We further demonstrate that TALM successfully performs out-of-distribution inferences on both QA and math tasks, where non-augmented LMs fail. Our results suggest that Tool Augmented Language Models are a promising direction to enrich LMs' capabilities, with less dependence on scale.
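To make the text-only tool interface concrete, the following is a minimal sketch of a tool-augmented inference loop in the spirit described above: the model emits a tool call as plain text, a non-differentiable tool is run, and its result is appended to the sequence before decoding resumes. The delimiter strings, the `generate` stub, and the calculator tool are illustrative assumptions for this sketch, not the paper's exact interface.

```python
# Minimal sketch of a text-only tool-use loop (illustrative, not the paper's code).
# Assumptions: delimiter tokens, a stubbed `generate` function, and a toy
# calculator standing in for a real non-differentiable tool.

TOOL_CALL = "|tool-call|"
TOOL_RESULT = "|result|"

def calculator(expression: str) -> str:
    """Toy non-differentiable tool: evaluate an arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}, {}))

def generate(prompt: str) -> str:
    """Stand-in for an LM decoding step; a real system would sample from a model."""
    if TOOL_RESULT not in prompt:
        # Hypothetical canned continuation: the model requests the tool.
        return f" {TOOL_CALL} 17 * 23 {TOOL_RESULT}"
    # After seeing the tool result, the model produces the final answer.
    return " So the answer is 391."

def talm_infer(prompt: str) -> str:
    # Step 1: decode until the model emits a tool call (or a final answer).
    text = prompt + generate(prompt)
    if TOOL_CALL in text:
        # Step 2: extract the tool input between the delimiters.
        call = text.split(TOOL_CALL, 1)[1].split(TOOL_RESULT, 1)[0].strip()
        # Step 3: run the non-differentiable tool and append its output as text.
        text += " " + calculator(call)
        # Step 4: resume decoding conditioned on the tool result.
        text += generate(text)
    return text

print(talm_infer("Q: What is 17 * 23? A:"))
```

Because the tool interface is purely textual, no gradient flows through the tool; training on such traces (and bootstrapping more of them via self-play, as the abstract describes) only requires examples of well-formed call-and-result sequences.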