Humans possess an extraordinary ability to create and use tools, allowing them to overcome physical limitations and explore new frontiers. With the advent of foundation models, AI systems have the potential to become as adept at tool use as humans. This paradigm, tool learning with foundation models, combines the strengths of specialized tools and foundation models to achieve enhanced accuracy, efficiency, and automation in problem-solving. Despite its immense potential, the field still lacks a comprehensive understanding of its key challenges, opportunities, and future directions. To this end, we present a systematic investigation of tool learning in this paper. We first introduce the background of tool learning, including its cognitive origins, the paradigm shift brought by foundation models, and the complementary roles of tools and models. We then categorize existing tool learning research into tool-augmented and tool-oriented learning. We formulate a general tool learning framework: starting from understanding the user instruction, models should learn to decompose a complex task into several subtasks, dynamically adjust their plan through reasoning, and effectively solve each subtask by selecting appropriate tools. We also discuss how to train models for improved tool-use capabilities and how to facilitate generalization in tool learning. Because prior work lacks a systematic evaluation of tool learning, we experiment with 17 representative tools and show the potential of current foundation models to use tools skillfully. Finally, we discuss several open problems in tool learning that require further investigation. Overall, we hope this paper will inspire future research on integrating tools with foundation models.
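The general framework described in the abstract (decompose the instruction into subtasks, select a tool for each subtask, and adjust the plan as execution proceeds) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `TOOLS` registry, the `decompose`, `select_tool`, and `replan` helpers, and the `tool: argument` instruction format are hypothetical stand-ins for what a foundation model would do, not the paper's implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; names and behaviors are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda arg: str(sum(int(x) for x in arg.split("+"))),
    "upper": lambda arg: arg.upper(),
}


def decompose(instruction: str) -> List[str]:
    """Toy stand-in for model-based decomposition: split on ' then '."""
    return [part.strip() for part in instruction.split(" then ") if part.strip()]


def select_tool(subtask: str) -> Tuple[str, str]:
    """Toy stand-in for model-based tool selection: parse 'tool: argument'."""
    name, _, arg = subtask.partition(":")
    return name.strip(), arg.strip()


def replan(failed: str, remaining: List[str]) -> List[str]:
    """Toy plan adjustment: drop the failed subtask and keep the rest.

    A real system would ask the model to revise the plan using the error.
    """
    return list(remaining)


def run(instruction: str) -> List[str]:
    """Execute an instruction: decompose, pick a tool per subtask, adapt on failure."""
    plan = decompose(instruction)
    results: List[str] = []
    while plan:
        subtask, plan = plan[0], plan[1:]
        name, arg = select_tool(subtask)
        try:
            results.append(TOOLS[name](arg))
        except Exception:
            # Unknown tool or tool error: record the skip and adjust the plan.
            results.append(f"[skipped: {subtask}]")
            plan = replan(subtask, plan)
    return results


if __name__ == "__main__":
    print(run("calculator: 2+3 then upper: done"))
```

In this sketch the string parsing only marks where a foundation model's reasoning would sit; the loop structure (plan, select, execute, revise) is the part that corresponds to the framework sentence above.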