FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling

Zengzhuang Xu, Bingguang Hao, Zechuan Wang, Yuntao Wen, Maolin Wang, Yang Liu, Long Chen, Dong Wang, Yicheng Chen, Cunyin Peng, Chenyi Zhuang, Jinjie Gu, Leilei Gan, Xiangyu Zhao, Shi Gu

Function calling (FC) empowers large language models (LLMs) and autonomous agents to interface with external tools, a critical capability for solving complex, real-world problems. As this ability becomes increasingly central to advanced AI systems, the need for high-quality, multi-turn training data to develop and refine it cannot be overstated. Existing data synthesis methods, such as random environment sampling or multi-agent role-playing, are not powerful enough to generate high-quality data in real-world environments. The practical challenges are threefold: targeted model training, isolation of tool architecture, and multi-turn logical dependency. To address these structural deficiencies, we present FunReason-MT, a novel data synthesis framework for real-world multi-turn tool use. FunReason-MT resolves the complexity barrier in multi-turn FC data by employing 1) Environment-API Graph Interactions to gather varied high-quality trajectories, 2) Advanced Tool-Query Synthesis to simplify hard query construction, and 3) Guided Iterative Chain for sophisticated chain-of-thought (CoT) generation. Evaluations on the Berkeley Function-Calling Leaderboard (BFCLv3) demonstrate the power of our framework: a 4B model built upon FunReason-MT-generated data achieves state-of-the-art performance among comparable-sized models, outperforming most closed-source models. Further performance improvements on BFCLv4 confirm that FunReason-MT provides a reliable and robust source for agentic learning.

