Large Language Models (LLMs) can overcome the limits of their knowledge and its timeliness by invoking external tools through the Model Context Protocol (MCP), enabling automated execution of complex tasks. However, with the rapid growth of enterprise-scale MCP services, efficiently and accurately matching the target functionality among thousands of heterogeneous tools has become a core challenge limiting system practicality. Existing approaches generally rely on full-prompt injection or static semantic retrieval, and suffer from a semantic disconnect between user queries and tool descriptions, context inflation in the LLM input, and high inference latency. To address these challenges, this paper proposes Z-Space, a data-generation-oriented multi-agent collaborative tool invocation framework. Z-Space combines a multi-agent collaborative architecture with a tool filtering algorithm: (1) an intent parsing model produces a structured semantic understanding of user queries; (2) a tool filtering module (FSWW), based on a fused subspace weighted algorithm, achieves fine-grained semantic alignment between intents and tools without parameter tuning; (3) an inference execution agent supports dynamic planning and fault-tolerant execution of multi-step tasks. The framework has been deployed in the Eleme platform's technical division, where it serves large-scale test data generation scenarios across multiple business units, including Taotian, Gaode, and Hema. Production data show that the system reduces average token consumption in tool inference by 96.26% while achieving 92% tool invocation accuracy, significantly improving the efficiency and reliability of intelligent test data generation systems.
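To make the idea of tool filtering concrete, the following is a minimal sketch of narrowing a tool catalog before prompting, so that only a few candidate tool descriptions reach the LLM instead of the full list. It is not the FSWW algorithm, whose details are not given here: it substitutes a plain bag-of-words cosine similarity, and the tool names, descriptions, and the `filter_tools` helper are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_tools(intent, tools, k=2):
    """Return the names of the k tools most similar to the intent.

    Stand-in for a real filtering module: a simple lexical similarity,
    NOT the fused subspace weighting described in the abstract.
    Tools with zero similarity are dropped.
    """
    intent_vec = Counter(tokenize(intent))
    scored = [
        (cosine(intent_vec, Counter(tokenize(desc))), name)
        for name, desc in tools.items()
    ]
    # Highest score first; ties broken alphabetically by tool name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:k] if score > 0]


# Hypothetical tool catalog (illustrative names and descriptions).
TOOLS = {
    "create_order": "create a test order for a user in a shop",
    "create_user": "create a test user account",
    "query_weather": "query the weather forecast for a city",
    "refund_order": "refund an existing order",
}

selected = filter_tools("create a test order for shop 42", TOOLS, k=2)
print(selected)
```

Only the descriptions of the tools in `selected` would then be placed in the LLM prompt. Shrinking the candidate set in this way is the general mechanism by which a filtering step can cut prompt tokens, although the savings reported in the abstract come from the authors' own method, not from this sketch.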