Federated Learning (FL) offers a powerful paradigm for training models on decentralized data, but its promise is often undermined by the immense complexity of designing and deploying robust systems. The need to select, combine, and tune strategies for multifaceted challenges like data heterogeneity and system constraints has become a critical bottleneck, resulting in brittle, bespoke solutions. To address this, we introduce Helmsman, a novel multi-agent system that automates the end-to-end synthesis of federated learning systems from high-level user specifications. It emulates a principled research and development workflow through three collaborative phases: (1) interactive human-in-the-loop planning to formulate a sound research plan, (2) modular code generation by supervised agent teams, and (3) a closed loop of autonomous evaluation and refinement in a sandboxed simulation environment. To facilitate rigorous evaluation, we also introduce AgentFL-Bench, a new benchmark comprising 16 diverse tasks designed to assess the system-level generation capabilities of agentic systems in FL. Extensive experiments demonstrate that our approach generates solutions competitive with, and often superior to, established hand-crafted baselines. Our work represents a significant step towards the automated engineering of complex decentralized AI systems.