Can an AI autonomously design mechanisms for computer systems on par with the creativity and reasoning of human experts? We present Glia, an AI architecture for networked systems design that uses large language models (LLMs) in a human-inspired, multi-agent workflow. Each agent specializes in reasoning, experimentation, and analysis, collaborating through an evaluation framework that grounds abstract reasoning in empirical feedback. Unlike prior ML-for-systems methods that optimize black-box policies, Glia generates interpretable designs and exposes its reasoning process. When applied to a distributed GPU cluster for LLM inference, it produces new algorithms for request routing, scheduling, and auto-scaling that perform at human-expert levels in significantly less time, while yielding novel insights into workload behavior. Our results suggest that by combining reasoning LLMs with structured experimentation, an AI can produce creative and understandable designs for complex systems problems.