Software architecture design is fundamental to building any software system. Yet producing a software architecture model in C4, a widely adopted notation for this purpose, remains manual and time-consuming. We introduce an LLM-based multi-agent system that automates this task by simulating a dialogue between role-specific experts, who analyze the requirements and generate the Context, Container, and Component views of the C4 model. Quality is assessed with a hybrid evaluation framework that combines deterministic checks of structural and syntactic integrity and of C4 rule consistency with semantic and qualitative scoring via an LLM-as-a-Judge approach. Tested on five canonical system briefs, the workflow creates C4 models quickly, sustains a high compilation success rate, and achieves good semantic fidelity. A comparison of four state-of-the-art LLMs reveals distinct strengths relevant to architectural design. This study contributes to automated software architecture design and to methods for evaluating it.
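To make the deterministic half of the hybrid evaluation more concrete, the following is a minimal sketch of what a C4 rule-consistency check could look like. It is an illustration only, not the checks used in this study: the `Element` and `Model` classes, the `ALLOWED_PARENT` table, and the `check_model` function are hypothetical names, and the two rules shown (containers nest in software systems and components nest in containers; every relationship endpoint must be a defined element) are assumptions drawn from the general C4 hierarchy.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical, simplified in-memory C4 model (not the paper's representation).
@dataclass
class Element:
    id: str
    kind: str                     # "person", "software_system", "container", "component"
    parent: Optional[str] = None  # id of the enclosing element, if any

@dataclass
class Model:
    elements: dict = field(default_factory=dict)       # id -> Element
    relationships: list = field(default_factory=list)  # (source_id, target_id)

# Assumed nesting rule: which kind of parent each element kind requires.
ALLOWED_PARENT = {
    "person": None,
    "software_system": None,
    "container": "software_system",
    "component": "container",
}

def check_model(model: Model) -> list:
    """Return a list of human-readable rule violations (empty if none)."""
    errors = []
    for e in model.elements.values():
        if e.kind not in ALLOWED_PARENT:
            errors.append(f"{e.id}: unknown element kind {e.kind!r}")
            continue
        expected = ALLOWED_PARENT[e.kind]
        if expected is None:
            # Top-level kinds must not be nested inside another element.
            if e.parent is not None:
                errors.append(f"{e.id}: {e.kind} must be top-level")
        else:
            parent = model.elements.get(e.parent) if e.parent else None
            if parent is None or parent.kind != expected:
                errors.append(f"{e.id}: {e.kind} must be nested in a {expected}")
    # Every relationship endpoint must refer to a defined element.
    for src, dst in model.relationships:
        for end in (src, dst):
            if end not in model.elements:
                errors.append(f"relationship {src}->{dst}: undefined element {end!r}")
    return errors
```

Checks of this kind are cheap and reproducible, which is why they pair naturally with the more subjective LLM-as-a-Judge scoring: a model that fails them can be rejected or sent back for repair before any semantic assessment takes place.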