The capacity of large language models (LLMs) to reason in natural language makes them uniquely promising for 4X and grand strategy games, enabling more natural human-AI gameplay interactions such as collaboration and negotiation. However, these games are challenging because of their complexity and long horizons, and latency and cost may hinder real-world deployment of LLMs. Using a classic 4X strategy game, Sid Meier's Civilization V with the Vox Populi mod, as a testbed, we introduce Vox Deorum, a hybrid LLM+X architecture. Its layered technical design lets LLMs handle macro-strategic reasoning while delegating tactical execution to subsystems (e.g., algorithmic AI or, in the future, reinforcement learning AI). We validate our approach through 2,327 complete games, comparing two open-source LLMs driven by a simple prompt against Vox Populi's enhanced AI. Results show that LLMs achieve competitive end-to-end gameplay while exhibiting play styles that diverge substantially from algorithmic AI and from each other. Our work establishes a viable architecture for integrating LLMs in commercial 4X games, opening new opportunities for game design and agentic AI research.