Web World Models

Language agents increasingly require persistent worlds in which they can act, remember, and learn. Existing approaches sit at two extremes: conventional web frameworks provide reliable but fixed contexts backed by databases, while fully generative world models aim for unlimited environments at the expense of controllability and practical engineering. In this work, we introduce the Web World Model (WWM), a middle ground where world state and "physics" are implemented in ordinary web code to ensure logical consistency, while large language models generate context, narratives, and high-level decisions on top of this structured latent state. We build a suite of WWMs on a realistic web stack, including an infinite travel atlas grounded in real geography, fictional galaxy explorers, web-scale encyclopedic and narrative worlds, and simulation- and game-like environments. Across these systems, we identify practical design principles for WWMs: separating code-defined rules from model-driven imagination, representing latent state as typed web interfaces, and utilizing deterministic generation to achieve unlimited but structured exploration. Our results suggest that web stacks themselves can serve as a scalable substrate for world models, enabling controllable yet open-ended environments. Project Page: https://github.com/Princeton-AI2-Lab/Web-World-Models.

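
The three design principles named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Planet` type, the `BIOMES` list, and the functions `generate_planet`, `build_prompt`, and `describe` are hypothetical names chosen for illustration. Python is used only for brevity, whereas the paper's systems run on a web stack. The sketch assumes that "deterministic generation" can be approximated by hashing a world seed together with coordinates, so that the same location always yields the same code-defined state, and that the language model is only asked to narrate that state.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable

# Hypothetical biome vocabulary; an assumption for this sketch only.
BIOMES = ("desert", "ocean", "tundra", "jungle", "volcanic")


@dataclass(frozen=True)
class Planet:
    """Typed latent state: every field is decided by code, not by a model."""
    x: int
    y: int
    biome: str
    gravity: float  # in units of Earth gravity
    has_water: bool


def _seed(world_seed: str, x: int, y: int) -> int:
    # SHA-256 is stable across runs and machines, unlike Python's built-in hash().
    digest = hashlib.sha256(f"{world_seed}:{x}:{y}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def generate_planet(world_seed: str, x: int, y: int) -> Planet:
    """Code-defined 'physics': the same seed and coordinates give the same planet."""
    s = _seed(world_seed, x, y)
    biome = BIOMES[s % len(BIOMES)]
    gravity = round(0.5 + ((s >> 8) % 150) / 100, 2)  # range 0.50 to 1.99
    has_water = biome in ("ocean", "jungle") or (s >> 16) % 4 == 0
    return Planet(x=x, y=y, biome=biome, gravity=gravity, has_water=has_water)


def build_prompt(planet: Planet) -> str:
    """Expose the fixed state to the model and ask only for narrative on top of it."""
    return (
        "Describe this planet for a traveler. Do not contradict these facts:\n"
        f"- coordinates: ({planet.x}, {planet.y})\n"
        f"- biome: {planet.biome}\n"
        f"- gravity: {planet.gravity} g\n"
        f"- surface water: {'yes' if planet.has_water else 'no'}"
    )


def describe(planet: Planet, llm: Callable[[str], str]) -> str:
    """Model-driven imagination: `llm` is any text-in, text-out callable."""
    return llm(build_prompt(planet))
```

Because nothing is stored, the world is unbounded in this sketch: any coordinate can be visited, and revisiting it reproduces the same typed state. The model's output may vary between calls, but the facts it is asked to respect do not.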