Agentic exploration, in which LLM-powered agents branch, backtrack, and search across many execution paths, demands systems support well beyond today's pass-at-k resets. Our benchmark of six snapshot/restore mechanisms shows that generic tools such as CRIU or container commits are too slow even in isolated testbeds, and they break down entirely in real deployments where agents share files, sockets, and cloud APIs with other agents and human users. In this talk, we pinpoint three fundamental open challenges: fork semantics, i.e., how branches reveal or hide tentative updates; external side effects, which require either making services fork-aware or intercepting their calls; and native forking, which requires cloning databases and runtimes in microseconds without bulk copying.
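The fork-semantics challenge can be illustrated by analogy with OS-level copy-on-write process forking, where a child's tentative updates are invisible to the parent. The sketch below is a hypothetical illustration of that isolation property (the `fork_branch` helper is invented for this example, not an API from the talk); it is not the microsecond-scale database/runtime forking the talk calls for.

```python
import os

def fork_branch(state):
    """Run a speculative mutation in a forked child process.

    Returns (child_view, parent_view): the child sees its tentative
    update via its private copy-on-write pages, while the parent's
    copy of `state` remains untouched.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the "branch". Its write lands on private CoW pages.
        os.close(r)
        state["balance"] -= 40          # tentative update, hidden from parent
        os.write(w, str(state["balance"]).encode())
        os._exit(0)
    # Parent: waits for the branch, then observes its own unchanged state.
    os.close(w)
    os.waitpid(pid, 0)
    child_view = int(os.read(r, 64).decode())
    os.close(r)
    return child_view, state["balance"]

child_view, parent_view = fork_branch({"balance": 100})
print(child_view, parent_view)  # 60 100
```

Process fork gives this isolation essentially for free via the MMU, which is precisely what databases and runtimes lack natively; replicating it there without bulk copying is the "native forking" challenge.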