Securing AI Agent Execution

Large language models (LLMs) have evolved into AI agents that interact with external tools and environments to perform complex tasks. The Model Context Protocol (MCP) has become the de facto standard for connecting agents with such resources, but security has lagged behind: thousands of MCP servers execute with unrestricted access to host systems, creating a broad attack surface. In this paper, we introduce AgentBound, the first access control framework for MCP servers. AgentBound combines a declarative policy mechanism, inspired by the Android permission model, with a policy enforcement engine that contains malicious behavior without requiring modifications to MCP servers. We build a dataset of the 296 most popular MCP servers and show that access control policies can be generated automatically from source code with 80.9% accuracy. We also show that AgentBound blocks the majority of security threats in several malicious MCP servers, and that the policy enforcement engine introduces negligible overhead. Our contributions give developers and project managers a practical foundation for securing MCP servers while maintaining productivity, and enable researchers and tool builders to explore new directions in declarative access control and MCP security.
