Detecting vulnerabilities in source code remains a critical yet challenging task, especially when benign and vulnerable functions share significant similarities. In this work, we introduce VulTrial, a courtroom-inspired multi-agent framework designed to identify vulnerable code and to provide explanations for its decisions. It employs four role-specific agents: a security researcher, a code author, a moderator, and a review board. Using GPT-4o as the base LLM, VulTrial nearly doubles the effectiveness of the best prior baselines. We further show that role-specific instruction tuning on a small amount of data yields significant additional gains. Extensive experiments demonstrate VulTrial's effectiveness across different LLMs, including an open-source model deployable in-house (LLaMA-3.1-8B), the high quality of its generated explanations, and its ability to uncover multiple confirmed zero-day vulnerabilities in the wild.
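To make the courtroom analogy concrete, the sketch below shows one plausible way a four-agent debate loop of this kind could be orchestrated over a shared transcript. It is a minimal sketch only: the `ask` helper, the role prompts, the two-round structure, and the transcript format are illustrative assumptions, not VulTrial's actual implementation.

```python
# Minimal sketch of a courtroom-style, four-agent review loop, assuming a
# generic OpenAI chat-completion client. Role prompts, round count, and the
# transcript format are illustrative guesses, not VulTrial's implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ROLES = {
    "researcher": "You are a security researcher. Argue whether the code is "
                  "vulnerable, citing concrete weaknesses.",
    "author": "You are the code author. Rebut the researcher's accusations "
              "or concede valid points.",
    "moderator": "You are a neutral moderator. Summarize both sides' "
                 "arguments for the review board.",
    "board": "You are a review board. Given the full transcript, output a "
             "verdict (vulnerable / benign) with a concise explanation.",
}

def ask(role: str, transcript: str, model: str = "gpt-4o") -> str:
    """One turn of the given agent over the shared transcript."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": ROLES[role]},
            {"role": "user", "content": transcript},
        ],
    )
    return resp.choices[0].message.content

def vultrial(code: str, rounds: int = 2) -> str:
    transcript = f"Function under review:\n{code}"
    for _ in range(rounds):
        # Prosecution and defense exchange arguments; the moderator condenses them.
        transcript += "\n[Researcher] " + ask("researcher", transcript)
        transcript += "\n[Author] " + ask("author", transcript)
        transcript += "\n[Moderator] " + ask("moderator", transcript)
    # The review board delivers the final verdict together with its explanation.
    return ask("board", transcript)

if __name__ == "__main__":
    print(vultrial("char buf[8]; strcpy(buf, user_input);"))
```

Keeping the adversarial exchange (researcher vs. author) separate from the judgment (moderator plus review board) is what distinguishes this setup from a single-prompt classifier: the final agent rules on an argued record rather than on the raw code alone.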