Large Language Models (LLMs) show significant promise in automating software vulnerability analysis, a critical task given the impact of security failures in modern software systems. However, current approaches to automating vulnerability analysis with LLMs mostly rely on online, API-based LLM services, requiring users to disclose source code under development. Moreover, they predominantly frame the task as binary classification (vulnerable or not vulnerable), limiting their practical utility. This paper addresses these limitations by reformulating the problem as Software Vulnerability Identification (SVI), in which LLMs output the weakness type as a Common Weakness Enumeration (CWE) ID rather than merely indicating the presence or absence of a vulnerability. We also tackle the reliance on large, API-based LLMs by demonstrating that instruction-tuning smaller, locally deployable LLMs can achieve superior identification performance. In our analysis, an instruction-tuned local LLM showed a better overall performance-cost trade-off than online API-based LLMs. Our findings indicate that instruction-tuned local models represent a more effective, secure, and practical approach to leveraging LLMs in real-world vulnerability management workflows.