Exploring the Security Threats of Retriever Backdoors in Retrieval-Augmented Code Generation

Retrieval-Augmented Code Generation (RACG) is increasingly adopted to enhance Large Language Models for software development, yet its security implications remain dangerously underexplored. This paper conducts the first systematic exploration of a critical and stealthy threat: backdoor attacks targeting the retriever component, which represent a significant supply-chain vulnerability. Until now, it has been infeasible to assess this threat realistically, because existing attack methods are either too ineffective to pose a real danger or are easily detected by state-of-the-art defense mechanisms spanning both latent-space analysis and token-level inspection, which achieve consistently high detection rates. To overcome this barrier and enable a realistic analysis, we first develop VenomRACG, a new class of potent and stealthy attack that serves as the vehicle for our investigation. Its design makes poisoned samples statistically indistinguishable from benign code, so the attack maintains consistently low detectability across all evaluated defense mechanisms. Armed with this capability, our exploration reveals a severe vulnerability: by injecting vulnerable code amounting to only 0.05% of the knowledge base, an attacker can manipulate the backdoored retriever into ranking the vulnerable code among its top-5 results in 51.29% of cases. This translates into severe downstream harm, causing models such as GPT-4o to generate vulnerable code in over 40% of targeted scenarios while leaving the system's general performance intact. Our findings establish that retriever backdooring is not a theoretical concern but a practical threat to the software development ecosystem, one to which current defenses are blind, highlighting the urgent need for robust security measures.
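
To make the two headline quantities concrete, the following is a minimal sketch of how a poisoning budget and a top-k attack success rate could be computed. It is not the paper's evaluation code: the function names `poison_budget` and `attack_success_rate_at_k`, the document identifiers, and the example knowledge-base size of 1,000,000 snippets are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set


def poison_budget(kb_size: int, rate: float = 0.0005) -> int:
    """Number of injected snippets for a given poisoning rate.

    A rate of 0.0005 corresponds to the 0.05% figure quoted in the abstract.
    """
    return round(kb_size * rate)


def attack_success_rate_at_k(
    rankings: Sequence[Sequence[str]],
    poisoned_ids: Set[str],
    k: int = 5,
) -> float:
    """Fraction of targeted queries whose top-k results contain
    at least one poisoned snippet.

    `rankings` holds one ranked list of document ids per targeted query,
    best match first (an assumed input format).
    """
    if not rankings:
        return 0.0
    hits = sum(
        1
        for ranked in rankings
        if any(doc_id in poisoned_ids for doc_id in ranked[:k])
    )
    return hits / len(rankings)


if __name__ == "__main__":
    # Hypothetical knowledge base of 1,000,000 snippets.
    print(poison_budget(1_000_000))  # 500

    # Toy retrieval results for four targeted queries.
    toy_rankings = [
        ["a", "p1", "b"],                  # poisoned snippet at rank 2
        ["c", "d", "e"],                   # no poisoned snippet
        ["p2", "x", "y"],                  # poisoned snippet at rank 1
        ["x", "y", "z", "w", "v", "p1"],   # poisoned snippet at rank 6
    ]
    print(attack_success_rate_at_k(toy_rankings, {"p1", "p2"}, k=5))  # 0.5
```

Under this reading, the 51.29% figure would correspond to `attack_success_rate_at_k` with `k=5` over the targeted queries, though the paper's exact definition of a successful top-5 hit may differ.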
