Code models, such as CodeBERT and CodeT5, offer general-purpose representations of code and play a vital role in supporting downstream automated software engineering tasks. Recently, code models have been shown to be vulnerable to backdoor attacks. A backdoored code model behaves normally on clean examples but produces pre-defined malicious outputs on examples injected with triggers that activate the backdoor. Existing backdoor attacks on code models use triggers that are not stealthy and are easy to detect. This paper investigates the vulnerability of code models to stealthy backdoor attacks. To this end, we propose AFRAIDOOR (Adversarial Feature as Adaptive Backdoor). AFRAIDOOR achieves stealthiness by leveraging adversarial perturbations to inject adaptive triggers into different inputs. We evaluate AFRAIDOOR on three widely adopted code models (CodeBERT, PLBART and CodeT5) and two downstream tasks (code summarization and method name prediction). We find that around 85% of the adaptive triggers in AFRAIDOOR bypass detection in the defense process. By contrast, fewer than 12% of the triggers from previous work bypass the defense. When the defense method is not applied, both AFRAIDOOR and the baselines achieve almost perfect attack success rates. However, once a defense is applied, the success rates of the baselines decrease dramatically to 10.47% and 12.06%, while the success rates of AFRAIDOOR are 77.05% and 92.98% on the two tasks. Our findings expose security weaknesses in code models under stealthy backdoor attacks and show that the state-of-the-art defense method cannot provide sufficient protection. We call for more research efforts in understanding security threats to code models and developing more effective countermeasures.