Pre-trained programming language (PL) models (such as CodeT5, CodeBERT, and GraphCodeBERT) have the potential to automate software engineering tasks involving code understanding and code generation. However, these models operate in the natural channel of code, i.e., they are primarily concerned with the human understanding of the code. They are not robust to changes in the input and thus are potentially susceptible to adversarial attacks in the natural channel. We propose CodeAttack, a simple yet effective black-box attack model that uses code structure to generate effective, efficient, and imperceptible adversarial code samples, and demonstrate the vulnerabilities of state-of-the-art PL models to code-specific adversarial attacks. We evaluate the transferability of CodeAttack on several code-code (translation and repair) and code-NL (summarization) tasks across different programming languages. CodeAttack outperforms state-of-the-art adversarial NLP attack models, achieving the largest overall drop in performance while being more efficient, imperceptible, consistent, and fluent. The code can be found at https://github.com/reddy-lab-code-research/CodeAttack.
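The abstract does not spell out CodeAttack's exact perturbation strategy, but the core idea of an imperceptible, structure-preserving adversarial code sample can be illustrated with a minimal sketch. The example below (our own illustration, not CodeAttack's algorithm) renames a single identifier to a visually similar one using Python's tokenizer, so the perturbed program stays syntactically valid and behaviorally identical while its token sequence, as seen by a PL model, changes:

```python
import io
import tokenize

def rename_identifier(source: str, old: str, new: str) -> str:
    """Illustrative helper (not part of CodeAttack): rewrite one
    identifier while leaving every other token untouched, so the
    perturbed code remains syntactically valid."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == old:
            out.append((tok.type, new))
        else:
            out.append((tok.type, tok.string))
    return tokenize.untokenize(out)

original = "def add(total, x):\n    return total + x\n"
# A visually similar replacement name -- imperceptible to a human reader.
perturbed = rename_identifier(original, "total", "tota1")

# Both versions compile and behave identically, yet present
# different token sequences to a code model.
ns_a, ns_b = {}, {}
exec(original, ns_a)
exec(perturbed, ns_b)
assert ns_a["add"](2, 3) == ns_b["add"](2, 3) == 5
assert original != perturbed
```

A black-box attacker would iterate perturbations like this, querying the victim model after each one and keeping changes that degrade its output the most, without ever needing access to the model's gradients or parameters.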