Pre-trained code generation models (PCGMs) have been widely applied to neural code generation, which produces executable code from functional descriptions in natural language, possibly together with signatures. Despite substantial improvements in PCGM performance, the role of method names in neural code generation has not been thoroughly investigated. In this paper, we study and demonstrate the potential of method names to enhance the performance of PCGMs, from a model robustness perspective. Specifically, we propose a novel approach named RADAR (neuRAl coDe generAtor Robustifier). RADAR consists of two components: RADAR-Attack and RADAR-Defense. The former attacks a PCGM by generating adversarial method names as part of the input; these names are semantically and visually similar to the original input, but may trick the PCGM into generating completely unrelated code snippets. As a countermeasure to such attacks, RADAR-Defense synthesizes a new method name from the functional description and supplies it to the PCGM. Evaluation results show that RADAR-Attack can, e.g., reduce the CodeBLEU of generated code by 19.72% to 38.74% for three state-of-the-art PCGMs (i.e., CodeGPT, PLBART, and CodeT5). Moreover, RADAR-Defense is able to reinstate the performance of PCGMs with synthesized method names. These results highlight the importance of good method names in neural code generation and suggest the benefits of studying model robustness in software engineering.
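To make the idea of an adversarial method name concrete, the following is a minimal Python sketch of how near-identical candidate names might be enumerated and ranked. It is an illustration only and not the RADAR-Attack algorithm: the synonym table `SYNONYMS`, the adjacent-character swap, the helper names, and the `score` callback (a stand-in for some quality measure of the code a PCGM generates when given that name) are all assumptions introduced here, and the sketch assumes camelCase input names.

```python
import re
from typing import Callable, Dict, List

# Hypothetical synonym table (assumption); a real attack would need a richer source.
SYNONYMS: Dict[str, List[str]] = {
    "get": ["fetch", "obtain"],
    "remove": ["delete", "erase"],
    "count": ["number", "total"],
}


def split_identifier(name: str) -> List[str]:
    """Split a camelCase (or snake_case) method name into subtokens."""
    return re.findall(r"[A-Za-z][a-z0-9]*", name)


def join_camel(tokens: List[str]) -> str:
    """Rebuild a camelCase identifier from subtokens."""
    head, *tail = tokens
    return head.lower() + "".join(t[:1].upper() + t[1:] for t in tail)


def candidate_names(name: str) -> List[str]:
    """Enumerate names that stay close to `name` in meaning or appearance."""
    tokens = split_identifier(name)
    candidates = set()
    for i, tok in enumerate(tokens):
        # Semantically similar: replace one subtoken with a listed synonym.
        for syn in SYNONYMS.get(tok.lower(), []):
            candidates.add(join_camel(tokens[:i] + [syn] + tokens[i + 1:]))
        # Visually similar: swap two adjacent inner characters of one subtoken.
        for j in range(1, len(tok) - 2):
            swapped = tok[:j] + tok[j + 1] + tok[j] + tok[j + 2:]
            candidates.add(join_camel(tokens[:i] + [swapped] + tokens[i + 1:]))
    candidates.discard(name)  # the unchanged name is not a perturbation
    return sorted(candidates)


def pick_adversarial(name: str, score: Callable[[str], float]) -> str:
    """Return the candidate with the lowest score; fall back to `name` if none."""
    cands = candidate_names(name)
    return min(cands, key=score) if cands else name


if __name__ == "__main__":
    for cand in candidate_names("getUserCount"):
        print(cand)
```

In an actual experiment, `score` would have to query the model under attack and evaluate its output; here it is left as a caller-supplied function so the sketch runs without any model.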