Pre-trained models have been shown to be effective in many code intelligence tasks. These models are pre-trained on large-scale unlabeled corpora and then fine-tuned on downstream tasks. However, as the inputs to pre-training and downstream tasks take different forms, it is hard to fully exploit the knowledge of pre-trained models. Moreover, the performance of fine-tuning relies heavily on the amount of downstream data, while in practice, scenarios with scarce data are common. Recent studies in the natural language processing (NLP) field show that prompt tuning, a new tuning paradigm, alleviates the above issues and achieves promising results in various NLP tasks. In prompt tuning, the prompts inserted during tuning provide task-specific knowledge, which is especially beneficial for tasks with relatively scarce data. In this paper, we empirically evaluate the usage and effect of prompt tuning in code intelligence tasks. We conduct prompt tuning on the popular pre-trained models CodeBERT and CodeT5 and experiment with three code intelligence tasks: defect prediction, code summarization, and code translation. Our experimental results show that prompt tuning consistently outperforms fine-tuning on all three tasks. In addition, prompt tuning shows great potential in low-resource scenarios, e.g., improving the BLEU scores of fine-tuning by more than 26\% on average for code summarization. Our results suggest that instead of fine-tuning, we could adopt prompt tuning for code intelligence tasks to achieve better performance, especially when task-specific data is scarce.
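To illustrate the paradigm difference the abstract describes, the following is a minimal sketch of hard prompt tuning for defect prediction. The model name, prompt template, and label words are illustrative assumptions for this sketch, not the exact setup evaluated in the paper.

```python
# A minimal sketch of hard prompt tuning for defect prediction with CodeT5.
# The checkpoint, prompt template, and label words below are assumptions
# chosen for illustration, not necessarily those used in the paper.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

code = "int div(int a, int b) { return a / b; }"

# Fine-tuning would feed `code` into a randomly initialized classification
# head, whose form differs from the pre-training objective. Prompt tuning
# instead reformulates the task in the pre-training format: the model fills
# a masked slot with a label word, so no new head has to be trained.
prompt = f"{code} The code is defective: <extra_id_0>"
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In practice, training would compare the model's scores for the label words (e.g., "yes" vs. "no") at the slot position rather than decoding freely; the design point is that the task input now matches the pre-training format, which is what makes prompt tuning attractive when downstream data is scarce.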