Transformers have gained popularity in the software engineering (SE) literature. These deep learning models are usually pre-trained through a self-supervised objective, meant to provide the model with basic knowledge about a language of interest (e.g., Java). A classic pre-training objective is the masked language model (MLM), in which a percentage of tokens from the input (e.g., a Java method) is masked, with the model in charge of predicting them. Once pre-trained, the model is then fine-tuned to support the specific downstream task of interest (e.g., code summarization). While there is evidence suggesting the boost in performance provided by pre-training, little is known about the impact of the specific pre-training objective(s) used. Indeed, MLM is just one of the possible pre-training objectives, and recent work from the natural language processing field suggests that pre-training objectives tailored to the specific downstream task of interest may substantially boost the model's performance. In this study, we focus on the impact of pre-training objectives on the performance of transformers when automating code-related tasks. We start with a systematic literature review aimed at identifying the pre-training objectives used in SE. Then, we pre-train 32 transformers using both (i) generic pre-training objectives usually adopted in SE; and (ii) pre-training objectives tailored to the specific code-related tasks subject of our experimentation, namely bug-fixing, code summarization, and code completion. We also compare the pre-trained models with non-pre-trained ones. Our results show that: (i) pre-training helps in boosting performance only if the amount of fine-tuning data available is small; (ii) the MLM objective is usually sufficient to maximize the prediction performance of the model, even when comparing it with pre-training objectives specialized for the downstream task at hand.
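To make the MLM objective concrete, the following is a minimal sketch of the masking step on a tokenized Java method. All names here are illustrative assumptions, not the paper's implementation: real MLM pipelines (e.g., BERT-style) operate on subword IDs, and typically replace some selected tokens with random tokens or leave them unchanged rather than always masking.

```python
import random

MASK = "<MASK>"

def mlm_mask(tokens, mask_ratio=0.15, rng=None):
    """Mask a fraction of tokens; return the corrupted sequence and
    a map from masked positions to the original tokens (the targets
    the model is trained to predict)."""
    rng = rng or random.Random()
    n = max(1, int(len(tokens) * mask_ratio))  # how many tokens to mask
    positions = sorted(rng.sample(range(len(tokens)), n))
    masked = list(tokens)
    targets = {}
    for i in positions:
        targets[i] = masked[i]  # ground truth the model must recover
        masked[i] = MASK
    return masked, targets

# Example: masking 15% of a tokenized Java method
tokens = "public int add ( int a , int b ) { return a + b ; }".split()
masked, targets = mlm_mask(tokens, mask_ratio=0.15, rng=random.Random(0))
```

During pre-training, the model receives `masked` as input and is optimized to predict each entry of `targets`; fine-tuning then replaces this objective with the downstream task's input/output pairs.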