Large Transformer models have achieved state-of-the-art status for Natural Language Understanding tasks and are increasingly becoming the baseline model architecture for modeling source code. Transformers are usually pre-trained on large unsupervised corpora, learning token representations and transformations relevant to modeling generally available text, and are then fine-tuned on a particular downstream task of interest. While fine-tuning is a tried-and-true method for adapting a model to a new domain (for example, question answering on a given topic), generalization remains an ongoing challenge. In this paper, we explore and evaluate Transformer model fine-tuning for personalization. In the context of generating unit tests for Java methods, we evaluate learning to personalize to a specific software project using several personalization techniques. We consider three key approaches: (i) custom fine-tuning, which allows all the model parameters to be tuned; (ii) lightweight fine-tuning, which freezes most of the model's parameters, allowing tuning of either the token embeddings and softmax layer only or the final layer alone; and (iii) prefix tuning, which keeps the model parameters frozen but optimizes a small project-specific prefix vector. Each of these techniques offers a trade-off between total compute cost and predictive performance, which we evaluate using code- and task-specific metrics, training time, and total computational operations. We compare these fine-tuning strategies for code generation and discuss the potential generalization and cost benefits of each in various deployment scenarios.
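To make the difference between the three approaches concrete, the following minimal sketch lists which parameter groups would be trainable under each strategy and how many parameters that involves. Everything in it is an illustrative assumption rather than the configuration used in the paper: the toy model layout, the group names, the sizes, the strategy labels, and the reading of "final layer" as the last Transformer layer are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamGroup:
    name: str
    size: int  # number of scalar parameters (hypothetical value)


# Hypothetical toy model; the sizes are illustrative, not taken from the paper.
MODEL = [
    ParamGroup("token_embeddings", 50_000 * 64),
    ParamGroup("layer_0", 100_000),
    ParamGroup("layer_1", 100_000),
    ParamGroup("layer_2", 100_000),  # assumed to be the "final layer"
    ParamGroup("softmax", 64 * 50_000),
]

# Prefix tuning adds new project-specific parameters instead of updating
# the pre-trained ones. The prefix shape (20 x 64) is an assumption.
PREFIX = ParamGroup("project_prefix", 20 * 64)


def trainable_groups(strategy: str) -> list[ParamGroup]:
    """Return the parameter groups that receive gradient updates."""
    if strategy == "custom":
        # (i) all model parameters are tuned
        return list(MODEL)
    if strategy == "lightweight_embeddings_softmax":
        # (ii-a) only token embeddings and the softmax layer are tuned
        return [g for g in MODEL if g.name in {"token_embeddings", "softmax"}]
    if strategy == "lightweight_final_layer":
        # (ii-b) only the final layer is tuned
        return [g for g in MODEL if g.name == "layer_2"]
    if strategy == "prefix":
        # (iii) the model stays frozen; only the prefix is optimized
        return [PREFIX]
    raise ValueError(f"unknown strategy: {strategy}")


def trainable_count(strategy: str) -> int:
    """Total number of trainable scalar parameters for a strategy."""
    return sum(g.size for g in trainable_groups(strategy))


if __name__ == "__main__":
    for name in ("custom", "lightweight_embeddings_softmax",
                 "lightweight_final_layer", "prefix"):
        print(name, trainable_count(name))
```

The trainable-parameter count is only a rough proxy for cost: the abstract evaluates training time and total computational operations, which also depend on how far gradients must propagate through the frozen layers.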