Machine Learning for Software Engineering (ML4SE) is an actively growing research area that focuses on methods that help programmers in their work. To be applicable in practice, these methods need to achieve sufficient quality to help rather than distract developers. While the development of new approaches to code representation and data collection improves the overall quality of the models, it does not take into account the information available from the project at hand. In this work, we investigate how a model's quality can be improved by targeting a specific project. We develop a framework to assess the quality improvements that models gain after fine-tuning for the method name prediction task on a particular project. We evaluate three models of different complexity and compare their quality in three settings: trained on a large dataset of Java projects, further fine-tuned on the data from a particular project, and trained from scratch on this data. We show that per-project fine-tuning can greatly improve the models' quality, as it allows them to capture the project's domain and naming conventions. We open-source the tool we used for data collection, as well as the code to run the experiments: https://zenodo.org/record/6040745.
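The per-project setting described above amounts to continuing training of an already pre-trained model on the methods mined from a single project. The sketch below illustrates that step only in outline; the model architecture, the dataset layout, and all hyperparameters here are hypothetical placeholders and not the paper's actual implementation (which is available at the Zenodo link above). It assumes examples are already tokenized and padded to fixed lengths.

```python
# Minimal sketch of per-project fine-tuning for method name prediction.
# Assumptions (not from the paper): `model` maps padded method-body token ids
# to per-position logits over the name sub-token vocabulary, and every example
# is a pair of fixed-length tensors (body_ids, name_ids).
import torch
from torch.utils.data import DataLoader, Dataset


class ProjectMethodsDataset(Dataset):
    """(method body tokens, method name sub-tokens) pairs mined from one project."""

    def __init__(self, examples):
        self.examples = examples  # list of (body_ids, name_ids) tensor pairs

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]


def fine_tune(model, project_examples, epochs=5, lr=1e-4, device="cpu"):
    """Continue training a model pre-trained on a large Java corpus
    on the methods of a single project."""
    model.to(device).train()
    loader = DataLoader(ProjectMethodsDataset(project_examples),
                        batch_size=32, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for body_ids, name_ids in loader:
            body_ids, name_ids = body_ids.to(device), name_ids.to(device)
            logits = model(body_ids)  # (batch, name_len, vocab_size)
            loss = torch.nn.functional.cross_entropy(
                logits.reshape(-1, logits.size(-1)), name_ids.reshape(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

In this formulation, the "trained from scratch" baseline from the abstract corresponds to running the same loop on a randomly initialized model, while the fine-tuning setting starts from weights learned on the large multi-project dataset.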