Code review is a practice widely adopted in open source and industrial projects. Given the non-negligible cost of this process, researchers have started investigating the possibility of automating specific code review tasks. We recently proposed Deep Learning (DL) models targeting the automation of two tasks: the first model takes as input code submitted for review and implements in it the changes likely to be recommended by a reviewer; the second takes as input the submitted code and a reviewer comment written in natural language and automatically implements the change the reviewer requested. While the preliminary results we achieved are encouraging, both models were tested in rather simple code review scenarios that substantially simplify the targeted problem. This simplification was partly due to the choices we made when designing both the technique and the experiments. In this paper, we build on that work by demonstrating that a pre-trained Text-To-Text Transfer Transformer (T5) model can outperform previous DL models for automating code review tasks. We also conducted our experiments on a larger, more realistic, and more challenging dataset of code review activities.