Less Training, More Repairing Please: Revisiting Automated Program Repair via Zero-shot Learning

Due to the promising future of Automated Program Repair (APR), researchers have proposed various APR techniques, including heuristic-based, template-based, and constraint-based techniques. Among these classic APR techniques, template-based techniques have been widely recognized as the state of the art. However, template-based techniques require predefined templates to perform repair, which limits their effectiveness. To this end, researchers have leveraged recent advances in Deep Learning to further improve APR. Such learning-based techniques view APR as a Neural Machine Translation problem, using the buggy and fixed code snippets as the source and target languages for translation. As a result, they rely heavily on large numbers of high-quality bug-fixing commits, which can be extremely costly and challenging to construct. Furthermore, the edit variety of these learning-based techniques is limited to the bug fixes available within their training datasets. Therefore, in this paper, we revisit the learning-based APR problem and propose AlphaRepair, which leverages zero-shot learning directly, using large pre-trained code models for APR. Our main insight is that instead of modeling what a repair edit should look like, we can directly predict what the correct code is based on the context information. We have implemented AlphaRepair as a practical multilingual APR tool based on the recent CodeBERT model. Our results on the widely used Defects4J benchmark show that AlphaRepair can substantially outperform state-of-the-art APR tools. We also studied the impact of different design choices and show that AlphaRepair performs even better on a newer version of Defects4J (2.0), with 3.3x more fixes than the best-performing baseline, indicating that AlphaRepair can potentially avoid the dataset-overfitting issue of existing learning-based techniques.
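The idea of predicting the correct code from its context can be illustrated with a small cloze-style sketch. This is not the authors' implementation: the helper name `build_cloze_inputs`, the parameter `max_masks`, and the toy Java snippet are hypothetical, and the abstract does not describe how masks are actually placed. The sketch only builds the masked inputs; passing them to a masked language model such as CodeBERT and ranking the predictions is left out.

```python
from typing import List

# Assumed mask token: CodeBERT uses a RoBERTa-style tokenizer whose mask is "<mask>".
MASK = "<mask>"


def build_cloze_inputs(lines: List[str], buggy_index: int, max_masks: int = 3) -> List[str]:
    """Replace the suspected buggy line with 1..max_masks mask tokens.

    The surrounding lines are kept untouched so that a masked language
    model can predict the missing code from context alone.
    """
    if not 0 <= buggy_index < len(lines):
        raise IndexError("buggy_index is outside the snippet")

    buggy_line = lines[buggy_index]
    # Preserve the original indentation of the buggy line.
    indent = buggy_line[: len(buggy_line) - len(buggy_line.lstrip())]

    variants = []
    for n in range(1, max_masks + 1):
        masked = list(lines)
        masked[buggy_index] = indent + " ".join([MASK] * n)
        variants.append("\n".join(masked))
    return variants


# Hypothetical buggy snippet: the comparison returns the smaller value.
snippet = [
    "int max(int a, int b) {",
    "    return a < b ? a : b;",
    "}",
]
cloze_inputs = build_cloze_inputs(snippet, buggy_index=1, max_masks=2)
for text in cloze_inputs:
    print(text)
    print("---")
```

Because no bug-fixing commits are needed to build these inputs, the same pre-trained model can be applied to any language it was trained on, which is the zero-shot setting the abstract refers to.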

