Due to the promising future of Automated Program Repair (APR), researchers have proposed various APR techniques, including heuristic-based, template-based, and constraint-based techniques. Among such classic APR techniques, template-based techniques have been widely recognized as the state of the art. However, template-based techniques require predefined templates to perform repair, and their effectiveness is thus limited. To this end, researchers have leveraged recent advances in Deep Learning to further improve APR. Such learning-based techniques view APR as a Neural Machine Translation problem, using buggy/fixed code snippets as the source/target languages for translation. In this way, such techniques heavily rely on large numbers of high-quality bug-fixing commits, which can be extremely costly and challenging to construct. Furthermore, the edit variety of these learning-based techniques is limited to the available bug fixes within their training datasets. Therefore, in this paper, we revisit the learning-based APR problem and propose AlphaRepair, which leverages zero-shot learning directly on large pre-trained code models for APR. Our main insight is that, instead of modeling what a repair edit should look like, we can directly predict what the correct code is based on the context information. We have implemented AlphaRepair as a practical multilingual APR tool based on the recent CodeBERT model. Our results on the widely used Defects4J benchmark show that AlphaRepair can substantially outperform state-of-the-art APR tools. We also study the impact of different design choices and show that AlphaRepair performs even better on the newer Defects4J 2.0, producing 3.3X more fixes than the best-performing baseline, which indicates that AlphaRepair can potentially avoid the dataset-overfitting issue of existing learning-based techniques.
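To make the zero-shot, cloze-style idea concrete, the sketch below replaces a suspicious expression with a `<mask>` token and lets a masked language model predict the correct code from the surrounding context alone, with no bug-fix training data. It is only an illustration under the assumption of the public HuggingFace `microsoft/codebert-base-mlm` checkpoint; AlphaRepair's actual mask-generation strategies and patch re-ranking are more elaborate than this single-token example.

```python
# Minimal sketch of cloze-style repair with a CodeBERT masked-language-model
# head (assumed checkpoint: microsoft/codebert-base-mlm). Illustrative only;
# not AlphaRepair's full mask generation or patch-ranking pipeline.
import torch
from transformers import RobertaTokenizer, RobertaForMaskedLM

tokenizer = RobertaTokenizer.from_pretrained("microsoft/codebert-base-mlm")
model = RobertaForMaskedLM.from_pretrained("microsoft/codebert-base-mlm")
model.eval()

# The suspicious expression is replaced by <mask>, so the model predicts
# the correct code purely from the context, not from learned repair edits.
code = "int mid = (low + high) / <mask>;"
inputs = tokenizer(code, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and rank the model's top candidate tokens,
# which serve as candidate patches for the masked code.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
top_ids = logits[0, mask_pos].topk(5).indices
print([tokenizer.decode([int(i)]).strip() for i in top_ids])
```

In the full tool, many such masked variants of the buggy line are generated, and the model's token likelihoods are used to rank the resulting candidate patches before validating them against the test suite.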