Automated program repair (APR) aims to fix software bugs automatically without human debugging effort and plays a crucial role in software development and maintenance. Despite significant recent progress, APR is still challenged by a long-standing overfitting problem (i.e., a generated patch may be plausible yet overfitting). Various techniques have thus been proposed to address this problem. Among them, leveraging deep learning to predict patch correctness has recently emerged alongside the availability of large-scale patch benchmarks. However, existing learning-based techniques mainly rely on manually designed code features, which can be extremely costly and challenging to construct in practice. In this paper, we propose APPT, a pre-trained-model-based automated patch correctness assessment technique that treats source code as token sequences, without the extra overhead of designing hand-crafted features. In particular, APPT adopts a pre-trained model as the encoder stack, followed by an LSTM stack and a deep learning classifier. Although our idea is general and can be built on various pre-trained models, we implement APPT based on the BERT model. We conduct an extensive experiment on 1,183 Defects4J patches, and the results show that APPT achieves a prediction accuracy of 79.0% and a recall of 81.3%, outperforming the state-of-the-art technique CACHE by 3.6% and 4.8%, respectively. Our additional investigation on 49,694 real-world patches shows that APPT achieves the best performance (exceeding 99% on five common metrics for assessing patch classification techniques) compared with existing representation learning techniques. We also show that adopting code pre-trained models can provide further substantial improvement (e.g., GraphCodeBERT-based APPT improves BERT-based APPT by 3.0% and 2.6% in precision and recall, respectively), highlighting the generalizability of APPT.
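The encoder–LSTM–classifier pipeline described above can be sketched in PyTorch as follows. This is a minimal illustrative sketch, not the authors' implementation: the toy embedding layer stands in for the pre-trained BERT encoder (so the sketch runs without downloading weights), and all layer sizes and names are assumptions.

```python
import torch
import torch.nn as nn

class APPTClassifier(nn.Module):
    """Sketch of an APPT-style pipeline: encoder -> LSTM stack -> classifier.

    APPT uses a pre-trained model (e.g., BERT) as the encoder; here a plain
    embedding layer stands in so the example is self-contained.
    """
    def __init__(self, vocab_size=1000, hidden=64):
        super().__init__()
        self.encoder = nn.Embedding(vocab_size, hidden)  # stand-in for BERT
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True,
                            bidirectional=True)
        self.classifier = nn.Sequential(   # deep learning classifier head
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),          # correct vs. overfitting patch
        )

    def forward(self, token_ids):
        x = self.encoder(token_ids)            # (batch, seq_len, hidden)
        _, (h_n, _) = self.lstm(x)             # final hidden states per direction
        feat = torch.cat([h_n[0], h_n[1]], dim=-1)  # (batch, 2 * hidden)
        return self.classifier(feat)           # (batch, 2) logits

# A patch is tokenized into an ID sequence and classified end to end.
model = APPTClassifier()
token_ids = torch.randint(0, 1000, (4, 16))  # batch of 4 patches, 16 tokens each
logits = model(token_ids)
```

In the actual technique the embedding stand-in would be replaced by a pre-trained encoder (BERT, or a code pre-trained model such as GraphCodeBERT), whose output token representations feed the LSTM stack.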