CIRCLE: Continual Repair across Programming Languages

Automatic Program Repair (APR) aims to fix buggy source code with less manual debugging effort, and it plays a vital role in improving software reliability and development productivity. Recent APR work has made remarkable progress by applying deep learning (DL), particularly neural machine translation (NMT) techniques. However, we observe that existing DL-based APR models suffer from at least two severe drawbacks: (1) Most of them can generate patches for only a single programming language, so repairing multiple languages requires building and training many separate repair models. (2) Most of them are developed in an offline manner, so they cannot adapt when new requirements arrive. To address these problems, we propose a T5-based APR framework equipped with continual learning ability across multiple programming languages, namely ContInual Repair aCross Programming LanguagEs (CIRCLE). Specifically, (1) CIRCLE uses a prompting function to narrow the gap between natural language processing (NLP) pre-training tasks and APR. (2) CIRCLE adopts a difficulty-based rehearsal strategy to achieve lifelong learning for APR without access to the full historical data. (3) An elastic regularization method further strengthens CIRCLE's continual learning ability and protects it against catastrophic forgetting. (4) CIRCLE applies a simple but effective re-repairing method to revise errors in generated patches that are caused by crossing multiple programming languages. We train CIRCLE on four languages (C, Java, JavaScript, and Python) and evaluate it on five commonly used benchmarks. The experimental results demonstrate that CIRCLE not only repairs multiple programming languages effectively and efficiently in continual learning settings, but also achieves state-of-the-art performance with a single repair model.
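As a rough illustration of points (1) and (2), the sketch below shows one plausible reading of a prompting function and a difficulty-based rehearsal memory. It is not the authors' implementation: the prompt template, the `RepairExample` type, the use of a per-example loss as the difficulty score, and the fixed memory `budget` are all assumptions made for this example, since the abstract does not specify them.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairExample:
    language: str  # e.g. "Java"
    buggy: str     # buggy source snippet
    fixed: str     # reference patch
    loss: float    # assumed difficulty score, e.g. the model's loss on this pair


def build_prompt(example: RepairExample) -> str:
    # Hypothetical template: wraps the buggy code in natural-language cues so
    # the input resembles text-to-text pre-training data.
    return f"Buggy {example.language} code: {example.buggy} Fixed code:"


def select_rehearsal(examples, budget):
    # Keep only the `budget` hardest examples (highest loss) from a finished
    # task, so later training does not need the full historical dataset.
    return heapq.nlargest(budget, examples, key=lambda e: e.loss)


def next_task_data(new_examples, memory):
    # Training data for the next language: new examples plus replayed ones.
    return list(new_examples) + list(memory)
```

In this reading, after training on one language the model's per-example losses are used to pick a small memory, which is then mixed into the training set for the next language. Whether the hardest examples, the easiest, or a mixture should be kept is a design choice the abstract leaves open.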

