CURE: Code-Aware Neural Machine Translation for Automatic Program Repair

Automatic program repair (APR) is crucial to improve software reliability. Recently, neural machine translation (NMT) techniques have been used to fix software bugs automatically. While promising, these approaches have two major limitations. Their search space often does not contain the correct fix, and their search strategy ignores software knowledge such as strict code syntax. Due to these limitations, existing NMT-based techniques underperform the best template-based approaches. We propose CURE, a new NMT-based APR technique with three major novelties. First, CURE pre-trains a programming language (PL) model on a large software codebase to learn developer-like source code before the APR task. Second, CURE designs a new code-aware search strategy that finds more correct fixes by focusing on compilable patches and patches that are close in length to the buggy code. Finally, CURE uses a subword tokenization technique to generate a smaller search space that contains more correct fixes. Our evaluation on two widely-used benchmarks shows that CURE correctly fixes 57 Defects4J bugs and 26 QuixBugs bugs, outperforming all existing APR techniques on both benchmarks.
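The code-aware search idea in the abstract (prefer compilable patches and patches close in length to the buggy code) can be illustrated with a small re-ranking sketch. This is not CURE's implementation: the `Candidate` type, the `is_plausibly_valid` bracket check (a cheap stand-in for a real compilability check), the `rerank` function, and the `length_penalty` weight are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A candidate patch: its tokens and a model score (higher is better)."""
    tokens: List[str]
    log_prob: float


def is_plausibly_valid(tokens: List[str]) -> bool:
    """Assumed stand-in for a compilability check: brackets must balance."""
    closers = {")": "(", "]": "[", "}": "{"}
    openers = set(closers.values())
    stack = []
    for tok in tokens:
        if tok in openers:
            stack.append(tok)
        elif tok in closers:
            if not stack or stack.pop() != closers[tok]:
                return False
    return not stack


def rerank(candidates: List[Candidate], buggy_len: int,
           length_penalty: float = 0.1) -> List[Candidate]:
    """Drop invalid candidates, then penalize length distance from the bug."""
    valid = [c for c in candidates if is_plausibly_valid(c.tokens)]

    def score(c: Candidate) -> float:
        return c.log_prob - length_penalty * abs(len(c.tokens) - buggy_len)

    return sorted(valid, key=score, reverse=True)


if __name__ == "__main__":
    buggy = "return a + b ;".split()
    candidates = [
        Candidate("return a - b ;".split(), -1.0),
        Candidate("return ( a - b ;".split(), -0.5),  # unbalanced, filtered out
        Candidate("return Math . abs ( a - b ) + 0 ;".split(), -0.8),
    ]
    for c in rerank(candidates, len(buggy)):
        print(" ".join(c.tokens))
```

In this sketch the highest-scoring raw candidate is discarded because it cannot be valid code, and the much longer candidate falls behind the one whose length matches the buggy line. A real system would apply such checks during beam search rather than only afterwards, and would use an actual parser or compiler.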

