Automated Program Repair (APR) improves software reliability by automatically generating patches for buggy programs. Recent APR techniques use deep learning (DL) to build models that learn to generate patches from existing patches and code corpora. While promising, DL-based APR techniques suffer from the abundance of syntactically or semantically incorrect patches in the patch space. Such patches often violate the syntactic and semantic domain knowledge of source code and thus cannot be correct fixes for a bug. We propose KNOD, a DL-based APR approach that incorporates domain knowledge to guide patch generation in a direct and comprehensive way. KNOD has two major novelties: (1) a three-stage tree decoder, which directly generates Abstract Syntax Trees of patched code according to their inherent tree structure, and (2) domain-rule distillation, which leverages syntactic and semantic rules and teacher-student distributions to explicitly inject domain knowledge into the decoding procedure during both the training and inference phases. We evaluate KNOD on three widely used benchmarks. KNOD fixes 72 bugs on Defects4J v1.2, 25 bugs on QuixBugs, and 50 bugs on the additional Defects4J v2.0 benchmark, outperforming all existing APR tools.
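The idea of injecting rules through teacher-student distributions can be illustrated with a small sketch. This is not KNOD's implementation: the toy grammar, the node vocabulary, and every function name below are hypothetical, and the sketch assumes one simple way to form a teacher distribution, namely zeroing out the probability of rule-violating AST nodes in the student's prediction and renormalizing. The KL term could serve as a training signal, while the same mask restricts the choice at inference time.

```python
import math

# Hypothetical toy vocabulary of AST node types (not KNOD's actual vocabulary).
VOCAB = ["IfStmt", "ReturnStmt", "BinaryExpr", "NameExpr"]

# Hypothetical syntactic rule table: which child node types a parent may have.
ALLOWED_CHILDREN = {
    "BlockStmt": {"IfStmt", "ReturnStmt"},
    "ReturnStmt": {"BinaryExpr", "NameExpr"},
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rule_mask(parent):
    """True for each vocabulary entry the rule table allows under `parent`."""
    return [tok in ALLOWED_CHILDREN[parent] for tok in VOCAB]


def teacher_distribution(student_probs, allowed):
    """Drop rule-violating candidates from the student output and renormalize."""
    masked = [p if ok else 0.0 for p, ok in zip(student_probs, allowed)]
    total = sum(masked)
    if total == 0.0:
        raise ValueError("the rules forbid every candidate")
    return [p / total for p in masked]


def kl_divergence(teacher, student, eps=1e-12):
    """KL(teacher || student); terms with zero teacher mass contribute nothing."""
    return sum(
        t * math.log(t / max(s, eps))
        for t, s in zip(teacher, student)
        if t > 0.0
    )


def constrained_step(logits, parent):
    """One decoding step: returns the rule-respecting node and the distillation loss."""
    student = softmax(logits)
    teacher = teacher_distribution(student, rule_mask(parent))
    loss = kl_divergence(teacher, student)
    best = max(range(len(VOCAB)), key=lambda i: teacher[i])
    return VOCAB[best], loss


if __name__ == "__main__":
    # The unconstrained student prefers BinaryExpr, which is invalid under BlockStmt.
    token, loss = constrained_step([0.1, 0.2, 3.0, 0.5], parent="BlockStmt")
    print(token, loss)
```

In this toy setting the loss equals the negative log of the probability mass the student assigns to rule-respecting nodes, so it shrinks toward zero as the student learns to avoid invalid candidates on its own.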