Contextual information plays a vital role for software developers when understanding and fixing a bug. Context can also be important in deep learning-based program repair, where it provides extra information about the bug and its fix. Existing techniques, however, treat context in an arbitrary manner: they extract code in close proximity to the buggy statement within the enclosing file, class, or method, without any analysis to find actual relations with the bug. To reduce noise, they impose a predefined maximum limit on the number of tokens used as context. We present a program slicing-based approach in which, instead of arbitrarily including code as context, we analyze statements that have a control or data dependency on the buggy statement. We propose a novel concept called dual slicing, which leverages the context of both the buggy and fixed versions of the code to capture relevant repair ingredients. We present our technique and tool, called Katana, the first to apply slicing-based context to a program repair task. The results show that Katana effectively preserves sufficient information for a model to choose contextual information while reducing noise. We compare against four recent state-of-the-art context-aware program repair techniques. Our results show that Katana fixes between 1.5 and 3.7 times more bugs than existing techniques.
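To make the idea of slicing-based context concrete, the following is a minimal sketch, not Katana's actual implementation. It assumes a hand-written dependency map (`buggy_deps`, `fixed_deps`) for a toy six-line program; the function name `backward_slice` and all data are hypothetical, and a real tool would derive the dependencies through program analysis. The sketch collects every statement the buggy statement transitively depends on, once for the buggy version and once for the fixed version.

```python
from collections import deque


def backward_slice(deps, criterion):
    """Return the sorted line numbers that `criterion` transitively depends on.

    `deps` maps a line number to the set of lines it has a control or data
    dependency on. The map is an assumption of this sketch: it is written by
    hand here, whereas a real slicer would compute it from the program.
    """
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        line = queue.popleft()
        for dep in deps.get(line, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return sorted(seen)


# Hypothetical toy program, buggy version:
#   1: a = read()
#   2: b = read()
#   3: d = read()
#   4: if a > 0:
#   5:     c = a + b      # buggy statement
#   6: print(b)
buggy_deps = {4: {1}, 5: {1, 2, 4}, 6: {2}}

# Hypothetical fixed version: line 5 becomes `c = a + d`,
# so it now depends on line 3 instead of line 2.
fixed_deps = {4: {1}, 5: {1, 3, 4}, 6: {2}}

# Slice both versions on the changed statement (line 5).
dual_context = {
    "buggy": backward_slice(buggy_deps, 5),
    "fixed": backward_slice(fixed_deps, 5),
}
```

In this toy case the unrelated statement on line 6 is excluded from both slices, while the fixed-version slice brings in line 3, which a proximity-based window limited to the buggy statement's dependencies would not single out.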