Automated program repair is already deployed in industry, but concerns remain about repair quality. Recent research has shown that one of the main reasons repair tools produce incorrect (but seemingly correct) patches is imperfect fault localization (FL). This paper demonstrates that combining information from natural-language bug reports and test executions when localizing faults can have a significant positive impact on repair quality. For example, existing repair tools with such FL are able to correctly repair 7 defects in the Defects4J benchmark that no prior tools have repaired correctly. We develop Blues, the first information-retrieval-based, statement-level FL technique that requires no training data. We further develop RAFL, the first unsupervised method for combining multiple FL techniques, which outperforms a supervised method. Using RAFL, we create SBIR by combining Blues with a spectrum-based (SBFL) technique. Evaluated on 815 real-world defects, SBIR consistently ranks buggy statements higher than its underlying techniques. We then modify three state-of-the-art repair tools, Arja, SequenceR, and SimFix, to use SBIR, SBFL, and Blues as their internal FL. We evaluate the quality of the produced patches on 689 real-world defects. Arja and SequenceR significantly benefit from SBIR: Arja using SBIR correctly repairs 28 defects, but only 21 using SBFL, and only 15 using Blues; SequenceR using SBIR correctly repairs 12 defects, but only 10 using SBFL, and only 4 using Blues. SimFix (which has internal mechanisms to overcome poor FL) correctly repairs 30 defects using SBIR and SBFL, but only 13 using Blues. Our work is the first investigation of simultaneously using multiple software artifacts for automated program repair, and our promising findings suggest future research in this direction is likely to be fruitful.
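To illustrate what combining FL techniques without supervision can look like, the following minimal sketch merges two ranked lists of suspicious statements using a simple Borda count. This is an assumption made purely for illustration: the abstract does not specify how RAFL aggregates rankings, so Borda count stands in as a generic unsupervised rank-aggregation scheme, and the function name `borda_combine` and the statement identifiers are hypothetical.

```python
from collections import defaultdict


def borda_combine(rankings):
    """Merge ranked statement lists (most suspicious first) by Borda count.

    Illustrative stand-in only; not the RAFL algorithm.
    """
    # Collect every statement that appears in at least one ranking.
    universe = set()
    for ranking in rankings:
        universe.update(ranking)
    n = len(universe)

    # A statement at 0-based position p in a list earns n - p points;
    # statements absent from a list earn nothing from that list.
    scores = defaultdict(int)
    for ranking in rankings:
        for position, stmt in enumerate(ranking):
            scores[stmt] += n - position

    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(universe, key=lambda s: (-scores[s], s))


# Hypothetical outputs of a spectrum-based and an IR-based FL technique.
sbfl_ranking = ["Foo.java:42", "Foo.java:17", "Bar.java:8"]
ir_ranking = ["Foo.java:17", "Baz.java:3", "Foo.java:42"]

combined = borda_combine([sbfl_ranking, ir_ranking])
print(combined)
```

In this toy input, a statement ranked highly by both techniques rises above one that only a single technique flags, which is the intuition behind using bug reports and test executions together. No labeled training data is needed, since the aggregation depends only on the rankings themselves.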