Locating bugs is an important but effort-intensive and time-consuming task when dealing with large-scale systems. To address this, Information Retrieval (IR) techniques are increasingly used to suggest potentially buggy source code locations for a given bug report. While IR techniques scale well, in practice their effectiveness in accurately localizing bugs in a software system remains low. Empirical studies suggest that the effectiveness of bug localization techniques can be improved by configuring the queries used to locate buggy code. However, most IR-based bug localization techniques proposed by researchers do not fully consider the impact of query configuration. Similarly, existing techniques treat all code elements as equally suspicious of being buggy while localizing bugs, which is not always the case. In this paper, we present a new method-level, IR-based bug localization technique called "BoostNSift". BoostNSift exploits the important information in queries by "boosting" that information, and then "sifts" the identified code elements using a novel technique that emphasizes a code element's specific relatedness to a bug report over its generic relatedness to all bug reports. To evaluate the performance of BoostNSift, we employed a state-of-the-art empirical design that has been commonly used for evaluating file-level IR-based bug localization techniques: 6851 bugs were selected from the commonly used Eclipse, AspectJ, SWT, and ZXing benchmarks and made openly available for method-level analyses.
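The abstract gives no formulas for the two steps, so the following is only an illustrative sketch of one possible reading, not the authors' implementation. It assumes that "boosting" means weighting bug-report summary terms more heavily than description terms, and that "sifting" means subtracting a method's mean similarity to all bug reports from its similarity to the target report. All names (`boosted_query`, `sift`, the `boost` parameter) and the toy data are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def boosted_query(summary, description, boost=2.0):
    """Assumed boosting: summary terms count `boost` times per occurrence,
    description terms count once."""
    weights = Counter()
    for term in tokenize(description):
        weights[term] += 1.0
    for term in tokenize(summary):
        weights[term] += boost
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sift(methods, queries, target):
    """Assumed sifting: specific relatedness (similarity to the target
    report) minus generic relatedness (mean similarity to all reports)."""
    scores = []
    for name, body in methods.items():
        vec = Counter(tokenize(body))
        specific = cosine(queries[target], vec)
        generic = sum(cosine(q, vec) for q in queries) / len(queries)
        scores.append((name, specific - generic))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data (invented for illustration only).
methods = {
    "Logger.log": "log message write output",
    "Parser.parseDate": "parse date string format",
}
reports = [
    ("date parsing fails", "parse date string throws on log output"),
    ("crash on save", "log shows error when saving file"),
    ("timeout", "log message delayed output"),
]
queries = [boosted_query(s, d) for s, d in reports]
ranked = sift(methods, queries, target=0)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy setup, `Logger.log` shares terms with every report, so its generic relatedness pulls its score down, while `Parser.parseDate` matches only the first report and ranks first for it.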