Bug localization is a tedious activity in the bug fixing process, in which a software developer tries to locate, in the source code, the bugs described in a bug report. Since this process is time-consuming and requires additional knowledge about the software project, information retrieval techniques can aid it. In this paper, we investigate whether ordinary text search engines can improve existing bug localization approaches. In a case study, we evaluate the performance of our search engine approach Broccoli against seven state-of-the-art bug localization algorithms on 82 open source projects in two data sets. Our results show that including a search engine can increase bug localization performance and that it is a useful extension to existing approaches. As part of our analysis, we also expose a flaw in a commonly used benchmark strategy, namely that only the files of a single release are considered. To increase the number of detectable files, we mitigate this flaw by considering the state of the software repository at the time of the bug report. Our results show that using single releases may lead to an underestimation of the prediction performance.
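To make the idea of retrieval-based bug localization concrete, the following is a minimal sketch that treats a bug report as a text query and ranks source files by TF-IDF cosine similarity. It is a generic illustration and not the Broccoli approach or any of the evaluated algorithms; the function names `tokenize` and `rank_files`, the tokenization rules, the smoothed IDF weighting, and the example files are all assumptions made for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and split on non-alphanumerics."""
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t for t in re.split(r"[^A-Za-z0-9]+", text.lower()) if t]


def rank_files(bug_report, files):
    """Rank files (mapping path -> content) against a bug report.

    Returns a list of (score, path) pairs, best match first.
    Hypothetical helper for illustration only.
    """
    docs = {path: Counter(tokenize(content)) for path, content in files.items()}
    n_docs = len(docs)

    # Document frequency: in how many files does each term occur?
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())

    def weights(counts):
        # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
        return {
            term: tf * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, tf in counts.items()
        }

    def cosine(a, b):
        dot = sum(w * b.get(term, 0.0) for term, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    query = weights(Counter(tokenize(bug_report)))
    scored = [(cosine(query, weights(counts)), path) for path, counts in docs.items()]
    # Sort by descending score; break ties by path for a stable order.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    # Made-up example data, not taken from the study.
    example_files = {
        "src/Parser.java": "class Parser { void parseConfig(String path) "
                           "{ throw new NullPointerException(); } }",
        "src/Renderer.java": "class Renderer { void drawWindow() { } }",
    }
    report = "NullPointerException when parsing the config file"
    for score, path in rank_files(report, example_files):
        print(f"{score:.3f}  {path}")
```

In a benchmark setting, the `files` mapping would hold the candidate source files; the choice between the files of a single release and the repository state at the time of the bug report determines which files can be ranked at all.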