Many Duplicate Bug Report Detection (DBRD) techniques have been proposed in the research literature, and industry uses some other techniques. Unfortunately, there is insufficient comparison among them, and it is unclear how far we have come. This work fills this gap by comparing these techniques. To compare them, we first need a benchmark that can estimate how a tool would perform if applied in a realistic setting today. Thus, we first investigated potential biases that affect the fair comparison of the accuracy of DBRD techniques. Our experiments suggest that the age of the data and the choice of issue tracking system both cause significant differences. Based on these findings, we prepared a new benchmark. We then used it to evaluate DBRD techniques and to better estimate how far we have come. Surprisingly, a simpler technique outperforms recently proposed sophisticated techniques on most projects in our benchmark. In addition, we compared the DBRD techniques proposed in research with those used in Mozilla and VSCode. Here too, we observe that a simple technique already adopted in practice can achieve results comparable to those of a recently proposed research tool. Our study reflects on the current state of DBRD, and we share our insights to benefit future DBRD research.