On the Lack of Consensus Among Technical Debt Detection Tools

A vigorous and growing set of technical debt analysis tools has been developed in recent years -- both research tools and industrial products -- such as Structure 101, SonarQube, and DV8. Each of these tools identifies problematic files using its own definitions and measures. But to what extent do these tools agree with each other on which files are problematic? If the top-ranked files reported by these tools are largely consistent, then we can be confident in using any of them. Otherwise, a problem of accuracy arises. In this paper, we report the results of an empirical study analyzing 10 projects using multiple tools. Our results show that: 1) These tools report very different results even for the most common measures, such as size, complexity, file cycles, and package cycles. 2) These tools also differ dramatically in the set of problematic files they identify, since each implements its own definition of "problematic". After normalizing by size, the most problematic file sets that the tools identify barely overlap. 3) Code-based measures, other than size and complexity, do not even moderately correlate with a file's change-proneness or error-proneness. In contrast, co-change-related measures performed better. Our results suggest that, to identify files with true technical debt -- those that experience excessive changes or bugs -- co-change information must be considered. Code-based measures are largely ineffective at pinpointing true debt. Finally, this study reveals the need for the community to create benchmarks and data sets to assess the accuracy of software analysis tools in terms of commonly used measures.
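As a rough illustration of what "barely overlap" means for top-ranked file sets, the sketch below computes pairwise Jaccard overlap between the top-k files of several tools. It is not the study's actual procedure: the abstract does not name its overlap measure, so Jaccard similarity, the tool names, the file names, and the scores are all assumptions made for this example.

```python
from itertools import combinations


def top_k(scores, k):
    """Return the set of the k highest-scoring files (ties broken by file name)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return {name for name, _ in ranked[:k]}


def jaccard(a, b):
    """Jaccard overlap |a & b| / |a | b|; defined here as 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_overlap(tool_scores, k):
    """Map each pair of tools to the Jaccard overlap of their top-k file sets."""
    tops = {tool: top_k(scores, k) for tool, scores in tool_scores.items()}
    return {
        (t1, t2): jaccard(tops[t1], tops[t2])
        for t1, t2 in combinations(sorted(tops), 2)
    }


# Hypothetical "problem scores" per file; not data from the study.
tool_scores = {
    "tool_a": {"A.java": 9.0, "B.java": 7.5, "C.java": 3.0, "D.java": 1.0},
    "tool_b": {"A.java": 2.0, "B.java": 8.0, "C.java": 9.5, "D.java": 0.5},
    "tool_c": {"A.java": 1.0, "B.java": 2.0, "C.java": 3.0, "D.java": 9.0},
}

overlaps = pairwise_overlap(tool_scores, k=2)
for pair, value in overlaps.items():
    print(pair, round(value, 3))
```

A value near 1.0 would mean two tools flag nearly the same files, while a value near 0.0 corresponds to the kind of disagreement the abstract reports. Size normalization, which the abstract mentions, is left out of this sketch.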


