Software dependency network metrics, extracted from the dependency graph of software modules by applying Social Network Analysis (SNA metrics), have been shown to improve the performance of Software Defect Prediction (SDP) models. However, the relative effectiveness of these SNA metrics over code metrics in improving SDP models has been widely debated, with no clear consensus. Furthermore, some common SDP scenarios, such as predicting the number of defects in a module (Defect-count) in Cross-version and Cross-project SDP contexts, remain unexplored. This lack of clear guidance on the effectiveness of SNA metrics compared to the widely used code metrics prevents us from building potentially better-performing SDP models. Therefore, through a case study of 9 open source software projects across 30 versions, we study the relative effectiveness of SNA metrics compared to code metrics across 3 commonly used SDP contexts (Within-project, Cross-version, and Cross-project) and 3 commonly used SDP scenarios (Defect-count; Defect-classification, i.e., classifying whether a module is defective; and Effort-aware, i.e., ranking the defective modules with respect to the effort involved). We find that SNA metrics, by themselves or together with code metrics, improve the performance of SDP models over using code metrics alone in 5 of the 9 studied combinations of SDP scenario and context (three scenarios across three contexts). However, we note that in some cases the improvement gained by considering SNA metrics instead of or alongside code metrics might be only marginal, whereas in other cases it could be large. Based on these findings, we suggest that future work should consider SNA metrics alongside code metrics in SDP models, and should consider Ego metrics and Global metrics, the two types of SNA metrics, separately when training SDP models, as they behave differently.