Software defect prediction is an essential task during the software development life cycle because it helps managers identify the most defect-prone modules. It can thus reduce testing cost and allow testing resources to be assigned efficiently. Many classification methods can be used to determine whether a software module is defective. Support Vector Machine (SVM) has not been used extensively for such problems because of its instability when applied to different datasets and parameter settings. The main parameter that influences its accuracy is the choice of kernel function, and the use of kernel functions has not been studied thoroughly in previous papers. Therefore, this research examines the performance and accuracy of SVM with six different kernel functions. We empirically validate our hypothesis on various public datasets from the PROMISE project. The results demonstrate that no kernel function gives stable performance across different experimental settings. In addition, using principal component analysis (PCA) as a feature reduction algorithm yields a slight accuracy improvement on some datasets.
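To make the role of the kernel function concrete, the following is a minimal sketch of four commonly used SVM kernels and the Gram matrix each one produces. The abstract does not name the six kernels studied, so the kernels chosen here, their default parameters (`gamma`, `coef0`, `degree`), and the toy `modules` data are all illustrative assumptions, not the authors' experimental setup.

```python
import math

def linear_kernel(x, y):
    # Plain dot product between two feature vectors.
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    # (gamma * <x, y> + coef0) ** degree; parameter values are assumptions.
    return (gamma * linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    # exp(-gamma * ||x - y||^2); equals 1.0 when x == y.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    # tanh(gamma * <x, y> + coef0).
    return math.tanh(gamma * linear_kernel(x, y) + coef0)

# Illustrative selection only; the six kernels of the study are not listed.
KERNELS = {
    "linear": linear_kernel,
    "polynomial": polynomial_kernel,
    "rbf": rbf_kernel,
    "sigmoid": sigmoid_kernel,
}

def gram_matrix(kernel, samples):
    # Pairwise kernel values; this is what an SVM solver actually consumes.
    return [[kernel(a, b) for b in samples] for a in samples]

# Hypothetical module metrics (not taken from any PROMISE dataset).
modules = [[1.0, 2.0], [2.0, 0.0], [0.0, 1.0]]

if __name__ == "__main__":
    for name, kernel in KERNELS.items():
        print(name, gram_matrix(kernel, modules))
```

Because each kernel induces a different similarity structure over the same modules, the resulting decision boundary, and hence the prediction accuracy, can differ from one kernel to another on the same dataset.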