Context: Cross-project defect prediction (CPDP) models are developed to optimize the use of testing resources. Objectives: To propose an ensemble classification framework for CPDP, since many existing models fall short in performance, and to analyse the main objectives of CPDP using the outcomes of the proposed framework. Method: For the classification task, we propose a hybrid-inducer ensemble learning (HIEL) technique based on bootstrap aggregation that uses a probabilistic weighted majority voting (PWMV) strategy. To assess the impact of HIEL on a software project, we propose three project-specific performance measures, namely percent of perfect cleans (PPC), percent of non-perfect cleans (PNPC), and false omission rate (FOR), which are computed from the predictions to estimate the saved cost, the remaining service time, and the percentage of failures in the target project. Results: On many target projects from the PROMISE, NASA, and AEEEM repositories, the proposed model outperformed recent works such as TDS, TCA+, HYDRA, TPTL, and CODEP in terms of F-measure. In terms of AUC, the TCA+ and HYDRA models are strong competitors to the HIEL model. Conclusion: For better predictions, we recommend ensemble learning approaches for CPDP models. To estimate the benefits of CPDP models, we recommend the above project-specific performance measures.
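As a rough illustration of two ideas named in the Method, the sketch below combines defect probabilities from several base learners with a weighted vote and then computes the false omission rate, using its standard definition FN / (FN + TN). This is not the HIEL or PWMV implementation: the abstract does not specify how the weights are obtained, so the function names, the weights, the 0.5 threshold, and the toy data are all assumptions. PPC and PNPC are omitted because their formulas are not given here.

```python
from typing import List, Sequence


def weighted_probability_vote(
    probas: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[int]:
    """Combine per-learner defect probabilities into 0/1 labels.

    probas[i][j] is the probability, from base learner i, that module j
    is defective. weights[i] is a hypothetical trust score for learner i.
    """
    if len(probas) != len(weights):
        raise ValueError("one weight per base learner is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    labels = []
    for j in range(len(probas[0])):
        # Weighted average of the learners' probabilities for module j.
        score = sum(w * p[j] for w, p in zip(weights, probas)) / total
        labels.append(1 if score >= threshold else 0)
    return labels


def false_omission_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FOR = FN / (FN + TN): share of predicted-clean modules that are defective."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fn / (fn + tn) if (fn + tn) else 0.0


# Toy data (invented for illustration): three learners, four modules.
probas = [
    [0.9, 0.2, 0.4, 0.1],
    [0.6, 0.4, 0.7, 0.2],
    [0.8, 0.1, 0.3, 0.6],
]
weights = [0.5, 0.3, 0.2]
y_true = [1, 0, 1, 0]

y_pred = weighted_probability_vote(probas, weights)
print(y_pred, false_omission_rate(y_true, y_pred))
```

In this toy case one of the three modules predicted clean is actually defective, so the FOR is 1/3. A lower FOR means fewer defects slip through in the modules that are skipped during testing.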