Just-in-time defect prediction assigns a defect risk to each new change to a software repository in order to prioritize review and testing efforts. Over the last decades, different approaches have been proposed in the literature to build more accurate prediction models. However, defect prediction is still not widely used in industry because prediction performance varies. In this study, we evaluate existing features on six open-source projects and propose two new feature sets not yet discussed in the literature. By combining all feature sets, we improve the Matthews correlation coefficient (MCC) by 21% on average, yielding the best-performing models when compared to state-of-the-art approaches. We also evaluate effort-awareness and find that, on average, 14% more defects can be identified when inspecting 20% of the changed lines.
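The two evaluation measures named in the abstract, MCC and the share of defects found within a 20% inspection budget, can be illustrated with a minimal sketch. It is not the study's evaluation code. The function names, the tuple layout `(risk, changed_lines, is_defective)`, the ranking by raw predicted risk, and the rule that only whole changes are inspected are all assumptions made for illustration.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def recall_at_effort(changes, budget=0.20):
    """Share of defective changes found within a line-inspection budget.

    `changes` is a list of (risk, changed_lines, is_defective) tuples
    (a hypothetical layout). Changes are inspected in order of descending
    predicted risk until the next change would exceed `budget` of all
    changed lines (assumption: only whole changes are inspected).
    """
    total_lines = sum(lines for _, lines, _ in changes)
    total_defects = sum(1 for _, _, bad in changes if bad)
    if total_lines == 0 or total_defects == 0:
        return 0.0
    limit = budget * total_lines
    inspected = 0
    found = 0
    for _, lines, bad in sorted(changes, key=lambda c: c[0], reverse=True):
        if inspected + lines > limit:
            break
        inspected += lines
        found += bad  # True counts as 1
    return found / total_defects


# Toy data, invented for illustration only.
example = [(0.9, 10, True), (0.8, 50, False), (0.7, 5, True), (0.2, 35, False)]
print(mcc(tp=6, fp=2, tn=10, fn=2))
print(recall_at_effort(example))
```

Other effort-aware setups rank changes by risk per changed line rather than by raw risk. Swapping the sort key is enough to try that variant.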