The increasing complexity of today's software requires the contribution of thousands of developers. This complex collaboration structure makes developers more likely to introduce defect-prone changes that lead to software faults. Determining when these defect-prone changes are introduced has proven challenging, and traditional machine learning (ML) methods for making these determinations seem to have reached a plateau. In this work, we build contribution graphs consisting of developers and source files to capture the nuanced complexity of the changes required to build software. By leveraging these contribution graphs, our research shows the potential of using graph-based ML to improve Just-In-Time (JIT) defect prediction. We hypothesize that features extracted from the contribution graphs may be better predictors of defect-prone changes than intrinsic features derived from software characteristics. We corroborate this hypothesis by using graph-based ML to classify the edges that represent defect-prone changes. This new framing of the JIT defect prediction problem leads to markedly better results. We test our approach on 14 open-source projects and show that our best model can predict whether a code change will lead to a defect with an F1 score as high as 77.55% and a Matthews correlation coefficient (MCC) as high as 53.16%. This represents a 152% higher F1 score and a 3% higher MCC than the state of the art in JIT defect prediction. We describe limitations, open challenges, and how this method can be used for operational JIT defect prediction.
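As a rough illustration of the framing described in the abstract, the following minimal sketch builds a bipartite developer-file contribution graph from a toy change history, derives two simple structural features for an edge, and computes the two reported metrics (F1 and MCC) from binary labels. The toy data, the function names, and the choice of node degrees as edge features are assumptions made here for illustration only; they are not the features or models used in the work itself.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy history: (developer, file, led_to_defect).
CHANGES = [
    ("alice", "core.py", False),
    ("alice", "io.py", True),
    ("bob", "core.py", False),
    ("bob", "io.py", True),
    ("carol", "io.py", False),
]


def build_contribution_graph(changes):
    """Return adjacency sets of a bipartite developer-file graph."""
    dev_to_files = defaultdict(set)
    file_to_devs = defaultdict(set)
    for dev, path, _label in changes:
        dev_to_files[dev].add(path)
        file_to_devs[path].add(dev)
    return dev_to_files, file_to_devs


def edge_features(dev, path, dev_to_files, file_to_devs):
    """Illustrative structural features for one developer-file edge.

    Node degrees are an assumed stand-in for graph-derived features.
    """
    return {
        "dev_degree": len(dev_to_files.get(dev, ())),
        "file_degree": len(file_to_devs.get(path, ())),
    }


def f1_and_mcc(y_true, y_pred):
    """Compute F1 and Matthews correlation coefficient for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return f1, mcc


if __name__ == "__main__":
    d2f, f2d = build_contribution_graph(CHANGES)
    print(edge_features("carol", "io.py", d2f, f2d))
    print(f1_and_mcc([1, 1, 0, 0], [1, 0, 0, 0]))
```

In an edge-classification setting, each change would be an edge whose feature vector is passed to a classifier. MCC is reported alongside F1 because it also accounts for true negatives, which matters when defect-prone changes are a minority class.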