Data visualization is most often used to reduce complexity so that data can be understood properly. A graph is one of the most common representations for relational data: it gives a simplified view of data that is hard to comprehend in textual form. In this study, we propose a methodology that uses the relational properties of source code, represented as a graph, for Just-in-Time (JIT) bug prediction in software systems across the revisions made during software evolution and maintenance. We present a method for converting the source code of commit patches into an equivalent graph representation, which we call the Source Code Graph (SCG). To understand and compare multiple SCGs, we extract several structural properties of these graphs, such as density and the numbers of cycles, nodes, and edges. We then use these attribute values to visualize and detect buggy software commits. We process more than 246K software commits from 12 subject systems. Our investigation of these 12 open-source projects, written in C++ and Java, shows that combining SCG features with the conventional features used in similar studies improves the performance of Machine Learning (ML) based buggy commit detection models. The increase in F1 scores for predicting buggy and non-buggy commits is statistically significant under the Wilcoxon Signed Rank Test. Since SCG-based feature values represent the style or structural properties of source code changes, this result suggests that careful maintenance of source code style and structure is important for keeping a software system bug-free.
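The structural properties named above (density, cycles, nodes, edges) can be illustrated with a minimal sketch. It assumes the graph for a commit is already available as a list of directed edges; how an SCG is actually built from a patch is not described here, so the edge list, the `GraphFeatures` record, and the `graph_features` function are hypothetical names used only for illustration. The cycle count is taken to be the number of independent cycles (circuit rank) of the undirected view, which is one possible definition and may differ from the one used in the study.

```python
from dataclasses import dataclass


@dataclass
class GraphFeatures:
    """Hypothetical container for per-commit graph features."""
    nodes: int
    edges: int
    density: float
    independent_cycles: int


def graph_features(edges):
    """Compute simple structural features from directed (src, dst) edges."""
    # Drop duplicate edges and self-loops so the counts are well defined.
    edge_set = {(u, v) for u, v in edges if u != v}
    nodes = {n for e in edge_set for n in e}
    n, m = len(nodes), len(edge_set)

    # Directed density: edges present divided by edges possible.
    density = m / (n * (n - 1)) if n > 1 else 0.0

    # Count connected components of the undirected view with union-find.
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    undirected = {frozenset(e) for e in edge_set}
    for e in undirected:
        u, v = tuple(e)
        parent[find(u)] = find(v)
    components = len({find(v) for v in nodes})

    # Circuit rank: undirected edges - nodes + components (an assumed definition).
    cycles = len(undirected) - n + components
    return GraphFeatures(n, m, density, cycles)


# Made-up edges for a single commit, purely for illustration.
example = graph_features([
    ("main", "parse"),
    ("parse", "validate"),
    ("validate", "main"),
    ("main", "log"),
])
print(example)
```

For this made-up edge list the sketch reports 4 nodes, 4 edges, a density of 4/12, and one independent cycle. Feature vectors of this kind could then be concatenated with conventional commit-level features before training a classifier.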