Software comprehension can be extremely time-consuming due to the ever-growing size of codebases. Consequently, there is an increasing need to accelerate the code comprehension process to facilitate maintenance and reduce associated costs. A crucial aspect of this process is understanding and preserving the high quality of the code dependency structure. While a variety of code structure models already exist, there is a surprising lack of models that closely represent the source code and focus on software comprehension. As a result, there are no readily available and easy-to-use tools to assist with dependency comprehension, refactoring, and quality monitoring of code. To address this gap, we propose the Semantic Code Graph (SCG), an information model that offers a detailed abstract representation of code dependencies with a close relationship to the source code. To validate the SCG model's usefulness in software comprehension, we compare it to nine other source code representation models. Additionally, we select 11 well-known and widely-used open-source projects developed in Java and Scala and perform a range of software comprehension activities on them using three different code representation models: the proposed SCG, the Call Graph (CG), and the Class Collaboration Network (CCN). We then qualitatively analyze the results to compare the performance of these models in terms of software comprehension capabilities. These activities encompass project structure comprehension, identifying critical project entities, interactive visualization of code dependencies, and uncovering code similarities through software mining. Our findings demonstrate that the SCG enhances software comprehension capabilities compared to the prevailing CCN and CG models. We believe that the work described is a step towards the next generation of tools that streamline code dependency comprehension and management.
翻译:暂无翻译