Software and systems traceability is essential for downstream tasks such as data-driven software analysis and intelligent tool development. However, despite growing interest in mining and understanding technical debt in software systems, tools that support tracking technical debt are rarely available. In this work, we propose the first programming-language-independent tracking tool for self-admitted technical debt (SATD), that is, a sub-optimal solution that developers explicitly annotate in a software system. Our approach takes a Git repository as input and returns a list of SATDs together with their evolution actions (created, deleted, updated) at the commit level. For each SATD, it also returns a line number indicating its latest starting position in the system. The approach first identifies an initial set of raw SATDs, which carry only created and deleted actions, by detecting and tracking SATDs in commit hunks with a state-of-the-art language-independent SATD detection approach. It then computes a context-based matching score between pairs of deleted and created raw SATDs in the same commit to identify SATD update actions. A preliminary study on Apache Tomcat and Apache Ant shows that our tracking tool achieves F1 scores of 92.8% and 96.7%, respectively.
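The update-identification step can be illustrated with a minimal sketch. It is not the tool's implementation: the abstract does not specify how the context-based matching score is computed, so the sketch substitutes a plain string similarity from Python's `difflib`. The `RawSATD` record, its `context` field, the `match_score` and `find_updates` helpers, and the `0.6` threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class RawSATD:
    """Hypothetical record for a raw SATD found in a commit hunk."""
    commit: str   # hash of the commit in which the action was observed
    action: str   # "created" or "deleted"
    file: str     # path of the file containing the SATD comment
    line: int     # starting line of the SATD comment
    comment: str  # text of the SATD comment
    context: str  # surrounding code lines (assumed form of "context")


def match_score(deleted: RawSATD, created: RawSATD) -> float:
    """Stand-in score in [0, 1]: mean of comment and context similarity.

    This is an assumption; the tool's actual scoring is not given in the text.
    """
    comment_sim = SequenceMatcher(None, deleted.comment, created.comment).ratio()
    context_sim = SequenceMatcher(None, deleted.context, created.context).ratio()
    return (comment_sim + context_sim) / 2


def find_updates(raw_satds, threshold=0.6):
    """Greedily pair deleted and created raw SATDs from the same commit.

    Returns (deleted, created, score) triples that would be relabelled as
    "updated". The threshold is an illustrative value, not a reported one.
    """
    deleted = [s for s in raw_satds if s.action == "deleted"]
    created = [s for s in raw_satds if s.action == "created"]

    # Score every deleted/created pair within the same commit, best first.
    candidates = sorted(
        (
            (match_score(d, c), i, j)
            for i, d in enumerate(deleted)
            for j, c in enumerate(created)
            if d.commit == c.commit
        ),
        reverse=True,
    )

    used_deleted, used_created, updates = set(), set(), []
    for score, i, j in candidates:
        if score < threshold:
            break  # remaining candidates score even lower
        if i in used_deleted or j in used_created:
            continue  # each raw SATD takes part in at most one update
        used_deleted.add(i)
        used_created.add(j)
        updates.append((deleted[i], created[j], score))
    return updates
```

Under these assumptions, a deleted comment and a slightly reworded created comment in the same commit, surrounded by the same code, would be paired as one update, while raw SATDs from different commits are never compared and keep their created or deleted action.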