A Decade of Code Comment Quality Assessment: A Systematic Literature Review

Code comments are important artifacts in software systems and play a paramount role in many software engineering (SE) tasks related to maintenance and program comprehension. However, while it is widely accepted that high quality matters in code comments just as it matters in source code, assessing comment quality in practice is still an open problem. First and foremost, there is no unique definition of quality when it comes to evaluating code comments. The few existing studies on this topic instead focus on specific quality attributes that can be easily quantified and measured. Existing techniques and their corresponding tools may also focus on comments bound to a specific programming language, and may only deal with comments that have specific scopes and clear goals (e.g., Javadoc comments at the method level, or in-body comments describing TODOs to be addressed). In this paper, we present a Systematic Literature Review (SLR) of the last decade of research in SE to answer the following research questions: (i) What types of comments do researchers focus on when assessing comment quality? (ii) What quality attributes (QAs) do they consider? (iii) Which tools and techniques do they use to assess comment quality? and (iv) How do they evaluate their studies on comment quality assessment in general? Our evaluation, based on the analysis of 2353 papers and the actual review of 47 relevant ones, shows that (i) most studies and techniques focus on comments in Java code, and thus may not generalize to other languages, and (ii) the analyzed studies focus on four main QAs out of a total of 21 QAs identified in the literature, with a clear predominance of checking consistency between comments and the code. We observe that researchers rely on manual assessment and specific heuristics rather than on automated assessment of comment quality attributes.
