Code comments are the primary means of documenting implementation and facilitating program comprehension. Their quality should therefore be a primary concern for improving program maintenance. While much effort has been dedicated to detecting bad smells in code, such as clones, little work has focused on comments. In this paper we present RepliComment, our solution for detecting clones in comments that developers should fix. RepliComment automatically analyzes Java projects, reports instances of copy-and-paste errors in comments, and points developers to the comments that should be fixed. Moreover, it reports when clones are signs of poorly written comments. Developers should fix these instances too in order to improve the quality of the code documentation. Our evaluation on 10 well-known open source Java projects identified over 11K instances of comment clones, over 1,300 of which are potentially critical. This improves on our own previous work, which could find only 36 issues in the same dataset. Our manual inspection of 412 issues reported by RepliComment reveals that it achieves a precision of 79% in reporting critical comment clones. A manual inspection of 200 additional comment clones that RepliComment filters out as legitimate did not reveal any false negatives.