Larger Is Not Always Better: Leveraging Structured Code Diffs for Comment Inconsistency Detection

Ensuring semantic consistency between source code and its accompanying comments is crucial for program comprehension, effective debugging, and long-term maintainability. Comment inconsistency arises when developers modify code but neglect to update the corresponding comments, which can mislead future maintainers and introduce errors. Recent approaches to code-comment inconsistency (CCI) detection leverage Large Language Models (LLMs) and rely on capturing the semantic relationship between code changes and outdated comments. However, they often ignore the structural complexity of code evolution, including historical change activities, and they raise privacy and resource challenges. In this paper, we propose a Just-In-Time CCI detection approach built upon the CodeT5+ backbone. Our method decomposes code changes into ordered sequences of modification activities, such as replacing, deleting, and adding, to more effectively capture the correlation between these changes and the corresponding outdated comments. Extensive experiments conducted on two publicly available benchmark datasets, JITDATA and CCIBENCH, demonstrate that our proposed approach outperforms recent state-of-the-art models by up to 13.54% in F1-score and achieves an improvement ranging from 4.18% to 10.94% over fine-tuned LLMs, including DeepSeek-Coder, CodeLlama, and Qwen2.5-Coder.
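
To make the idea of decomposing a code change into an ordered sequence of modification activities concrete, here is a minimal sketch. It is not the paper's pipeline: the whitespace tokenization, the use of Python's standard `difflib`, and the helper name `edit_activities` are all assumptions chosen for illustration. The abstract does not specify how the authors compute or encode their edit sequences.

```python
import difflib

# Map difflib's opcode tags onto the activity names used in the abstract.
# (difflib calls an addition "insert"; the renaming is an illustrative choice.)
_ACTIVITY_NAMES = {"replace": "replace", "delete": "delete", "insert": "add"}


def edit_activities(old_code: str, new_code: str):
    """Return ordered (activity, old_tokens, new_tokens) triples for a change.

    Hypothetical helper: tokens are split on whitespace, which is far cruder
    than a real lexer but keeps the sketch self-contained.
    """
    old_tokens = old_code.split()
    new_tokens = new_code.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)

    activities = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged spans carry no modification activity
        activities.append((_ACTIVITY_NAMES[tag], old_tokens[i1:i2], new_tokens[j1:j2]))
    return activities


if __name__ == "__main__":
    old = "int total = a + b ;"
    new = "long total = a + b + c ;"
    for activity, removed, added in edit_activities(old, new):
        print(activity, removed, added)
```

A sequence like this could then be serialized and paired with the existing comment as model input. How that pairing is actually done in the proposed approach is not described in the abstract.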
