Automating code review with Large Language Models (LLMs) shows immense promise, yet practical adoption is hampered by their lack of reliability, context-awareness, and control. To address this, we propose Specification-Grounded Code Review (SGCR), a framework that grounds LLMs in human-authored specifications to produce trustworthy and relevant feedback. SGCR features a novel dual-pathway architecture: an explicit path ensures deterministic compliance with predefined rules derived from these specifications, while an implicit path heuristically discovers and verifies issues beyond those rules. Deployed in a live industrial environment at HiThink Research, SGCR's suggestions achieved a 42% developer adoption rate, a 90.9% relative improvement over a baseline LLM (22%). Our work demonstrates that specification-grounding is a powerful paradigm for bridging the gap between the generative power of LLMs and the rigorous reliability demands of software engineering.
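To make the dual-pathway architecture concrete, the following is a minimal Python sketch of how the two paths could be composed over a code diff. Every name here (SpecRule, Finding, llm_discover_issues, verify_against_specs, review) is a hypothetical stand-in for illustration, not SGCR's published interface.

```python
# Illustrative sketch of a dual-pathway review loop.
# All identifiers are hypothetical; the paper does not publish SGCR's API.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Finding:
    message: str
    source: str  # "explicit" (rule-derived) or "implicit" (LLM-discovered)

@dataclass
class SpecRule:
    """A deterministic check derived from a human-authored specification."""
    name: str
    check: Callable[[str], bool]  # True if the diff violates the rule
    message: str

def llm_discover_issues(diff: str, specs: str) -> List[str]:
    """Placeholder for an LLM call that proposes candidate issues."""
    return []  # wire up a real model client here

def verify_against_specs(candidate: str, specs: str) -> bool:
    """Placeholder verification step grounding a candidate in the specs."""
    return candidate in specs  # stand-in logic for illustration only

def explicit_path(diff: str, rules: List[SpecRule]) -> List[Finding]:
    # Deterministic compliance: every predefined rule is evaluated directly.
    return [Finding(r.message, "explicit") for r in rules if r.check(diff)]

def implicit_path(diff: str, specs: str) -> List[Finding]:
    # Heuristic discovery: the LLM proposes issues beyond the rules, and
    # each candidate must pass verification before it reaches the developer.
    candidates = llm_discover_issues(diff, specs)
    return [Finding(c, "implicit") for c in candidates
            if verify_against_specs(c, specs)]

def review(diff: str, rules: List[SpecRule], specs: str) -> List[Finding]:
    return explicit_path(diff, rules) + implicit_path(diff, specs)

if __name__ == "__main__":
    rules = [SpecRule("no-print", lambda d: "print(" in d,
                      "Use the project logger instead of print().")]
    for f in review('print("debug")', rules, specs="Use the project logger."):
        print(f"[{f.source}] {f.message}")
```

The property this sketch tries to capture is the division of labor the abstract describes: explicit findings are fully deterministic given the spec-derived rules, while implicit findings are generative but must survive a verification step against the specifications before being surfaced.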