Automating code review with Large Language Models (LLMs) shows immense promise, yet practical adoption is hampered by their lack of reliability, context-awareness, and control. To address this, we propose Specification-Grounded Code Review (SGCR), a framework that grounds LLMs in human-authored specifications to produce trustworthy and relevant feedback. SGCR features a novel dual-pathway architecture: an explicit path ensures deterministic compliance with predefined rules derived from these specifications, while an implicit path heuristically discovers and verifies issues beyond those rules. Deployed in a live industrial environment at HiThink Research, SGCR's suggestions achieved a 42% developer adoption rate, a 90.9% relative improvement over a baseline LLM (22%). Our work demonstrates that specification-grounding is a powerful paradigm for bridging the gap between the generative power of LLMs and the rigorous reliability demands of software engineering.
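To make the dual-pathway architecture concrete, the following is a minimal Python sketch of how the two paths could be composed over a code diff. Every name here (SpecRule, Finding, llm_discover_issues, verify_against_specs, review) is a hypothetical stand-in for illustration, not SGCR's published interface.

```python
# Illustrative sketch of a dual-pathway review loop.
# All identifiers are hypothetical; the paper does not publish SGCR's API.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Finding:
    message: str
    source: str  # "explicit" (rule-derived) or "implicit" (LLM-discovered)

@dataclass
class SpecRule:
    """A deterministic check derived from a human-authored specification."""
    name: str
    check: Callable[[str], bool]  # True if the diff violates the rule
    message: str

def llm_discover_issues(diff: str, specs: str) -> List[str]:
    """Placeholder for an LLM call that proposes candidate issues."""
    return []  # wire up a real model client here

def verify_against_specs(candidate: str, specs: str) -> bool:
    """Placeholder verification step grounding a candidate in the specs."""
    return candidate in specs  # stand-in logic for illustration only

def explicit_path(diff: str, rules: List[SpecRule]) -> List[Finding]:
    # Deterministic compliance: every predefined rule is evaluated directly.
    return [Finding(r.message, "explicit") for r in rules if r.check(diff)]

def implicit_path(diff: str, specs: str) -> List[Finding]:
    # Heuristic discovery: the LLM proposes issues beyond the rules, and
    # each candidate must pass verification before it reaches the developer.
    candidates = llm_discover_issues(diff, specs)
    return [Finding(c, "implicit") for c in candidates
            if verify_against_specs(c, specs)]

def review(diff: str, rules: List[SpecRule], specs: str) -> List[Finding]:
    return explicit_path(diff, rules) + implicit_path(diff, specs)

if __name__ == "__main__":
    rules = [SpecRule("no-print", lambda d: "print(" in d,
                      "Use the project logger instead of print().")]
    for f in review('print("debug")', rules, specs="Use the project logger."):
        print(f"[{f.source}] {f.message}")
```

The property this sketch tries to capture is the division of labor the abstract describes: explicit findings are fully deterministic given the spec-derived rules, while implicit findings are generative but must survive a verification step against the specifications before being surfaced.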