Large language models such as Codex have shown the capability to produce code for many programming tasks. However, the success rate of existing models remains low, especially on complex programming tasks. One reason is that language models lack awareness of program semantics, resulting in incorrect programs or even programs that do not compile. In this paper, we systematically study whether automated program repair (APR) techniques can fix the incorrect solutions produced by language models in LeetCode contests. The goal is to study whether APR techniques can enhance the reliability of the code produced by large language models. Our study reveals that: (1) automatically generated code shares common programming mistakes with human-crafted solutions, indicating that APR techniques have the potential to fix auto-generated code; (2) given bug location information provided by a statistical fault localization approach, the newly released Codex edit mode, which supports editing code, performs similarly to or better than the existing Java repair tools TBar and Recoder in fixing incorrect solutions. By analyzing the experimental results produced by these tools, we offer several suggestions: (1) enhancing APR tools to overcome limitations in the patch space (e.g., by introducing more flexible fault localization) is desirable; (2) since large language models can derive more fix patterns by training on more data, future APR tools could shift focus from adding more fix patterns to synthesis- and semantics-based approaches; (3) combining language models with APR to curate patch ingredients is worth studying.