Benchmarking · Code · Benchmarks · Software reliability · Language models

April 16, 2023

Impact of Code Language Models on Automated Program Repair

Nan Jiang, Kevin Liu, Thibaud Lutellier, Lin Tan

From arXiv. This paper has been accepted by the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE).

Automated program repair (APR) aims to help developers improve software reliability by generating patches for buggy programs. Although many code language models (CLMs) have been developed and are effective in many software tasks such as code completion, there has been little comprehensive, in-depth work to evaluate CLMs' fixing capabilities and to fine-tune CLMs for the APR task. Firstly, this work is the first to evaluate ten CLMs on four APR benchmarks, which shows that, surprisingly, the best CLM, as is, fixes 72% more bugs than the state-of-the-art deep-learning (DL)-based APR techniques. Secondly, one of the four APR benchmarks was created by us in this paper to avoid data leakage and ensure a fair evaluation. Thirdly, it is the first work to fine-tune CLMs with APR training data, which shows that fine-tuning brings a 31%-1,267% improvement to CLMs and enables them to fix 46%-164% more bugs than existing DL-based APR techniques. Fourthly, this work studies the impact of buggy lines, showing that CLMs, as is, cannot make good use of the buggy lines to fix bugs, yet fine-tuned CLMs could potentially over-rely on buggy lines. Lastly, this work analyzes the size, time, and memory efficiency of different CLMs. This work shows promising directions for the APR domain, such as fine-tuning CLMs with APR-specific designs. It also raises awareness of fair and comprehensive evaluations of CLMs and calls for more transparent reporting of the open-source repositories used in pre-training data to address the data leakage problem.
