Automated program repair is the task of automatically fixing software bugs. A promising direction in this field is self-supervised learning, a paradigm in which repair models are trained without commits representing bug/fix pairs. In self-supervised neural program repair, those bug/fix pairs must instead be generated automatically; the core problem is to generate interesting and diverse pairs that maximize the effectiveness of training. As a contribution to this problem, we propose to use back-translation, a technique originating in neural machine translation. We devise and implement MUFIN, a back-translation training technique for program repair, with specifically designed code critics that select high-quality training samples. Our results show that MUFIN's back-translation loop generates valuable training samples in a fully automated, self-supervised manner, producing more than half a million bug/fix pairs. The code critic design is key because of a fundamental trade-off between how restrictive a critic is and how many samples remain available for optimization during back-translation.
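The back-translation loop with a critic filter can be sketched in miniature as follows. This is a hedged illustration only: the `breaker`, `critic`, and toy corpus below are hypothetical stand-ins, not MUFIN's actual models or data.

```python
# Minimal sketch of a back-translation loop for program repair.
# Assumption: a "breaker" turns correct programs into buggy variants,
# a "critic" filters out low-quality samples, and the accepted
# (buggy, fixed) pairs are handed to a fixer model for training.

def back_translation_loop(corpus, breaker, train_fixer, critic, rounds=1):
    """Generate (buggy, fixed) training pairs from unlabeled programs.

    corpus:      iterable of presumed-correct programs.
    breaker:     maps a correct program to a candidate buggy variant.
    train_fixer: consumes the accepted (buggy, fixed) pairs.
    critic:      returns True if a generated pair is worth keeping.
    """
    pairs = []
    for _ in range(rounds):
        for program in corpus:
            buggy = breaker(program)
            # A more restrictive critic yields higher-quality pairs
            # but fewer of them -- the trade-off noted in the abstract.
            if critic(buggy, program):
                pairs.append((buggy, program))
    train_fixer(pairs)
    return pairs

# Toy usage: the breaker flips a comparison operator, and the
# critic rejects no-op "mutations" that leave the program unchanged.
corpus = ["if a < b: return a", "return x + y"]
breaker = lambda p: p.replace("<", ">=")
critic = lambda buggy, fixed: buggy != fixed
collected = []
back_translation_loop(corpus, breaker, collected.extend, critic)
print(len(collected))  # only the program that was actually mutated is kept
```

In a real system the critic would check stronger properties, for example that the buggy variant still compiles, which is why its strictness directly controls how many training samples survive each round.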