The advent of pre-trained code language models (CodeLMs) has led to significant progress in language-to-code generation. State-of-the-art approaches in this area combine CodeLM decoding with sample pruning and reranking using test cases or heuristics based on the execution results. However, it is challenging to obtain test cases for many real-world language-to-code applications, and heuristics cannot adequately capture the semantic features of the execution results, such as data type and value range, which often indicate the correctness of the program. In this work, we propose LEVER, a simple approach to improve language-to-code generation by learning to verify the generated programs with their execution results. Specifically, we train verifiers to determine whether a program sampled from the CodeLM is correct, based on the natural language input, the program itself, and its execution results. The sampled programs are reranked by combining the verification score with the CodeLM generation probability, and marginalizing over programs with the same execution results. On four datasets across the domains of table QA, math QA and basic Python programming, LEVER consistently improves over the base CodeLMs (4.6% to 10.9% with code-davinci-002) and achieves new state-of-the-art results on all of them.
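To make the reranking step concrete, below is a minimal Python sketch of the procedure described above: each candidate program's generation probability is multiplied by the verifier's probability of correctness, the joint scores are marginalized over candidates that produce the same execution result, and the top program from the best-scoring group is returned. The dictionary fields (logprob, exec_result, p_verifier) and the lever_rerank helper are illustrative assumptions, not identifiers from the paper's released code.

    from collections import defaultdict
    from math import exp

    def lever_rerank(samples):
        """Rerank sampled programs by combining generation probability with the
        verifier score, marginalizing over programs with the same execution result."""
        # Joint score of a single program: generation probability x verifier probability.
        def joint_score(s):
            return exp(s["logprob"]) * s["p_verifier"]

        # Marginalize: sum joint scores over programs whose execution results agree.
        score_by_result = defaultdict(float)
        for s in samples:
            score_by_result[s["exec_result"]] += joint_score(s)

        # Pick the highest-scoring program from the best execution-result group.
        best_result = max(score_by_result, key=score_by_result.get)
        candidates = [s for s in samples if s["exec_result"] == best_result]
        return max(candidates, key=joint_score)

    # Toy usage: two candidates that execute to the same value reinforce each other.
    samples = [
        {"program": "df['col'].sum()", "logprob": -1.2, "exec_result": "42", "p_verifier": 0.9},
        {"program": "sum(df['col'])",  "logprob": -1.5, "exec_result": "42", "p_verifier": 0.8},
        {"program": "df['col'].max()", "logprob": -0.9, "exec_result": "7",  "p_verifier": 0.3},
    ]
    print(lever_rerank(samples)["program"])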