Program synthesis strives to generate a computer program as a solution to a given problem specification, expressed with input-output examples or natural language descriptions. The prevalence of large language models advances the state-of-the-art for program synthesis, though limited training resources and data impede open access to such models. To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER. We show the utility of the trained model by demonstrating that it is competitive with the previous state-of-the-art on zero-shot Python code generation on HumanEval. We further investigate the multi-step paradigm for program synthesis, in which a single program is factorized into multiple prompts specifying subproblems. To this end, we construct an open benchmark, Multi-Turn Programming Benchmark (MTPB), consisting of 115 diverse problem sets that are factorized into multi-turn prompts. Our analysis on MTPB shows that the same intent provided to CODEGEN in a multi-turn fashion significantly improves program synthesis over that provided as a single turn. We make the training library JAXFORMER and model checkpoints available as an open source contribution: https://github.com/salesforce/CodeGen.
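As a concrete illustration of the multi-turn paradigm described above, the sketch below feeds a sequence of subproblem prompts to a released CODEGEN checkpoint and concatenates each sampled completion into the growing program. It is a minimal sketch, assuming the checkpoints are loadable through the HuggingFace `transformers` API; the model id, prompts, and sampling settings are illustrative assumptions rather than the exact MTPB evaluation setup.

```python
# Minimal multi-turn sketch (illustrative, not the paper's exact MTPB harness):
# each turn appends a natural-language prompt as a comment, samples a completion,
# and folds it back into the accumulated program context.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Salesforce/codegen-350M-mono"  # assumed checkpoint id for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

turns = [
    "# Step 1: define a function that returns the squares of a list of numbers",
    "# Step 2: keep only the even squares",
    "# Step 3: print the sum of the remaining values",
]

program = ""
for prompt in turns:
    context = program + prompt + "\n"
    inputs = tokenizer(context, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.2,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the tokens generated for this turn, then extend the program.
    completion = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    program = context + completion + "\n"

print(program)
```

The design point this sketch captures is that each turn conditions on both the prior subproblem prompts and the code synthesized so far, which is how a single intent gets distributed across multiple turns.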