We present PanGu-Coder, a pre-trained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e. the synthesis of programming language solutions given a natural language problem description. We train PanGu-Coder using a two-stage strategy: the first stage employs Causal Language Modelling (CLM) to pre-train on raw programming language data, while the second stage uses a combination of Causal Language Modelling and Masked Language Modelling (MLM) training objectives that focus on the downstream task of text-to-code generation, training on loosely curated pairs of natural language program definitions and code functions. Finally, we discuss PanGu-Coder-FT, which is fine-tuned on a combination of competitive programming problems and code with continuous integration tests. We evaluate PanGu-Coder with a focus on whether it generates functionally correct programs, and demonstrate that it achieves equivalent or better performance than similarly sized models, such as Codex, while attending over a smaller context window and training on less data.
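To make the first-stage objective concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of a single Causal Language Modelling step on raw code, using a Hugging Face GPT-2 model as a stand-in for the PanGu-Alpha decoder-only architecture; the code sample is hypothetical. In the second stage, the same kind of next-token loss would be applied to natural language and code pairs rather than raw code, as described above.

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Placeholder tokenizer and model standing in for PanGu-Alpha.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Raw programming-language data: one hypothetical code snippet for illustration.
sample = 'def add(a, b):\n    """Return the sum of a and b."""\n    return a + b\n'
inputs = tokenizer(sample, return_tensors="pt")

# CLM: every token is predicted from its left-hand context only.
# Passing the input ids as labels makes the model compute the shifted
# next-token cross-entropy loss internally.
outputs = model(**inputs, labels=inputs["input_ids"])
outputs.loss.backward()  # gradient for one pre-training step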