Struct Coder: 代码生成的结构- 软件变换器 (StructCoder: Structure-Aware Transformer for Code Generation) - 专知论文

会员服务 ·

0

代码 · Learning · 变换 · state-of-the-art · 解码 ·

2022 年 6 月 10 日

StructCoder: Structure-Aware Transformer for Code Generation

翻译：Struct Coder: 代码生成的结构- 软件变换器

Sindhu Tipirneni,Ming Zhu,Chandan K. Reddy

There has been a recent surge of interest in automating software engineering tasks using deep learning. This work addresses the problem of code generation where the goal is to generate target code given source code in a different language or a natural language description. Most of the state-of-the-art deep learning models for code generation use training strategies that are primarily designed for natural language. However, understanding and generating code requires a more rigorous comprehension of the code syntax and semantics. With this motivation, we develop an encoder-decoder Transformer model where both the encoder and decoder are trained to recognize the syntax and data flow in the source and target codes, respectively. We not only make the encoder structure-aware by leveraging the source code's syntax tree and data flow graph, but we also ensure that our decoder preserves the syntax and data flow of the target code by introducing two auxiliary tasks: AST (Abstract Syntax Tree) paths prediction and data flow prediction. To the best of our knowledge, this is the first work to introduce a structure-aware Transformer decoder to enhance the quality of generated code by modeling target syntax and data flow. The proposed StructCoder model achieves state-of-the-art performance on code translation and text-to-code generation tasks in the CodeXGLUE benchmark.

翻译：最近对利用深层学习实现软件工程任务自动化的兴趣激增。这项工作解决了代码生成问题, 目标是以不同语言或自然语言描述生成目标代码源代码。用于代码生成的大多数最先进的深层次学习模式都使用主要为自然语言设计的训练战略。但是, 理解和生成代码需要更严格地理解代码语法和语义。有了这一动机, 我们开发了一个编码- 解码变异器模型, 使编码和解码器分别接受培训, 以识别源代码和目标代码中的语法和数据流。我们不仅通过利用源代码的语法树和数据流图来使编码结构认知, 我们还通过引入两个辅助任务, 即 AST (缩略语法树) 路径预测和数据流预测。对我们的知识而言, 这是在源代码和目标代码中引入结构- 系统变异器解码解码的首项工作, 将生成的代码化为模型和数据流化, 实现生成的代码质量。

0

相关内容

代码（Code）是专知网的一个重要知识资料文档板块，旨在整理收录论文源代码、复现代码，经典工程代码等，便于用户查阅下载使用。

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】开放数据结构，Open Data Structures，337页pdf

【干货书】开放数据结构，Open Data Structures，337页pdf

专知会员服务

17+阅读 · 2021年9月17日

【2020新书】自然语言处理Python与spaCy实践，216页pdf，NLP with Python

【2020新书】自然语言处理Python与spaCy实践，216页pdf，NLP with Python

专知会员服务

108+阅读 · 2020年5月1日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

隐重子图条件下图的圈

国家自然科学基金

0+阅读 · 2015年12月31日

压缩感知与稀疏信号恢复

国家自然科学基金

2+阅读 · 2014年12月31日

解析函数的分形边界性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

不确定性可视分析关键技术研究

国家自然科学基金

8+阅读 · 2013年12月31日

软物质动力学系统阻尼效应的保结构分析方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于图的不变量与子图结构的谱极值问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

太阳能转换材料的第一性原理电子能带结构理论

国家自然科学基金

0+阅读 · 2011年12月31日

低维纳米结构弹性性能的尺寸效应研究

国家自然科学基金

0+阅读 · 2008年12月31日

Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning

Arxiv

0+阅读 · 2022年7月26日

Self-Distilled Vision Transformer for Domain Generalization

Arxiv

0+阅读 · 2022年7月25日

Scaling Structured Inference with Randomization

Arxiv

0+阅读 · 2022年7月24日

Lyra: A Benchmark for Turducken-Style Code Generation

Arxiv

0+阅读 · 2022年7月24日

Code Structure Guided Transformer for Source Code Summarization

Arxiv

0+阅读 · 2022年7月22日

A Transferable Intersection Reconstruction Network for Traffic Speed Prediction

Arxiv

0+阅读 · 2022年7月22日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】开放数据结构，Open Data Structures，337页pdf

【干货书】开放数据结构，Open Data Structures，337页pdf

专知会员服务

17+阅读 · 2021年9月17日

【2020新书】自然语言处理Python与spaCy实践，216页pdf，NLP with Python

【2020新书】自然语言处理Python与spaCy实践，216页pdf，NLP with Python

专知会员服务

108+阅读 · 2020年5月1日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《战区安全决策课程体系》最新244页

《"无人机航母"原型平台》

任务规划与地形分析：现代复杂环境作战导航体系

《攻击场景描述形式化模型研究》

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

相关论文

Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning

Arxiv

0+阅读 · 2022年7月26日

Self-Distilled Vision Transformer for Domain Generalization

Arxiv

0+阅读 · 2022年7月25日

Scaling Structured Inference with Randomization

Arxiv

0+阅读 · 2022年7月24日

Lyra: A Benchmark for Turducken-Style Code Generation

Arxiv

0+阅读 · 2022年7月24日

Code Structure Guided Transformer for Source Code Summarization

Arxiv

0+阅读 · 2022年7月22日

A Transferable Intersection Reconstruction Network for Traffic Speed Prediction

Arxiv

0+阅读 · 2022年7月22日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Learning Discrete Structures for Graph Neural Networks

Arxiv

17+阅读 · 2019年3月28日

相关基金

隐重子图条件下图的圈

国家自然科学基金

0+阅读 · 2015年12月31日

压缩感知与稀疏信号恢复

国家自然科学基金

2+阅读 · 2014年12月31日

解析函数的分形边界性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

不确定性可视分析关键技术研究

国家自然科学基金

8+阅读 · 2013年12月31日

软物质动力学系统阻尼效应的保结构分析方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于图的不变量与子图结构的谱极值问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

太阳能转换材料的第一性原理电子能带结构理论

国家自然科学基金

0+阅读 · 2011年12月31日

低维纳米结构弹性性能的尺寸效应研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员