Code Generation for Unknown Libraries via Reading API Documentations


February 16, 2022


Koki Washio, Yusuke Miyao

Open-domain code generation is a challenging problem because the set of functions and classes that we use are frequently changed and extended in programming communities. We consider the challenge of code generation for unknown libraries without additional training. In this paper, we explore a framework of code generation that can refer to relevant API documentations like human programmers to handle unknown libraries. As a first step of this direction, we implement a model that can extract relevant code signatures from API documentations based on a natural language intent and copy primitives from the extracted signatures. Moreover, to evaluate code generation for unknown libraries and our framework, we extend an existing dataset of open-domain code generation and resplit it so that the evaluation data consist of only examples using the libraries that do not appear in the training data. Experiments on our new split show that baseline encoder-decoder models cannot generate code using primitives of unknown libraries as expected. In contrast, our model outperforms the baseline on the new split and can properly generate unknown primitives when extracted code signatures are noiseless.
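The extract-then-copy idea in the abstract can be illustrated with a small sketch. This is not the authors' model, which is a learned encoder-decoder; it swaps in plain token overlap for the extraction step and a regular expression for finding copyable primitives. The library `toylib`, its signatures, and every function name below are hypothetical and exist only for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it on any non-alphanumeric character."""
    return set(t for t in re.split(r"[^a-z0-9]+", text.lower()) if t)


def rank_signatures(intent, docs, top_k=2):
    """Rank (signature, description) pairs by token overlap with the intent.

    Assumption: token overlap stands in for the learned extractor
    described in the abstract.
    """
    intent_tokens = tokenize(intent)
    scored = []
    for signature, description in docs:
        overlap = len(intent_tokens & tokenize(signature + " " + description))
        scored.append((overlap, signature))
    # Highest overlap first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [sig for score, sig in scored[:top_k] if score > 0]


def copyable_primitives(signature):
    """Return the identifiers in a signature that a decoder could copy."""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", signature)


# Hypothetical API documentation for a library unseen during training.
DOCS = [
    ("toylib.read_table(path, sep)", "Read a delimited text file into a table."),
    ("toylib.plot_hist(values, bins)", "Draw a histogram of numeric values."),
    ("toylib.save_table(table, path)", "Write a table to a delimited text file."),
]

top = rank_signatures("draw a histogram of the values with 10 bins", DOCS, top_k=1)
primitives = copyable_primitives(top[0])
print(top)
print(primitives)
```

In this toy setting the names `toylib` and `plot_hist` never need to appear in any training data: they reach the output only because they are copied from the retrieved signature, which is the property the abstract reports when the extracted signatures are noiseless.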

