检索增强代码生成和汇总 (Retrieval Augmented Code Generation and Summarization) - 专知论文

会员服务 ·

0

Extensibility · 单峰值 · 查全率/召回率 · state-of-the-art · 示例 ·

2021 年 9 月 10 日

Retrieval Augmented Code Generation and Summarization

翻译：检索增强代码生成和汇总

Md Rizwan Parvez,Wasi Uddin Ahmad,Saikat Chakraborty,Baishakhi Ray,Kai-Wei Chang

from arxiv, accepted in EMNLP-Findings 2021

Software developers write a lot of source code and documentation during software development. Intrinsically, developers often recall parts of source code or code summaries that they had written in the past while implementing software or documenting them. To mimic developers' code or summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or summaries from a retrieval database and provides them as a supplement to code generation or summarization models. REDCODER has a couple of uniqueness. First, it extends the state-of-the-art dense retrieval technique to search for relevant code or summaries. Second, it can work with retrieval databases that include unimodal (only code or natural language description) or bimodal instances (code-description pairs). We conduct experiments and extensive analysis on two benchmark datasets of code generation and summarization in Java and Python, and the promising results endorse the effectiveness of our proposed retrieval augmented framework.

翻译：软件开发商在软件开发过程中写了很多源代码和文件。本质上, 开发商经常回忆他们过去在使用软件或记录软件时所写的源代码或代码摘要的部分内容。为了模仿开发商的代码或简易生成行为, 我们提议了一个检索强化框架, REDCODER, 从检索数据库中检索相关的代码或摘要, 并将其作为代码生成或概要化模型的补充。 REDCODER 有一些独特性。首先, 它扩大了最先进的密集检索技术, 以搜索相关的代码或摘要。其次, 它可以与检索数据库合作, 其中包括单式( 仅代码或自然语言描述) 或双式实例( 代码描述对配对 ) 。我们实验和广泛分析在爪哇和 Python 的代码生成和合成两个基准数据集, 并且有希望的结果认可了我们提议的检索增强框架的有效性。

1

相关内容

Extensibility

iOS 8 提供的应用间和应用跟系统的功能交互特性。

Today (iOS and OS X): widgets for the Today view of Notification Center
Share (iOS and OS X): post content to web services or share content with others
Actions (iOS and OS X): app extensions to view or manipulate inside another app
Photo Editing (iOS): edit a photo or video in Apple's Photos app with extensions from a third-party apps
Finder Sync (OS X): remote file storage in the Finder with support for Finder content annotation
Storage Provider (iOS): an interface between files inside an app and other apps on a user's device
Custom Keyboard (iOS): system-wide alternative keyboards

Source: iOS 8 Extensions: Apple’s Plan for a Powerful App Ecosystem

【微软亚洲研究院】CodeBERT:用于编程和自然语言的预训练模型，CodeBERT: A Pre-Trained Model for Programming and Natural Languages

【微软亚洲研究院】CodeBERT:用于编程和自然语言的预训练模型，CodeBERT: A Pre-Trained Model for Programming and Natural Languages

专知会员服务

32+阅读 · 2020年2月21日

【2020新书】Python大数据处理，Mastering Large Datasets with Python

【2020新书】Python大数据处理，Mastering Large Datasets with Python

专知会员服务

54+阅读 · 2020年2月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

197+阅读 · 2019年10月10日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

计算机 | IUI 2020等国际会议信息4条

计算机 | IUI 2020等国际会议信息4条

Call4Papers

6+阅读 · 2019年6月17日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

人工智能 | 国际会议/SCI期刊约稿信息9条

人工智能 | 国际会议/SCI期刊约稿信息9条

Call4Papers

3+阅读 · 2018年1月12日

【推荐】MXNet深度情感分析实战

【推荐】MXNet深度情感分析实战

机器学习研究会

16+阅读 · 2017年10月4日

Open-book Video Captioning with Retrieve-Copy-Generate Network

Arxiv

7+阅读 · 2021年3月9日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Web Table Extraction, Retrieval and Augmentation: A Survey

Arxiv

7+阅读 · 2020年2月5日

Text-to-Image Synthesis Based on Machine Generated Captions

Text-to-Image Synthesis Based on Machine Generated Captions

Arxiv

3+阅读 · 2019年10月9日

FIGR: Few-shot Image Generation with Reptile

FIGR: Few-shot Image Generation with Reptile

Arxiv

5+阅读 · 2019年1月8日

VIP会员

文章信息

相关主题

查全率/召回率

state-of-the-art

相关VIP内容

【微软亚洲研究院】CodeBERT:用于编程和自然语言的预训练模型，CodeBERT: A Pre-Trained Model for Programming and Natural Languages

【微软亚洲研究院】CodeBERT:用于编程和自然语言的预训练模型，CodeBERT: A Pre-Trained Model for Programming and Natural Languages

专知会员服务

32+阅读 · 2020年2月21日

【2020新书】Python大数据处理，Mastering Large Datasets with Python

【2020新书】Python大数据处理，Mastering Large Datasets with Python

专知会员服务

54+阅读 · 2020年2月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

197+阅读 · 2019年10月10日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《北约联合仿真与集成、验证与鉴定服务标准》2025最新40页

《面向协同任务的无人地面车辆与无人机（UGV-UAV）集成研究综述》2025最新综述论文

《理解大语言模型在军事战术任务规划中的局限性》

《国防与安全会议论文集》最新80页

相关资讯

计算机 | IUI 2020等国际会议信息4条

计算机 | IUI 2020等国际会议信息4条

Call4Papers

6+阅读 · 2019年6月17日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

人工智能 | 国际会议/SCI期刊约稿信息9条

人工智能 | 国际会议/SCI期刊约稿信息9条

Call4Papers

3+阅读 · 2018年1月12日

【推荐】MXNet深度情感分析实战

【推荐】MXNet深度情感分析实战

机器学习研究会

16+阅读 · 2017年10月4日

相关论文

Open-book Video Captioning with Retrieve-Copy-Generate Network

Arxiv

7+阅读 · 2021年3月9日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Web Table Extraction, Retrieval and Augmentation: A Survey

Arxiv

7+阅读 · 2020年2月5日

Text-to-Image Synthesis Based on Machine Generated Captions

Text-to-Image Synthesis Based on Machine Generated Captions

Arxiv

3+阅读 · 2019年10月9日

FIGR: Few-shot Image Generation with Reptile

FIGR: Few-shot Image Generation with Reptile

Arxiv

5+阅读 · 2019年1月8日

微信扫码咨询专知VIP会员