Code generation from text requires understanding the user's intent from a natural language description (NLD) and generating an executable code snippet that satisfies this intent. While recent pretrained language models (PLMs) achieve remarkable performance on this task, they fail when the given NLD is ambiguous, i.e., when it lacks sufficient specification to generate a high-quality code snippet. In this work, we introduce a novel and more realistic setup for this task. We hypothesize that ambiguities in the specification of an NLD can be resolved by asking clarification questions (CQs). Therefore, we collect and introduce a new dataset named CodeClarQA, containing NLD-Code pairs with created clarification question-answer pairs (CQAs). We evaluate the performance of PLMs for code generation on our dataset. The empirical results support our hypothesis that clarifications lead to more precise generated code, as shown by improvements of 17.52 in BLEU, 12.72 in CodeBLEU, and 7.7\% in exact match. Alongside this, our task and dataset pose new challenges to the community, including deciding when and which CQs should be asked.
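To make the setup concrete, the sketch below gives a hypothetical example of an ambiguous NLD, a clarification question-answer pair that resolves the missing specification, and how a generated snippet could be scored against a reference with exact match and a token-level BLEU proxy. All strings and the evaluation choices here are invented for illustration and are not drawn from CodeClarQA or the paper's evaluation pipeline.

\begin{verbatim}
# Hypothetical illustration of the CQA-augmented code generation setup.
# The NLD, CQA, and code strings are made up for exposition only.
from nltk.translate.bleu_score import sentence_bleu

# An ambiguous NLD: it does not say which column to sort by, or in which order.
nld = "Sort the dataframe."

# A clarification question-answer pair supplying the missing specification.
cqa = "Q: Which column and order should be used? A: by 'price', descending."

# The model input is the NLD concatenated with the CQA.
model_input = f"{nld} {cqa}"

# Reference code and a (hypothetical) model prediction.
reference = "df = df.sort_values(by='price', ascending=False)"
prediction = "df = df.sort_values(by='price', ascending=False)"

# Exact match: string equality between prediction and reference.
exact_match = prediction.strip() == reference.strip()

# Token-level BLEU as a rough stand-in for the BLEU/CodeBLEU metrics.
bleu = sentence_bleu([reference.split()], prediction.split())

print(exact_match, round(bleu, 2))
\end{verbatim}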