Code generation from text requires understanding the user's intent from a natural language description (NLD) and generating an executable code snippet that satisfies this intent. While recent pretrained language models (PLMs) achieve remarkable performance on this task, they fail when the given NLD is ambiguous, i.e., when it lacks sufficient specification to generate a high-quality code snippet. In this work, we introduce a novel and more realistic setup for this task. We hypothesize that ambiguities in the specification of an NLD can be resolved by asking clarification questions (CQs). Therefore, we collect and introduce a new dataset named CodeClarQA, containing NLD-Code pairs with created clarification question-answer pairs (CQAs). We evaluate the performance of PLMs for code generation on our dataset. The empirical results support our hypothesis that clarifications lead to more precise generated code, as shown by improvements of 17.52 in BLEU, 12.72 in CodeBLEU, and 7.7\% in exact match. Alongside this, our task and dataset pose new challenges to the community, including deciding when and which CQs should be asked.
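To make the setup concrete, the sketch below gives a hypothetical example of an ambiguous NLD, a clarification question-answer pair that resolves the missing specification, and how a generated snippet could be scored against a reference with exact match and a token-level BLEU proxy. All strings and the evaluation choices here are invented for illustration and are not drawn from CodeClarQA or the paper's evaluation pipeline.

\begin{verbatim}
# Hypothetical illustration of the CQA-augmented code generation setup.
# The NLD, CQA, and code strings are made up for exposition only.
from nltk.translate.bleu_score import sentence_bleu

# An ambiguous NLD: it does not say which column to sort by, or in which order.
nld = "Sort the dataframe."

# A clarification question-answer pair supplying the missing specification.
cqa = "Q: Which column and order should be used? A: by 'price', descending."

# The model input is the NLD concatenated with the CQA.
model_input = f"{nld} {cqa}"

# Reference code and a (hypothetical) model prediction.
reference = "df = df.sort_values(by='price', ascending=False)"
prediction = "df = df.sort_values(by='price', ascending=False)"

# Exact match: string equality between prediction and reference.
exact_match = prediction.strip() == reference.strip()

# Token-level BLEU as a rough stand-in for the BLEU/CodeBLEU metrics.
bleu = sentence_bleu([reference.split()], prediction.split())

print(exact_match, round(bleu, 2))
\end{verbatim}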