Recently, neural techniques have been used to generate source code automatically. While promising for declarative languages, these approaches achieve much poorer performance on datasets for imperative languages. Since a declarative language is typically embedded in an imperative language (i.e., turducken-style programming) in real-world software development, promising results on declarative languages alone can hardly lead to a significant reduction in manual software development effort. In this paper, we define a new code generation task: given a natural language comment, generate a program in a base imperative language with an embedded declarative language. To our knowledge, this is the first turducken-style code generation task. For this task, we present Lyra: a dataset of Python programs with embedded SQL. The dataset contains 2,000 carefully annotated database manipulation programs from real-world projects, each paired with both a Chinese comment and an English comment. In our experiments, we adopted Transformer, BERT-style, and GPT-style models as baselines. In the best setting, GPT-style models outperform the others, reaching an AST exact matching accuracy of 24% with Chinese comments and 25.5% with English comments. We therefore believe that Lyra provides a new challenge for code generation, and that overcoming this challenge may significantly boost the applicability of code generation techniques in real-world software development.
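To make the task concrete, the sketch below shows what a turducken-style sample might look like: a natural language comment paired with a Python host program that embeds a declarative SQL query as a string. This is an illustrative assumption, not an actual Lyra entry; the table and column names (`orders`, `user_id`, `total_price`) and the use of `sqlite3` are invented for the example.

```python
import sqlite3

def get_recent_orders(db_path, user_id, limit):
    # Comment (English): "Return the id and total price of the most recent
    # orders placed by the given user, newest first."
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    # The declarative SQL query is embedded as a string inside the
    # imperative Python program (turducken-style).
    cursor.execute(
        "SELECT id, total_price FROM orders "
        "WHERE user_id = ? ORDER BY created_at DESC LIMIT ?",
        (user_id, limit),
    )
    rows = cursor.fetchall()
    conn.close()
    return rows
```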