Code Generation Based on Deep Learning: A Brief Review

Automatic software development has been a research hotspot in software engineering (SE) over the past decade. In particular, deep learning (DL) has been applied to a variety of SE tasks and has made considerable progress. Among these applications, automatic code generation, understood broadly to include code completion and code synthesis, is a widely shared goal in SE: it may greatly reduce the burden on software developers and, to a certain extent, improve the efficiency and quality of the software development process. Code completion is an important part of modern integrated development environments (IDEs). It helps programmers complete class names, method names, keywords, and similar tokens, which improves development efficiency and reduces spelling errors during coding. Such tools run static analysis on the code and present completion candidates in alphabetical order. Code synthesis is approached from two directions: one based on input-output samples and the other based on functionality descriptions. In this study, we introduce existing techniques for these two aspects and the corresponding DL techniques, and present some possible future research directions.
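To make the static-analysis baseline concrete, the following is a minimal sketch of prefix-based completion with alphabetically ordered candidates. It is an illustration only, not a method taken from this review: the function names `collect_names` and `complete` and the `SAMPLE` snippet are hypothetical, and the sketch assumes Python source and uses the standard `ast` and `keyword` modules.

```python
import ast
import keyword


def collect_names(source: str) -> set[str]:
    """Statically collect keywords plus class, function, argument, and variable names."""
    names = set(keyword.kwlist)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            names.add(node.name)
        elif isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
    return names


def complete(prefix: str, source: str) -> list[str]:
    """Return the candidates that start with `prefix`, in alphabetical order."""
    return sorted(n for n in collect_names(source) if n.startswith(prefix))


# Hypothetical snippet standing in for the file being edited.
SAMPLE = '''
class ReportBuilder:
    def render(self, rows):
        result = []
        return result
'''

print(complete("re", SAMPLE))  # ['render', 'result', 'return']
```

Because the ordering is purely alphabetical, the candidate the programmer most likely wants is not necessarily ranked first, which is the kind of limitation that learned completion models aim to address.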
