Code generation aims to produce relevant code fragments from given natural language descriptions. Software development involves a large number of repetitive, low-skill code-writing tasks, so code generation has received considerable attention in both academia and industry as a way to assist developers. Enabling machines to understand users' requirements and write programs on their own has also long been one of the key concerns of software engineering. Recent advances in deep learning techniques, especially pre-trained models, have brought promising performance on the code generation task. In this paper, we systematically review current work on deep learning-based code generation and classify existing methods into three categories: methods based on code features, methods incorporated with retrieval, and methods incorporated with post-processing. The first category comprises methods that use deep learning algorithms to generate code based on code features; the second and third categories improve the performance of the methods in the first category. For each category, we systematically review, summarize, and comment on the existing research results. We then summarize and analyze the corpora and the popular evaluation metrics used in existing code generation work. Finally, we summarize the overall literature review and discuss future research directions worthy of attention.