Code generation aims to automatically generate a piece of code from an input natural language utterance. Among current dominant models, it is treated as a sequence-to-tree task, where a decoder outputs a sequence of actions corresponding to the pre-order traversal of an Abstract Syntax Tree (AST). However, such a decoder only exploits the preceding actions along the pre-order traversal, which are insufficient to ensure correct action predictions. In this paper, we first thoroughly analyze the difference in context modeling between neural code generation models whose decodings follow different traversals (pre-order traversal vs. breadth-first traversal), and then propose a mutual learning framework to jointly train these models. Under this framework, we continuously enhance both models via mutual distillation, which involves synchronously executing two one-to-one knowledge transfers at each training step. More specifically, we alternately choose one model as the student and the other as its teacher, and require the student to fit both the training data and the action prediction distributions of its teacher. By doing so, the two models can fully absorb knowledge from each other and thus be improved simultaneously. Experimental results and in-depth analyses on several benchmark datasets demonstrate the effectiveness of our approach. We release our code at https://github.com/DeepLearnXMU/CGML.
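To make the training procedure concrete, the sketch below illustrates one possible form of a mutual-distillation step: each model minimizes a cross-entropy loss on the gold actions plus a KL term toward the other model's detached action distributions, so the two knowledge transfers happen synchronously within the same step. This is a minimal illustration under assumptions, not the released implementation: the model interfaces, the node-level alignment of the two traversals' action distributions, and the hyperparameters `alpha` and `tau` are all hypothetical.

```python
import torch
import torch.nn.functional as F


def mutual_distillation_step(model_a, model_b, batch,
                             optimizer_a, optimizer_b,
                             alpha=0.5, tau=1.0):
    """One illustrative mutual-distillation training step (assumed API).

    model_a / model_b: the pre-order-traversal and breadth-first-traversal
    decoders. We assume each returns action logits aligned per AST node,
    so the two distributions are comparable and share one gold sequence.
    """
    # Forward passes: per-node action logits of shape (num_nodes, num_actions).
    logits_a = model_a(batch)
    logits_b = model_b(batch)
    gold = batch["gold_actions"]  # gold action indices, shape (num_nodes,)

    # --- model_a as student, model_b as teacher ---
    ce_a = F.cross_entropy(logits_a, gold)                       # fit the training data
    kd_a = F.kl_div(F.log_softmax(logits_a / tau, dim=-1),
                    F.softmax(logits_b.detach() / tau, dim=-1),  # teacher is frozen here
                    reduction="batchmean")
    loss_a = (1 - alpha) * ce_a + alpha * kd_a
    optimizer_a.zero_grad()
    loss_a.backward()
    optimizer_a.step()

    # --- model_b as student, model_a as teacher (same training step) ---
    ce_b = F.cross_entropy(logits_b, gold)
    kd_b = F.kl_div(F.log_softmax(logits_b / tau, dim=-1),
                    F.softmax(logits_a.detach() / tau, dim=-1),
                    reduction="batchmean")
    loss_b = (1 - alpha) * ce_b + alpha * kd_b
    optimizer_b.zero_grad()
    loss_b.backward()
    optimizer_b.step()

    return loss_a.item(), loss_b.item()
```

In this sketch the `detach()` calls ensure that each knowledge transfer only updates the current student, while `alpha` balances fitting the gold actions against matching the teacher's distribution.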