Code generation aims to automatically generate a piece of code from an input natural language utterance. Among current dominant models, it is treated as a sequence-to-tree task, where a decoder outputs a sequence of actions corresponding to the pre-order traversal of an Abstract Syntax Tree (AST). However, such a decoder only exploits the preceding actions along the pre-order traversal, which are insufficient to ensure correct action predictions. In this paper, we first thoroughly analyze the difference in context modeling between neural code generation models whose decodings follow different traversals (pre-order traversal vs. breadth-first traversal), and then propose a mutual learning framework to jointly train these models. Under this framework, we continuously enhance both models via mutual distillation, which involves synchronously executing two one-to-one knowledge transfers at each training step. More specifically, we alternately choose one model as the student and the other as its teacher, and require the student to fit both the training data and the action prediction distributions of its teacher. By doing so, the two models can fully absorb knowledge from each other and thus be improved simultaneously. Experimental results and in-depth analyses on several benchmark datasets demonstrate the effectiveness of our approach. We release our code at https://github.com/DeepLearnXMU/CGML.
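To make the training procedure concrete, the sketch below illustrates one possible form of a mutual-distillation step: each model minimizes a cross-entropy loss on the gold actions plus a KL term toward the other model's detached action distributions, so the two knowledge transfers happen synchronously within the same step. This is a minimal illustration under assumptions, not the released implementation: the model interfaces, the node-level alignment of the two traversals' action distributions, and the hyperparameters `alpha` and `tau` are all hypothetical.

```python
import torch
import torch.nn.functional as F


def mutual_distillation_step(model_a, model_b, batch,
                             optimizer_a, optimizer_b,
                             alpha=0.5, tau=1.0):
    """One illustrative mutual-distillation training step (assumed API).

    model_a / model_b: the pre-order-traversal and breadth-first-traversal
    decoders. We assume each returns action logits aligned per AST node,
    so the two distributions are comparable and share one gold sequence.
    """
    # Forward passes: per-node action logits of shape (num_nodes, num_actions).
    logits_a = model_a(batch)
    logits_b = model_b(batch)
    gold = batch["gold_actions"]  # gold action indices, shape (num_nodes,)

    # --- model_a as student, model_b as teacher ---
    ce_a = F.cross_entropy(logits_a, gold)                       # fit the training data
    kd_a = F.kl_div(F.log_softmax(logits_a / tau, dim=-1),
                    F.softmax(logits_b.detach() / tau, dim=-1),  # teacher is frozen here
                    reduction="batchmean")
    loss_a = (1 - alpha) * ce_a + alpha * kd_a
    optimizer_a.zero_grad()
    loss_a.backward()
    optimizer_a.step()

    # --- model_b as student, model_a as teacher (same training step) ---
    ce_b = F.cross_entropy(logits_b, gold)
    kd_b = F.kl_div(F.log_softmax(logits_b / tau, dim=-1),
                    F.softmax(logits_a.detach() / tau, dim=-1),
                    reduction="batchmean")
    loss_b = (1 - alpha) * ce_b + alpha * kd_b
    optimizer_b.zero_grad()
    loss_b.backward()
    optimizer_b.step()

    return loss_a.item(), loss_b.item()
```

In this sketch the `detach()` calls ensure that each knowledge transfer only updates the current student, while `alpha` balances fitting the gold actions against matching the teacher's distribution.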