Code generation focuses on automatically converting natural language (NL) utterances into code snippets. Sequence-to-tree (Seq2Tree) methods, e.g., TRANX, have been proposed for code generation; they guarantee the compilability of the generated code by producing each subsequent Abstract Syntax Tree (AST) node conditioned on the antecedent predictions of AST nodes. Existing Seq2Tree methods tend to treat antecedent and subsequent predictions equally. However, under the AST constraints, it is difficult for Seq2Tree models to produce a correct subsequent prediction based on incorrect antecedent predictions. Thus, antecedent predictions ought to receive more attention than subsequent ones. To this end, we propose an effective method named APTRANX (Antecedent Prioritized TRANX) on the basis of TRANX. APTRANX introduces an Antecedent Prioritized (AP) Loss, which helps the model attach importance to antecedent predictions by exploiting the position information of the generated AST nodes. With better antecedent predictions and the accompanying subsequent predictions, APTRANX significantly improves performance. We conduct extensive experiments on several benchmark datasets, and the results demonstrate the superiority and generality of our proposed method compared with state-of-the-art methods.
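To make the idea of position-dependent weighting concrete, here is a minimal sketch of what an antecedent-prioritized loss could look like. This is an illustrative assumption, not the paper's actual formulation: it simply scales the per-step cross-entropy by weights that decay with the position of the generated AST node, so earlier (antecedent) predictions contribute more to the loss. The function name `ap_loss`, the linear decay, and the hyperparameter `alpha` are all hypothetical.

```python
import torch
import torch.nn.functional as F

def ap_loss(logits: torch.Tensor, targets: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Position-weighted cross-entropy (illustrative sketch).

    Earlier (antecedent) AST-node predictions receive larger weights
    than later (subsequent) ones.

    logits:  (T, V) per-step scores over the action/node vocabulary
    targets: (T,)   gold action/node ids
    alpha:   decay strength in [0, 1) -- hypothetical hyperparameter;
             the paper's exact weighting scheme may differ.
    """
    T = targets.size(0)
    # Linearly decaying weights: position 0 gets weight 1.0, the last
    # position gets weight (1 - alpha); renormalized to sum to T so the
    # overall loss scale matches unweighted cross-entropy.
    pos = torch.arange(T, dtype=logits.dtype)
    w = 1.0 - alpha * pos / max(T - 1, 1)
    w = w * T / w.sum()
    per_step = F.cross_entropy(logits, targets, reduction="none")
    return (w * per_step).mean()
```

Under this sketch, an incorrect prediction at an early AST node is penalized more heavily than the same error at a late node, which is the behavior the abstract argues for: errors in antecedent predictions propagate through the AST constraints to all subsequent predictions.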