Code completion tools are widely used by software developers to accelerate development by suggesting the next code elements. Completing a sequence of code tokens (e.g., a full line of code) has been shown to be more efficient than predicting a single token at a time. To complete a code sequence, existing approaches employ AutoRegressive (AR) decoders that generate tokens in a left-to-right, token-by-token fashion; the prediction of each token therefore depends on all previously generated tokens, which leads to high inference latency. To improve both the efficiency and the accuracy of full-line code completion, in this paper we propose a Non-AutoRegressive (NAR) model for code completion boosted by a syntax-aware sampling strategy. Experimental results on two widely used datasets show that our model outperforms both AR and NAR baselines on full-line code completion, while achieving up to a 9x speed-up over the AR model in inference.
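To make the AR/NAR contrast concrete, the following minimal sketch (not the paper's implementation) illustrates why AR decoding latency grows with the length of the completed line while NAR decoding does not. The functions `predict_next_token` and `predict_all_tokens` are hypothetical stand-ins for a trained decoder's forward pass.

```python
from typing import List


def predict_next_token(prefix: List[str]) -> str:
    """Hypothetical AR step: return the most likely next token given the prefix."""
    return "<tok>"  # placeholder prediction


def predict_all_tokens(context: List[str], length: int) -> List[str]:
    """Hypothetical NAR step: predict every target position in one parallel pass."""
    return ["<tok>"] * length  # placeholder predictions


def complete_line_ar(context: List[str], max_len: int = 16) -> List[str]:
    # Autoregressive decoding: one forward pass per token, each conditioned on
    # all previously generated tokens, so latency scales with the line length.
    generated: List[str] = []
    for _ in range(max_len):
        token = predict_next_token(context + generated)
        if token == "<eol>":  # stop at end-of-line
            break
        generated.append(token)
    return generated


def complete_line_nar(context: List[str], max_len: int = 16) -> List[str]:
    # Non-autoregressive decoding: a single forward pass emits all positions
    # at once, which is the source of the reported inference speed-up.
    return predict_all_tokens(context, max_len)
```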