Large pre-trained language models are now widely applied in neural code completion systems. Although large code models significantly outperform their smaller counterparts, around 70% of the code completions displayed by Copilot are not accepted by developers. Because these completions are reviewed but not accepted, their contribution to developer productivity is considerably limited. Worse, given the high inference cost of large code models, this represents a substantial waste of computing resources and energy. To fill this gap, we first investigate the prompts that lead to unhelpful code completions and empirically identify four observable patterns that cause them. All of these patterns are inherent to the prompts, i.e., they can hardly be addressed by improving the accuracy of the model, which demonstrates the feasibility of identifying such prompts from the prompts themselves. Motivated by this finding, we propose an early-rejection mechanism that turns down low-return prompts by predicting completion quality without sending them to the code completion system. Furthermore, we propose a lightweight Transformer-based estimator to demonstrate the feasibility of this mechanism. Experimental results show that the proposed estimator saves 23.3% of the computational cost of the code completion system, measured in floating-point operations, and that 80.2% of rejected prompts would have led to unhelpful completions.
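The early-rejection mechanism can be sketched as a simple gate in front of the expensive model: a lightweight estimator scores the prompt, and only prompts above a threshold are forwarded. This is a minimal, hypothetical illustration of the control flow only; the function names and threshold are assumptions, and the crude heuristic scorer below stands in for the paper's lightweight Transformer-based estimator.

```python
def estimate_completion_quality(prompt: str) -> float:
    """Placeholder quality estimator mapping a prompt to a score in [0, 1].

    A real estimator would be a small learned model; this crude heuristic
    (more code-like punctuation scores higher) exists only to make the
    gating logic runnable.
    """
    if not prompt.strip():
        return 0.0
    code_chars = sum(prompt.count(c) for c in "(){}[]=.:")
    return min(1.0, 0.1 + 0.05 * code_chars)


def expensive_model_complete(prompt: str) -> str:
    """Stand-in for the large (costly) code completion model."""
    return prompt + "  # ...completion..."


def complete_with_early_rejection(prompt: str, threshold: float = 0.3):
    """Reject low-return prompts before invoking the expensive model.

    Returns None for rejected prompts, so the large model is never
    invoked and its compute is saved.
    """
    score = estimate_completion_quality(prompt)
    if score < threshold:
        return None  # early rejection: prompt predicted to be low-return
    return expensive_model_complete(prompt)
```

The design point is that the estimator must be far cheaper than the model it guards, so the savings on rejected prompts dominate the overhead it adds to accepted ones.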