Currently, large pre-trained models are widely applied in neural code completion systems, such as Github Copilot, aiXcoder, and TabNine. Though large models significantly outperform their smaller counterparts, a survey with 2,631 participants reveals that around 70\% displayed code completions from Copilot are not accepted by developers. Being reviewed but not accepted, these completions bring a threat to productivity. Besides, considering the high cost of the large models, it is a huge waste of computing resources and energy, which severely goes against the sustainable development principle of AI technologies. Additionally, in code completion systems, the completion requests are automatically and actively issued to the models as developers type out, which significantly aggravates the workload. However, to the best of our knowledge, such waste has never been realized, not to mention effectively addressed, in the context of neural code completion. Hence, preventing such profitless code completions from happening in a cost-friendly way is of urgent need. To fill this gap, we first investigate the prompts of these completions and find four observable prompt patterns, which demonstrate the feasibility of identifying such prompts based on prompts themselves. Motivated by this finding, we propose an early-rejection mechanism to turn down low-return prompts by foretelling the completion qualities without sending them to the LCM. Further, we propose a lightweight Transformer-based estimator to demonstrate the feasibility of the mechanism. The experimental results show that the estimator rejects low-return prompts with a promising accuracy of 83.2%.
翻译:目前,大量预先培训的模型广泛应用于神经代码完成系统,如Github Copil、AiXcoder和TabNine。虽然大型模型大大优于其较小的模型,但有2,631名参与者的调查显示,大约70 ⁇ 显示的Copil的代码完成没有被开发者接受。经过审查但未被接受,这些完成对生产力构成威胁。此外,考虑到大型模型成本高昂,这是对计算资源和能源的巨大浪费,严重违背了AI技术的可持续发展原则。此外,在代码完成系统中,完成请求是自动和积极地发给模型的,因为开发者选择了准确性,大大加重了工作量。然而,据我们所知,这种废物从未实现,更不用说在神经代码完成过程中有效解决了。因此,迫切需要防止这种无利润的代码完成以成本友好的方式进行。为了填补这一空白,我们首先调查这些低质量完成的迅速性,并找到四种可观测的快速模式,这显示了在开发者选择精确性的基础上确定这种速度的可行性,从而大大地加重了工作量。我们提议,将一个快速的升级机制转化为。