Language models are promising solutions for tackling increasingly complex problems. In software engineering, they have recently attracted attention through code assistants, which automatically write programs in a given programming language from a natural-language description of a programming task. They have the potential to save time and effort when writing code. However, these systems are currently poorly understood, which prevents them from being used optimally. In this paper, we investigate the various input parameters of two language models and conduct a study to determine whether variations of these input parameters (e.g., the programming task description and its surrounding context, the creativity of the language model, the number of generated solutions) have a significant impact on the quality of the generated programs. We design specific operators for varying the input parameters and apply them to two code assistants (Copilot and Codex) and two benchmarks of algorithmic problems (HumanEval and LeetCode). Our results show that varying the input parameters can significantly improve the performance of language models. However, there is a tight interdependency between the temperature, the prompt, and the number of generated solutions, making it potentially hard for developers to properly control these parameters and obtain an optimal result. This work opens opportunities to propose (automated) strategies for improving performance.
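As a minimal sketch (not the authors' tooling), the snippet below illustrates how the three studied input parameters map onto a Codex-style completions API: the prompt (task description plus surrounding context), the temperature (the model's "creativity"), and the number of generated solutions. It assumes the legacy OpenAI Python client; the model name and the example prompt are illustrative assumptions.

```python
# Illustrative only: varies the input parameters studied in the paper
# against a Codex-style completions endpoint (legacy OpenAI client).
import openai

# Prompt = programming task description + surrounding context (assumed example).
prompt = (
    "def is_palindrome(s: str) -> bool:\n"
    '    """Return True if s reads the same forwards and backwards."""\n'
)

response = openai.Completion.create(
    model="code-davinci-002",  # Codex model name, assumed for illustration
    prompt=prompt,             # task description and surrounding context
    temperature=0.8,           # higher values yield more "creative" sampling
    n=10,                      # number of candidate solutions to generate
    max_tokens=128,
)

# Each choice is one candidate program; in the paper's setting, candidates
# would be checked against the benchmark's test cases (pass@k style).
candidates = [choice.text for choice in response.choices]
```

Sweeping `temperature` and `n` jointly, rather than independently, is what the reported interdependency suggests: a temperature that works well for a single sample may be suboptimal when many solutions are generated.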