Large language models (LLMs) have demonstrated an impressive ability to generate code for various programming tasks. In many instances, an LLM can generate a correct program for a task when given numerous trials. Consequently, a recent trend is to sample programs from a model at large scale and then filter/rank them based on their execution against a small number of known unit tests in order to select one candidate solution. However, these approaches assume both that unit tests are given and that the generated programs can be safely executed (even though such programs can perform arbitrarily dangerous operations such as file manipulations). Both assumptions are impractical in real-world software development. In this paper, we propose CodeRanker, a neural ranker that can predict the correctness of a sampled program without executing it. CodeRanker is fault-aware, i.e., it is trained to predict different kinds of execution information, such as the exact compile/runtime error type (e.g., an IndexError or a TypeError). We show that CodeRanker significantly increases the pass@1 accuracy of various code generation models (including Codex, GPT-Neo, and GPT-J) on the APPS, HumanEval, and MBPP datasets.
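The execution-free ranking idea described above can be sketched as follows. This is a minimal illustration, not the paper's actual model: the real CodeRanker is a trained neural classifier over (task, program) pairs, whereas `toy_classifier` below is a hypothetical stand-in, and all function names are assumptions for illustration only.

```python
# Hedged sketch of fault-aware ranking: score each sampled program by its
# predicted probability of passing, and never execute any candidate.
# All names here (rank_candidates, toy_classifier) are hypothetical.

# Error categories a fault-aware ranker might predict, following the
# abstract's examples (e.g., IndexError, TypeError).
ERROR_CLASSES = ["pass", "CompileError", "IndexError", "TypeError"]

def rank_candidates(task, candidates, classify):
    """Order sampled programs by predicted probability of 'pass'.

    `classify(task, program)` returns a dict mapping each label in
    ERROR_CLASSES to a probability; no program is ever run.
    """
    scored = [(classify(task, prog)["pass"], prog) for prog in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [prog for _, prog in scored]

def toy_classifier(task, prog):
    """Stand-in for a trained neural ranker (crude heuristic for demo)."""
    probs = {label: 0.0 for label in ERROR_CLASSES}
    probs["pass"] = 0.9 if "return" in prog else 0.1
    probs["CompileError"] = 1.0 - probs["pass"]
    return probs

candidates = [
    "def f(x): x + 1",          # buggy: missing return
    "def f(x): return x + 1",   # correct
]
best = rank_candidates("increment x", candidates, toy_classifier)[0]
print(best)
```

Selecting the top-ranked candidate in this way corresponds to the pass@1 setting: only one program is submitted, chosen purely from the ranker's predictions rather than from test execution.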