Natural Language-guided Programming

In today's software world with its cornucopia of reusable software libraries, when a programmer is faced with a programming task that they suspect can be completed through the use of a library, they often look for code examples using a search engine and then manually adapt found examples to their specific context of use. We put forward a vision based on a new breed of developer tools that have the potential to largely automate this process. The key idea is to adapt code autocompletion tools such that they take into account not only the developer's already-written code but also the intent of the task the developer is trying to achieve next, formulated in plain natural language. We call this practice of enriching the code with natural language intent to facilitate its completion natural language-guided programming. To show that this idea is feasible we design, implement and benchmark a tool that solves this problem in the context of a specific domain (data science) and a specific programming language (Python). Central to the tool is the use of language models trained on a large corpus of documented code. Our initial experiments confirm the feasibility of the idea but also make it clear that we have only scratched the surface of what may become possible in the future. We end the paper with a comprehensive research agenda to stimulate additional research in the budding area of natural language-guided programming.
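The key idea above, completing code from both the already-written code and a plain-language statement of intent, can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool described in the abstract: the names `build_prompt`, `guided_complete`, and `toy_model` are hypothetical, and `toy_model` is a canned stand-in for a language model trained on documented code.

```python
from typing import Callable


def build_prompt(code_so_far: str, intent: str) -> str:
    """Append the natural-language intent to the existing code as a comment."""
    comment = "\n".join(f"# {line}" for line in intent.strip().splitlines())
    return f"{code_so_far.rstrip()}\n\n{comment}\n"


def guided_complete(code_so_far: str, intent: str,
                    model: Callable[[str], str]) -> str:
    """Return the prompt (code plus intent comment) followed by the completion."""
    prompt = build_prompt(code_so_far, intent)
    return prompt + model(prompt)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a trained language model.

    A real model would predict the continuation of the prompt; this stub
    returns a canned line so that the sketch runs without dependencies.
    """
    if "plot" in prompt.lower():
        return "df.plot(x='year', y='sales')\n"
    return "pass\n"


# The existing code is handled purely as text; nothing here is executed.
code = "import pandas as pd\ndf = pd.read_csv('sales.csv')\n"
print(guided_complete(code, "plot sales per year", toy_model))
```

Writing the intent as an ordinary comment keeps the prompt in the same form as documented code, which is the kind of text the abstract says the language models are trained on. Swapping `toy_model` for a real model would only require a callable that maps a prompt string to a completion string.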
